tomduck / pandoc-tablenos

A pandoc filter for numbering tables and table references.
GNU General Public License v3.0

Table IDs are duplicated in HTML output on the <div> and <table> elements #31

Open dhimmel opened 3 years ago

dhimmel commented 3 years ago

Using pandoc 2.14 with pandoc-tablenos 2.3.0, we've noticed that the table id occurs in the output HTML twice.

Example source markdown

| *Bowling Scores* | Jane          | John          | Alice         | Bob           |
|:-----------------|:-------------:|:-------------:|:-------------:|:-------------:|
| Game 1 | 150 | 187 | 210 | 105 |
| Game 2 |  98 | 202 | 197 | 102 |
| Game 3 | 123 | 180 | 238 | 134 |

Table: A table with a top caption and specified relative column widths.
{#tbl:bowling-scores}

Gets converted to

<div id="tbl:bowling-scores" class="tablenos" data-collapsed="false">
<table id="tbl:bowling-scores">
<!-- table contents here -->
</table>
</div>

Notice how id="tbl:bowling-scores" is repeated, which violates the HTML rule that element IDs must be unique within an entire document.
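One way to confirm the duplication described above is to scan the generated HTML for repeated id values. The following is only a minimal sketch using Python's standard library; `IdCollector` and `duplicate_ids` are hypothetical helper names, not part of pandoc or pandoc-tablenos.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Count every id attribute value seen on a start tag."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value is not None:
                self.ids[value] += 1


def duplicate_ids(html):
    """Return the sorted id values that occur more than once in html."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return sorted(i for i, n in collector.ids.items() if n > 1)


# The HTML fragment from this report, without the table contents.
sample = """
<div id="tbl:bowling-scores" class="tablenos" data-collapsed="false">
<table id="tbl:bowling-scores">
</table>
</div>
"""
print(duplicate_ids(sample))
```

Run against the fragment in this report, the check should list tbl:bowling-scores as the only repeated id.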

My guess is that Pandoc has added support for the {#tbl:bowling-scores} syntax for defining a table ID. I think the relevant updates are https://github.com/jgm/pandoc/issues/6317 and https://github.com/jgm/pandoc/commit/871164051281b50a5b4b28cacee3dd15344d81f1. Based on the commit, this issue might have been present since Pandoc 2.11, but I haven't checked versions other than 2.14.

Would there be an easy way to know whether pandoc is going to pass the id through and, if so, to skip the div?

frederik-elwert commented 3 years ago

From what I understand, pandoc does not support the corresponding markdown syntax itself, but since 2.10 it has had the required internal data structure, which pandoc-tablenos then populates. And I agree that for recent versions of pandoc, pandoc-tablenos should omit the id attribute from the div, since it is not necessary anymore (and actually invalid).
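The version-dependent behaviour suggested here could look roughly like the sketch below. It is not pandoc-tablenos code: `parse_version` and `div_needs_id` are hypothetical names, and the 2.10 threshold is only the figure mentioned in this comment (the original report guesses 2.11 and has only confirmed 2.14), so it is left as a parameter.

```python
def parse_version(version):
    """Turn a version string such as "2.14" or "2.11.0.4" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def div_needs_id(pandoc_version, threshold="2.10"):
    """Return True if the wrapper div should still carry the table id.

    Assumption: from `threshold` onwards the table element itself can
    hold the id, so repeating it on the div would duplicate it.
    """
    return parse_version(pandoc_version) < parse_version(threshold)


print(div_needs_id("2.14"))   # version used in this report
print(div_needs_id("2.9.2"))  # an older release
```

Comparing tuples of integers rather than strings avoids ordering mistakes such as treating "2.9" as newer than "2.10".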

dhimmel commented 3 years ago

Ah, thanks @frederik-elwert for that clarification. So pandoc-tablenos now populates the new AST format defined by pandoc-types 1.21 since https://github.com/tomduck/pandoc-tablenos/commit/f99468d3d178b326da43c4da664dcc2d0f46f976. And that sets the id attribute on tables, which the Pandoc HTML writer then supports?

frederik-elwert commented 3 years ago

Yes, that’s how I understand it. HTML readers and writers already support the new table attributes; there is just no official Markdown syntax for them yet. So pandoc-tablenos parses the provisional syntax from the caption as before, but uses the native AST to store these attributes.
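To illustrate what storing the id in the native AST means, here is a small sketch that works on pandoc's JSON representation of a table. It assumes the pandoc-types 1.21 layout, in which a Table node's first field is an attribute triple of identifier, classes, and key-value pairs. `set_table_id` is a hypothetical helper and is not taken from pandoc-tablenos.

```python
def set_table_id(table, identifier):
    """Write identifier into the attribute triple of a JSON Table node.

    Assumes the node looks like {"t": "Table", "c": [attr, ...]} with
    attr == [identifier, classes, key_value_pairs].
    """
    if table.get("t") != "Table":
        raise ValueError("expected a Table node")
    attr = table["c"][0]
    attr[0] = identifier
    return table


# A hand-written, empty Table node in the assumed layout:
# attr, caption, column specs, head, bodies, foot.
table = {
    "t": "Table",
    "c": [["", [], []], [None, []], [], [["", [], []], []], [], [["", [], []], []]],
}
set_table_id(table, "tbl:bowling-scores")
print(table["c"][0])
```

With the id stored on the table node itself, a writer that honours table attributes can emit it on the table element, which is why repeating it on the wrapper div leads to the duplication reported above.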