lierdakil / pandoc-crossref

Pandoc filter for cross-references
https://lierdakil.github.io/pandoc-crossref/
GNU General Public License v2.0
936 stars 74 forks source link

Incorrect width attr added to figure tag when cross-referencing figure #373

Closed piccolbo closed 1 year ago

piccolbo commented 1 year ago

In the process of fixing #371 which indeed was not a pandoc-crossref problem, we discovered another one that appears to be, see https://github.com/pandoc/lua-filters/issues/261. @jgm initially said it appeared to be a pandoc problem, but @tarleb after additional analysis said that pandoc-crossref was more likely. I can confirm that the error disappears if pandoc-crossref is not used, same MRE. A width attr is added to a figure tag, which is incorrect according to epubcheck rules 3.3 (the current one). I am on the latest of everything, mac os ventura. Thanks

MRE

---
title: "The"
---

@fig:cyclomatic.

```{.graphviz #fig:cyclomatic caption="Graphs corresponding to `if` and `while` statements resp." width=35%}
graph {
  if -- then; if--else; then -- "end if" ; else -- "end if" ;
  }
```

cmdline:

pandoc  --embed-resources --standalone  --lua-filter=diagram-generator.lua  --filter pandoc-crossref  diagram-test.md -o diagram-test.epub

epubcheck out:

Validating using EPUB version 3.3 rules.
ERROR(RSC-005): diagram-test.epub/EPUB/text/ch001.xhtml(16,41): Error while parsing file: attribute "width" not allowed here; expected attribute "about", "accesskey", "aria-activedescendant", "aria-atomic", "aria-autocomplete", "aria-busy", "aria-checked", "aria-colcount", "aria-colindex", "aria-colspan", "aria-controls", "aria-current", "aria-describedby", "aria-description", "aria-details", "aria-disabled", "aria-dropeffect", "aria-errormessage", "aria-expanded", "aria-flowto", "aria-grabbed", "aria-haspopup", "aria-hidden", "aria-invalid", "aria-keyshortcuts", "aria-label", "aria-labelledby", "aria-level", "aria-live", "aria-modal", "aria-multiline", "aria-multiselectable", "aria-orientation", "aria-owns", "aria-posinset", "aria-pressed", "aria-readonly", "aria-relevant", "aria-required", "aria-roledescription", "aria-rowcount", "aria-rowindex", "aria-rowspan", "aria-selected", "aria-setsize", "aria-sort", "aria-valuemax", "aria-valuemin", "aria-valuenow", "aria-valuetext", "autocapitalize", "autofocus", "class", "content", "contenteditable", "datatype", "dir", "draggable", "epub:type", "hidden", "inlist", "inputmode", "is", "itemid", "itemprop", "itemref", "itemscope", "itemtype", "lang", "nonce", "ns:alphabet", "ns:ph", "onabort", "onauxclick", "onblur", "oncancel", "oncanplay", "oncanplaythrough", "onchange", "onclick", "onclose", "oncontextmenu", "oncopy", "oncuechange", "oncut", "ondblclick", "ondrag", "ondragend", "ondragenter", "ondragleave", "ondragover", "ondragstart", "ondrop", "ondurationchange", "onemptied", "onended", "onerror", "onfocus", "onfocusin", "onfocusout", "onformdata", "oninput", "oninvalid", "onkeydown", "onkeypress", "onkeyup", "onload", "onloadeddata", "onloadedmetadata", "onloadstart", "onmousedown", "onmouseenter", "onmouseleave", "onmousemove", "onmouseout", "onmouseover", "onmouseup", "onpaste", "onpause", "onplay", "onplaying", "onprogress", "onratechange", "onreset", "onresize", "onscroll", "onsecuritypolicyviolation", "onseeked", "onseeking", "onselect", "onslotchange", "onstalled", "onsubmit", "onsuspend", "ontimeupdate", "ontoggle", "ontransitioncancel", "ontransitionend", "ontransitionrun", "ontransitionstart", "onvolumechange", "onwaiting", "onwheel", "prefix", "property", "rel", "resource", "rev", "role", "slot", "spellcheck", "style", "tabindex", "title", "translate", "typeof", "vocab", "xml:base", "xml:lang" or "xml:space" (with xmlns:ns="http://www.w3.org/2001/10/synthesis")

Check finished with errors
Messages: 0 fatals / 1 error / 0 warnings / 0 infos

EPUBCheck completed

Offending lines (line 16 is the one with the opening figure tag):

<body epub:type="bodymatter">
<section id="the" class="level1 unnumbered">
<h1 class="unnumbered">The</h1>
<p>fig. 1.</p>
<figure id="fig:cyclomatic" width="35%">
<img src="../media/file0.svgz" style="width:35.0%" alt="Graphs corresponding to if and while statements resp." />
<figcaption><p>Figure 1: Graphs corresponding to <code>if</code> and <code>while</code> statements resp.</p></figcaption>
</figure>
</section>
</body>
lierdakil commented 1 year ago

That's certainly a bug in pandoc-crossref, but maybe not exactly in the way you expect. This is interesting, actually. What do you expect to happen when you set width on an implicit figure to 35%? Because pandoc will only set width on the image itself, not on the containing figure, which is a block element and thus has width implicitly set to 100% of the page.

piccolbo commented 1 year ago

I expect it not to add a width attribute to figure, which results in an epubcheck error when generating epub, which means I can't submit a book to Amazon kindle without editing the output. In the browser it doesn't seem to generate much harm. I am not a front end person, but I surmise from https://www.html.am/tags/html-figure-tag.cfm that the width attribute is just plain illegal there. If I don't run pandoc-crossref, this is the output:

<body epub:type="bodymatter">
<section id="the" class="level1 unnumbered">
<h1 class="unnumbered">The</h1>
<p><span class="citation" data-cites="fig:cyclomatic">@fig:cyclomatic</span>.</p>
<figure id="fig:cyclomatic">
<img src="../media/file0.svgz" style="width:35.0%" alt="Graphs corresponding to if and while statements resp." />
<figcaption aria-hidden="true"><p>Graphs corresponding to <code>if</code> and <code>while</code> statements resp.</p></figcaption>
</figure>
</section>
</body>

The expected output i think should be:

<body epub:type="bodymatter">
<section id="the" class="level1 unnumbered">
<h1 class="unnumbered">The</h1>
<p>fig. 1.</p>
<figure id="fig:cyclomatic">
<img src="../media/file0.svgz" style="width:35.0%" alt="Graphs corresponding to if and while statements resp." />
<figcaption><p>Figure 1: Graphs corresponding to <code>if</code> and <code>while</code> statements resp.</p></figcaption>
</figure>
</section>
</body>

I am not an HTML expert so I am guessing a little, but I tried the above and it passes epubcheck and renders the way I intend it to (the image occupies 35% of the width of the page).

lierdakil commented 1 year ago

Okay, what I was asking how do you expect it to render exactly, no need to be standoffish.

Why I ask is because rendering an image at 35% width is not what I would expect, my intuition rather says a figure should rather be 35% width (which would be useful if you had figure { float: left; } in your stylesheet or something)

Anyway, either way you look at it, it's a bug. For now, I've pushed a commit that disables copying image attributes to the wrapping figure element. If you need this ASAP, feel free to grab build artefacts from CI once that (hopefully) finishes. I'll figure out a release once I deal with other recent bugreports.