pandoc / lua-filters

A collection of lua filters for pandoc
MIT License

broken crossref since pandoc 3 in interaction with diagram-generator #371 #261

Closed piccolbo closed 1 year ago

piccolbo commented 1 year ago

Since upgrading to pandoc 3, all refs to diagram-generator-generated figs, and only those, are broken. I first reported the issue to pandoc-crossref and provided an MRE there:

https://github.com/lierdakil/pandoc-crossref/issues/371

Not sure where the problem is exactly, but pandoc and pandoc-crossref have changed since the last time it worked, diagram-generator hasn't.

Thanks!

jgm commented 1 year ago

The problem is that diagram-generator is assuming that a figure will be a Para with an Image in it, but in pandoc 3.0 a figure is a Figure element. Whoever is maintaining the filter will need to update it. (Ideally it could behave differently depending on the pandoc-types version, but if not, it should presumably support the latest.)

piccolbo commented 1 year ago

Thanks for the analysis.

piccolbo commented 1 year ago

I've seen @tarleb active on this filter. I see that a pandoc.Para(pandoc.Image(...)) is returned at line 389. Since there isn't a pandoc.Figure that I can see in the docs, do we need to just drop the pandoc.Para()?

jgm commented 1 year ago

Ah, that's an omission in the lua-filter docs which we should fix, @tarleb. pandoc.Figure does exist. You can do something like this:

pandoc.Figure({pandoc.Plain{pandoc.Image("alt text","img.jpg")}}, {"caption"}, {id="foo"})

jgm commented 1 year ago

you can use the global PANDOC_API_VERSION to check the pandoc-types version.

if PANDOC_API_VERSION[1] > 1 or
   (PANDOC_API_VERSION[1] == 1 and PANDOC_API_VERSION[2] >= 23) then
  -- pandoc-types 1.23 or later: use Figure
else
  -- do what you did before
end

tarleb commented 1 year ago

I've pushed a quick fix; let me know if it doesn't work.

tarleb commented 1 year ago

@jgm: I don't know how I missed that, I'll fix it.

When comparing Version objects, pandoc tries to convert the other value to a version as well, so we can do things like

if PANDOC_API_VERSION >= '1.23' and
   PANDOC_VERSION >= 3 then
  -- use Figure
end
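
Putting the two suggestions above together, a filter could pick the figure representation at run time. What follows is only a minimal sketch, not the fix that was actually pushed: `make_figure` and its parameters are hypothetical names, and it assumes that `pandoc.Figure` and the Version comparison behave as described in the comments above, and that older pandoc versions treat a Para holding a single Image whose title starts with `fig:` as a figure.

```lua
-- Hypothetical helper (not part of diagram-generator): wrap an image in
-- whichever figure representation the running pandoc understands.
local function make_figure (src, caption_text, identifier)
  local caption = {pandoc.Str(caption_text)}
  if PANDOC_API_VERSION >= '1.23' then
    -- pandoc 3 and later: figures are Figure block elements that carry
    -- the identifier and the caption themselves.
    local img = pandoc.Image(caption, src)
    return pandoc.Figure(
      {pandoc.Plain{img}},
      {pandoc.Plain(caption)},
      pandoc.Attr(identifier)
    )
  else
    -- Older pandoc (assumption stated above): a figure is a Para holding
    -- a single Image whose title starts with "fig:".
    local img = pandoc.Image(caption, src, 'fig:', pandoc.Attr(identifier))
    return pandoc.Para{img}
  end
end
```

A filter would call this wherever it currently returns `pandoc.Para(pandoc.Image(...))`, for example `make_figure("img.jpg", "caption", "foo")`.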

piccolbo commented 1 year ago

It seems to be working overall, but I've run into a problem. The width attribute isn't handled quite right, so the EPUB output no longer passes epubcheck with the 3.3 rules (it did with previous versions). The error is

ERROR(RSC-005): output/book.epub/EPUB/text/ch002.xhtml(612,40): Error while parsing file: attribute "width" not allowed here;

and the incriminating lines are (starting at 612)

<figure id="fig:ancestry" width="100%">
<img src="../media/file3.png" style="width:100.0%" alt="A portion of the ancestry of Charles II of Spain." />
<figcaption><p>Figure 4: A portion of the ancestry of Charles II of Spain.<a href="#fn49" class="footnote-ref" id="fnref49" epub:type="noteref">49</a></p></figcaption>
</figure>

piccolbo commented 1 year ago

The markdown is:

{.graphviz #fig:ancestry caption="A portion of the ancestry of Charles II of Spain.^[Names are disambiguated where needed with the year of birth.]" width=100%}
digraph {

Removing the width attribute resolves the error.

tarleb commented 1 year ago

I can't reproduce this. Can you post a full example, including the pandoc command used for the conversion?

piccolbo commented 1 year ago

The EPUB displays normally in Apple Books or the calibre viewer, but passing epubcheck is important, for instance for submission to Kindle Direct Publishing (aka Amazon). What epubcheck is complaining about, AFAICT, is a width attribute on the figure tag. There is also a width inside a style attribute on the following img tag, an apparent duplication. I hope this enables you to reproduce and fix it, but please let me know if I can help.

MRE

---
title: "The"
---

@fig:cyclomatic.

```{.graphviz #fig:cyclomatic caption="Graphs corresponding to `if` and `while` statements resp." width=100%}
graph {
  if -- then; if--else; then -- "end if" ; else -- "end if" ;
  }
```

Command line and output (please substitute your own path to diagram-generator.lua):

pandoc  --embed-resources --standalone  --lua-filter=filters/diagram-generator-new.lua  --filter pandoc-crossref  diagram-test.md -o diagram-test.epub
antonio@... > epubcheck diagram-test.epub 
Validating using EPUB version 3.3 rules.
ERROR(RSC-005): diagram-test.epub/EPUB/text/ch001.xhtml(16,42): Error while parsing file: attribute "width" not allowed here; expected attribute "about", "accesskey", "aria-activedescendant", "aria-atomic", "aria-autocomplete", "aria-busy", "aria-checked", "aria-colcount", "aria-colindex", "aria-colspan", "aria-controls", "aria-current", "aria-describedby", "aria-description", "aria-details", "aria-disabled", "aria-dropeffect", "aria-errormessage", "aria-expanded", "aria-flowto", "aria-grabbed", "aria-haspopup", "aria-hidden", "aria-invalid", "aria-keyshortcuts", "aria-label", "aria-labelledby", "aria-level", "aria-live", "aria-modal", "aria-multiline", "aria-multiselectable", "aria-orientation", "aria-owns", "aria-posinset", "aria-pressed", "aria-readonly", "aria-relevant", "aria-required", "aria-roledescription", "aria-rowcount", "aria-rowindex", "aria-rowspan", "aria-selected", "aria-setsize", "aria-sort", "aria-valuemax", "aria-valuemin", "aria-valuenow", "aria-valuetext", "autocapitalize", "autofocus", "class", "content", "contenteditable", "datatype", "dir", "draggable", "epub:type", "hidden", "inlist", "inputmode", "is", "itemid", "itemprop", "itemref", "itemscope", "itemtype", "lang", "nonce", "ns:alphabet", "ns:ph", "onabort", "onauxclick", "onblur", "oncancel", "oncanplay", "oncanplaythrough", "onchange", "onclick", "onclose", "oncontextmenu", "oncopy", "oncuechange", "oncut", "ondblclick", "ondrag", "ondragend", "ondragenter", "ondragleave", "ondragover", "ondragstart", "ondrop", "ondurationchange", "onemptied", "onended", "onerror", "onfocus", "onfocusin", "onfocusout", "onformdata", "oninput", "oninvalid", "onkeydown", "onkeypress", "onkeyup", "onload", "onloadeddata", "onloadedmetadata", "onloadstart", "onmousedown", "onmouseenter", "onmouseleave", "onmousemove", "onmouseout", "onmouseover", "onmouseup", "onpaste", "onpause", "onplay", "onplaying", "onprogress", "onratechange", "onreset", "onresize", "onscroll", "onsecuritypolicyviolation", 
"onseeked", "onseeking", "onselect", "onslotchange", "onstalled", "onsubmit", "onsuspend", "ontimeupdate", "ontoggle", "ontransitioncancel", "ontransitionend", "ontransitionrun", "ontransitionstart", "onvolumechange", "onwaiting", "onwheel", "prefix", "property", "rel", "resource", "rev", "role", "slot", "spellcheck", "style", "tabindex", "title", "translate", "typeof", "vocab", "xml:base", "xml:lang" or "xml:space" (with xmlns:ns="http://www.w3.org/2001/10/synthesis")

Check finished with errors
Messages: 0 fatals / 1 error / 0 warnings / 0 infos

EPUBCheck completed

jgm commented 1 year ago

We should change things so that width and height attributes aren't moved from the image to the figure. I think this concerns the implicit_figures implementation in the Markdown reader. Could you open an issue on jgm/pandoc for this?

tarleb commented 1 year ago

Thank you for the example. It seems that pandoc-crossref is adding the "width" attribute to the figure, so this should be reported at https://github.com/lierdakil/pandoc-crossref/issues

As a workaround, you can add another filter that removes the attribute:

function Figure (fig)
  fig.attributes.width = nil
  return fig
end

Make sure it runs after pandoc-crossref.
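
Since the comment above mentions both width and height being moved to the figure, a slightly extended variant of the workaround might strip both. This is only a sketch under that assumption: the file name `strip-figure-size.lua` and the `unwanted` list are hypothetical, and the width and height on the image itself are left untouched.

```lua
-- strip-figure-size.lua (hypothetical file name)
-- Attribute names that should not end up on the <figure> element.
local unwanted = {'width', 'height'}

function Figure (fig)
  -- Assigning nil removes the key/value pair from the figure's attributes.
  for _, name in ipairs(unwanted) do
    fig.attributes[name] = nil
  end
  return fig
end
```

pandoc applies filters in the order in which they are given on the command line, so this one would go last, e.g. `--filter pandoc-crossref --lua-filter=strip-figure-size.lua`.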