jgm / pandoc

Universal markup converter
https://pandoc.org

ICML writer: Write HyperlinkTextDestination for Div elements #6965

Open nathan-artist opened 3 years ago

nathan-artist commented 3 years ago

The changes by @lrosenthol that implemented internal document links in the ICML writer a few months ago (addressing issue #5541) work great in general. But I have run into a problem: when writing ICML with --citeproc --metadata=link-citations:true (using the latest pandoc, 2.11.3.1) to render inline citations linked to entries in the reference list, the HyperlinkTextSource and Hyperlink elements for the inline citations are written properly, but the HyperlinkTextDestination element is missing for each item in the reference list. As a result, linked citations do not work.

The missing HyperlinkTextDestination elements correspond to Div elements in pandoc's AST, so I suspect this problem is directly related to a comment by @lrosenthol in PR #6606:

I've implemented proper support for named destinations/links/anchors (whatever they are called) for most of the standard types. I didn't do it for images and special divs - and will file a separate issue for those.

I am going to try writing my own filter as a workaround that adds the missing HyperlinkTextDestination elements, but it would be better to have this implemented in the ICML writer itself, so I opened this issue in the hope that someone will take it on.

nathan-artist commented 3 years ago

In case it helps anyone, here is the Lua filter I wrote as a workaround. It writes HyperlinkTextDestination elements for the references in a pandoc-citeproc-generated reference list so that linked citations work. The filter inserts a Span element into pandoc's AST around the first inline element of each reference list entry. The Span carries the pandoc-citeproc cite identifier, percent-encoded by a local function: the identifier has to be percent-encoded to match the Hyperlink element, but the ICML writer, strangely, does not do the percent-encoding itself. I don't know if this is the best solution, but it worked for me.

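-- Wrap the first inline of each citeproc reference entry in a Span whose id
-- is the percent-encoded entry identifier.
function Div (el)
  if el.classes[1] == "csl-entry" then
    -- Percent-encode reserved and control characters (lowercase hex) so the
    -- id matches the one used by the Hyperlink element.
    local cite_id = string.gsub(el.identifier, "([&=:;+%c])", function (c)
      return string.lower(string.format("%%%02X", string.byte(c)))
    end)
    return pandoc.walk_block(el, {
      Para = function (para)
        if para.content[1] then  -- skip empty paragraphs
          para.content[1] = pandoc.Span({para.content[1]}, {id = cite_id})
        end
        return para
      end })
  end
end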
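
The percent-encoding step can be tried on its own in a plain Lua interpreter, outside pandoc. This is only a minimal sketch: the function name `percent_encode` and the sample identifiers are made up for illustration, and just the pattern and the lowercase hex formatting are taken from the filter above.

```lua
-- Hypothetical helper: same pattern and formatting as the filter's inline
-- gsub call, pulled out so it can be run without pandoc.
local function percent_encode(id)
  -- The extra parentheses drop gsub's second return value (the match count).
  return (string.gsub(id, "([&=:;+%c])", function (c)
    return string.lower(string.format("%%%02X", string.byte(c)))
  end))
end

-- Sample identifiers are invented for illustration.
print(percent_encode("ref-doe:2020"))   -- ref-doe%3a2020
print(percent_encode("ref-smith2019"))  -- ref-smith2019 (nothing to encode)
```

Identifiers that contain none of the characters `& = : ; +` and no control characters pass through unchanged, so the encoding only matters for cite keys such as `doe:2020`.
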
nmueller18 commented 3 years ago

Great filter! I had the same problem, and your script seems to solve it. Perhaps not the most elegant solution, but it works. Thank you very much!

lealbaugh commented 2 years ago

Thanks @nathan-artist! I made a slight modification for anyone using a citation style whose entries start with a nested span (e.g. the ACM style, in my case). In that case, the id apparently has to be applied to the first child element itself:

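-- Variant for citation styles whose entries begin with a nested Span
-- (e.g. ACM): set the id on that first element instead of wrapping it.
function Div (el)
  if el.classes[1] == "csl-entry" then
    local cite_id = string.gsub(el.identifier, "([&=:;+%c])", function (c)
      return string.lower(string.format("%%%02X", string.byte(c)))
    end)
    return pandoc.walk_block(el, {
      Para = function (paraEl)
        local first = paraEl.content[1]
        if first and first.attr then
          -- The first inline already carries attributes: reuse it as the anchor.
          first.attr.identifier = cite_id
        elseif first then
          -- Otherwise wrap it in a new Span, as in the original filter.
          paraEl.content[1] = pandoc.Span({first}, {id = cite_id})
        end
        return paraEl
      end })
  end
end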
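
The two branches in this variant can be illustrated without pandoc by letting plain Lua tables stand in for inline elements. Everything in this sketch is an assumption made for illustration: `anchor_first`, the `t`, `attr`, `content` and `text` fields, and the sample ids are mock names, not pandoc's API.

```lua
-- Plain tables stand in for pandoc inlines here; this is not pandoc's API.
local function anchor_first(inlines, id)
  local first = inlines[1]
  if first == nil then
    return inlines                 -- empty paragraph: nothing to anchor
  elseif first.attr then
    first.attr.identifier = id     -- e.g. a nested Span: reuse it
  else
    -- e.g. a bare Str: wrap it in a new Span-like table carrying the id
    inlines[1] = { t = "Span", attr = { identifier = id }, content = { first } }
  end
  return inlines
end

-- Entry that starts with a Span (as described for the ACM style).
local acm = anchor_first(
  { { t = "Span", attr = { identifier = "" }, content = {} } }, "ref-a")
-- Entry that starts with plain text.
local plain = anchor_first({ { t = "Str", text = "Doe" } }, "ref-b")

print(acm[1].t, acm[1].attr.identifier)
print(plain[1].t, plain[1].attr.identifier, plain[1].content[1].text)
```

In both cases the first element ends up as a Span-like table carrying the id; the difference is only whether the existing element is reused or a new wrapper is created around it.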