r-lib / commonmark

High Performance CommonMark and Github Markdown Rendering in R
https://docs.ropensci.org/commonmark/

footnote XML output contains `<<unknown>>` tags #23

Open zkamvar opened 1 year ago

zkamvar commented 1 year ago

(edit: not sure how the reprex lost its formatting, but I fixed it)

The new footnotes feature might be useful for us at {tinkr}, but I'm not sure how to parse the output, because every footnote reference and every footnote definition is emitted with the same `<unknown>` tag.

from: https://github.com/ropensci/tinkr/issues/92#issuecomment-1479684407

txt <- c("a statement[^1][^2]\n", "[^1]: this is true", "[^2]: this is false")
commonmark::markdown_xml(txt, footnotes = TRUE) |> writeLines()
#> <?xml version="1.0" encoding="UTF-8"?>
#> <!DOCTYPE document SYSTEM "CommonMark.dtd">
#> <document xmlns="http://commonmark.org/xml/1.0">
#>   <paragraph>
#>     <text xml:space="preserve">a statement</text>
#>     <<unknown> />
#>     <<unknown> />
#>   </paragraph>
#>   <<unknown>>
#>     <paragraph>
#>       <text xml:space="preserve">this is true</text>
#>     </paragraph>
#>   </<unknown>>
#>   <<unknown>>
#>     <paragraph>
#>       <text xml:space="preserve">this is false</text>
#>     </paragraph>
#>   </<unknown>>
#> </document>

Created on 2023-03-22 with reprex v2.0.2
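
Until the tags are fixed at the source, one stopgap is to rename the malformed tags in the string output before handing it to an XML parser. The following is only a minimal sketch under stated assumptions: `patch_unknown_tags()` is a hypothetical helper, and the replacement names `footnote_reference` and `footnote_definition` are placeholders chosen for illustration, not names confirmed by commonmark or cmark-gfm. It assumes the self-closing form marks an inline reference and the paired form wraps a definition, as the output above suggests.

```r
# Hypothetical helper: rename the malformed <<unknown> tags so the XML parses.
# The replacement tag names are placeholders, not confirmed upstream names.
patch_unknown_tags <- function(xml,
                               ref_tag = "footnote_reference",
                               def_tag = "footnote_definition") {
  # The self-closing form appears inline, where a footnote is referenced.
  xml <- gsub("<<unknown> />", paste0("<", ref_tag, " />"), xml, fixed = TRUE)
  # The closing and opening forms wrap the body of a footnote definition.
  xml <- gsub("</<unknown>>", paste0("</", def_tag, ">"), xml, fixed = TRUE)
  gsub("<<unknown>>", paste0("<", def_tag, ">"), xml, fixed = TRUE)
}

# Lines copied from the malformed output in the reprex above.
broken <- c(
  '<paragraph>',
  '  <text xml:space="preserve">a statement</text>',
  '  <<unknown> />',
  '  <<unknown> />',
  '</paragraph>',
  '<<unknown>>',
  '  <paragraph>',
  '    <text xml:space="preserve">this is true</text>',
  '  </paragraph>',
  '</<unknown>>'
)

fixed <- patch_unknown_tags(broken)
writeLines(fixed)
```

This only makes the output well-formed. The output shown above carries no footnote label on either kind of node, so the renaming does not by itself say which reference belongs to which definition.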

jeroen commented 1 year ago

Ew, that's weird. I posted it upstream here: https://github.com/github/cmark-gfm/issues/316

zkamvar commented 5 months ago

I've added a potential fix for this in https://github.com/github/cmark-gfm/pull/362, but I'm not sure how active the maintainers are there.

yihui commented 2 days ago

I feel there is no hope that they will ever merge the fix upstream. @jeroen, do you think it would be easy enough to apply @zkamvar's patch to the commonmark R package here? Of course, I mean in a maintainable way (e.g., running git apply after pulling cmark-gfm), not as a one-time manual job.