kivikakk / cmark-gfm-hs

Haskell bindings to libcmark-gfm GitHub Flavored Markdown parser

[BUG] Incorrect position of one-line HTML comment tags #23

Open Sereja313 opened 2 years ago

Sereja313 commented 2 years ago

There seems to be a bug with single-line HTML comments: the endLine and endColumn fields of the resulting HTML_BLOCK node are always incorrect (both come back as 0).

Here is an example:

commonmarkToNode [] [] $ fromString "<!-- comment -->"

produces

Node (Just (PosInfo {startLine = 1, startColumn = 1, endLine = 1, endColumn = 16})) DOCUMENT [Node (Just (PosInfo {startLine = 1, startColumn = 1, endLine = 0, endColumn = 0})) (HTML_BLOCK "<!-- comment -->\n") []]
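To check whether other nodes in a parsed tree are affected, one option is to walk the tree and flag every node whose end line precedes its start line. This is only a sketch: `suspiciousNodes` and `endsBeforeStart` are names made up for illustration, not part of the library, and the check assumes that a sane position always has `endLine >= startLine`.

```haskell
import CMarkGFM (Node (..), NodeType (..), PosInfo (..))

-- | Assumption: a valid position never ends on an earlier line than it
-- starts on. The endLine = 0 case from the report fails this check.
endsBeforeStart :: PosInfo -> Bool
endsBeforeStart p = endLine p < startLine p

-- | Walk the tree depth-first and collect the type and position of every
-- node whose PosInfo looks inconsistent. Nodes without a PosInfo are skipped.
suspiciousNodes :: Node -> [(NodeType, PosInfo)]
suspiciousNodes (Node mpos ty children) =
  here ++ concatMap suspiciousNodes children
  where
    here = case mpos of
      Just p | endsBeforeStart p -> [(ty, p)]
      _ -> []
```

Applying `suspiciousNodes` to the result of the `commonmarkToNode` call above should report the HTML_BLOCK node on affected versions.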

kivikakk commented 2 years ago

Heya! This library doesn't do much more than wrap the C library, so I suspected this might come from the underlying cmark-gfm, which does seem to be the case:

$ printf '<!-- comment -->' | cmark-gfm --unsafe -t xml --sourcepos
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document sourcepos="1:1-1:16" xmlns="http://commonmark.org/xml/1.0">
  <html_block sourcepos="1:1-0:0" xml:space="preserve">&lt;!-- comment --&gt;
</html_block>
</document>

I suspect this case is getting hit here and it's not correctly adjusting for the free-content nature of inline HTML blocks. There's a possibly related bug mentioned upstream: https://github.com/commonmark/cmark/pull/298
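Until this is fixed upstream, a downstream workaround could be to rebuild the end position from the start position and the block's literal text. The following is a rough sketch, not something the library provides: `repairEnd` is a hypothetical helper, and it assumes that the reported start position is correct, that the block sits at the top level (not inside a list or block quote), and that columns are counted in UTF-8 bytes.

```haskell
import CMarkGFM (PosInfo (..))
import qualified Data.ByteString as BS
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Hypothetical workaround: if the end position looks broken
-- (endLine < startLine), recompute it from the start position and the
-- literal text of the HTML block. Otherwise return the position unchanged.
repairEnd :: Text -> PosInfo -> PosInfo
repairEnd literal pos
  | endLine pos >= startLine pos = pos -- looks sane, leave it alone
  | null ls = pos -- empty literal, nothing to measure
  | otherwise =
      pos
        { endLine = startLine pos + length ls - 1
        , endColumn = col
        }
  where
    ls = T.lines literal
    -- Width of the last line in bytes (assumed column unit).
    width = BS.length (TE.encodeUtf8 (last ls))
    -- A one-line block ends relative to its start column; a multi-line
    -- block is assumed to end at the width of its last line.
    col
      | length ls == 1 = startColumn pos + width - 1
      | otherwise = width
```

For the comment in this issue, the literal is `"<!-- comment -->\n"` with start 1:1, so the helper yields end 1:16, which matches the sourcepos reported for the enclosing document.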

kivikakk commented 2 years ago

This PR also looks promising (see item number 2 in the PR body): https://github.com/github/cmark-gfm/pull/210

Sereja313 commented 2 years ago

Oh I see. Thank you very much for your response!