Open xrat opened 2 years ago
for what it's worth, the following works:
Term
: Def
<!-- : comment def
-->
not sure this is a bug... just an edge case in pandoc's Markdown syntax...
@mb21 your suggested workaround is not feasible for larger comments b/c any line starting with : triggers the bug. IMHO this warrants the label bug.
FWIW, multimarkdown and kramdown give the same result as pandoc.
Pandoc's CommonMark parser, with the definition_lists extension enabled, behaves more like you expect; try with --from=commonmark_x.
This happens because the way the def list parser works is by gobbling up raw lines comprising the definition, and then parsing after the fact. This method isn't sophisticated enough to skip a multiline HTML comment. Here's another case worth considering.
Term
: Def
test <!--
: comment def
and -->
Note that this case is parsed by the commonmark+definition_lists parser as
<dl>
<dt>Term</dt>
<dd>Def test <!–
</dd>
<dd>comment def and –>
</dd>
</dl>
which is correct given the CommonMark principle that block-level structure takes precedence over inline-level structure.
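To make the "gobble raw lines first, parse inlines afterwards" behaviour concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not pandoc's actual (Haskell) implementation: the function name `split_definitions` is made up for illustration, and it ignores indentation, blank lines, and multiple terms. It shows why the comment from the example above ends up split across two definitions.

```python
def split_definitions(block: str):
    """Toy model: split a definition-list block into (term, raw definitions)
    purely on line starts, before any inline parsing happens."""
    lines = block.splitlines()
    term, definitions = lines[0], []
    for line in lines[1:]:
        if line.startswith(":"):
            # A leading ':' always opens a new definition, even inside '<!--'.
            definitions.append(line[1:].strip())
        elif definitions:
            # Any other line is treated as a continuation of the current definition.
            definitions[-1] += " " + line.strip()
    return term, definitions


example = "Term\n: Def\ntest <!--\n: comment def\nand -->"
term, defs = split_definitions(example)
print(term)  # Term
print(defs)  # ['Def test <!--', 'comment def and -->']
```

By the time an inline parser sees each definition, neither one contains a complete `<!-- ... -->` pair, so neither can be recognised as a comment.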
This doesn't just happen with HTML comments. It can also occur with other inline elements like inline code. The output makes sense based on the parsing algorithm, but it is somewhat surprising from a user perspective to discover that the validity of inline code depends on line break locations.
Here's an example with inline code:
Term
: Def
`code
: comment def
more code`
Pandoc Markdown produces this:
<dl>
<dt>Term</dt>
<dd>Def `code
</dd>
<dd>comment def more code`
</dd>
</dl>
And CommonMark (-f commonmark+definition_lists) gives the same thing. Simply removing the line break before : comment results in valid inline code.
If you indent your definitions properly, you're less likely to run into problems like this:
Term
: Def
`code
: comment def
more code`
versus
Term
: Def
`code
: comment def
more code`
I am very thankful for Pandoc and its contributors. So please excuse me for asking for clarification: why is it acceptable in this case that an HTML comment of the form <!-- (...) --> is not parsed as a comment by Pandoc's Markdown? I do not know of any other such case.
I don't think the issue should have been closed.
Same here with a list:
* foo
<!--
* bar
-->
* baz
* baz
<!--
* qux
-->
* end
$ pandoc --from markdown --to html5 list-with-comment.md
<ul>
<li>foo <!--
* bar
--></li>
<li>baz
<ul>
<li>baz <!–</li>
</ul></li>
<li>qux –></li>
<li>end</li>
</ul>
In the following minimal example the HTML comment <!-- (…) --> is falsely not recognized as a comment:
Pandoc v2.16.2 with pandoc --from markdown --to html5 produces
pandoc --strip-comments produces the same output as above. A workaround is to put any character in front of the commented :. In other words, the commented : (or a ~) is what triggers the bug.
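For what it's worth, if the comments are not needed in the output anyway (as the use of --strip-comments suggests), one possible stopgap is to remove them from the source before pandoc ever sees it. The following is a minimal sketch, not something pandoc provides: `strip_html_comments` is a hypothetical helper, and the regex is deliberately naive, so it would also remove comment-like text inside code blocks.

```python
import re

# Non-greedy match across newlines, so each comment is removed separately.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_html_comments(markdown: str) -> str:
    """Remove every <!-- ... --> span, including multi-line ones.

    Naive on purpose: it does not know about code blocks or code spans.
    """
    return COMMENT_RE.sub("", markdown)


source = "Term\n: Def\n<!--\n: comment def\n-->\n"
cleaned = strip_html_comments(source)
print(cleaned)
```

The cleaned text can then be piped into pandoc --from markdown --to html5 as usual. Since no line of the commented block survives, no leading : is left to start a stray definition.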