jgm / pandoc

Universal markup converter
https://pandoc.org

A br inside a strong tag gets converted incorrectly #6974

Open Barsonax opened 3 years ago

Barsonax commented 3 years ago

When converting HTML to Markdown, a <br> placed inside a <strong> element produces incorrect Markdown:

Input

<strong>OK zu nicht bestätigtem Fehler <br></strong>

Output

**OK zu nicht bestätigtem Fehler  
**

Correct Output

**OK zu nicht bestätigtem Fehler**

https://pandoc.org/try/?text=%3Cstrong%3EOK+zu+nicht+best%C3%A4tigtem+Fehler+%3Cbr%3E%3C%2Fstrong%3E&from=html&to=gfm&standalone=0
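
To make the link above easier to read, here is a small sketch that decodes its query string back into the HTML input and an equivalent command line. The mapping of the `from`, `to` and `standalone` parameters onto pandoc's `--from`, `--to` and `--standalone` options is an assumption about how the try page works, and `decode_try_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, parse_qs

TRY_URL = (
    "https://pandoc.org/try/?text=%3Cstrong%3EOK+zu+nicht+best%C3%A4tigtem"
    "+Fehler+%3Cbr%3E%3C%2Fstrong%3E&from=html&to=gfm&standalone=0"
)


def decode_try_url(url):
    """Return (input_text, argv) for a pandoc.org/try link.

    Assumption: the query parameters correspond one-to-one to the
    pandoc options --from, --to and --standalone.
    """
    query = parse_qs(urlsplit(url).query)
    text = query["text"][0]
    argv = ["pandoc", "--from", query["from"][0], "--to", query["to"][0]]
    if query.get("standalone", ["0"])[0] == "1":
        argv.append("--standalone")
    return text, argv


text, argv = decode_try_url(TRY_URL)
print(repr(text))      # the HTML fragment fed to pandoc
print(" ".join(argv))  # a command line that should reproduce the report
```

Piping the decoded text into the printed command on a machine with pandoc installed should reproduce the output shown above, although the exact result may differ between pandoc versions.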

version

PS C:\git> pandoc --version
pandoc.exe 2.10.1
Compiled with pandoc-types 1.21, texmath 0.12.0.2, skylighting 0.8.5
Default user data directory: C:\Users\damric\AppData\Roaming\pandoc
Copyright (C) 2006-2020 John MacFarlane
Web:  https://pandoc.org
This is free software; see the source for copying conditions.
There is no warranty, not even for merchantability or fitness
for a particular purpose.

mb21 commented 3 years ago

Seems like a problem in the commonmark/gfm writer... as it works for -t markdown, right?

Barsonax commented 3 years ago

> Seems like a problem in the commonmark/gfm writer... as it works for -t markdown, right?

Also seems to be broken: https://pandoc.org/try/?text=%3Cstrong%3EOK+zu+nicht+best%C3%A4tigtem+Fehler+%3Cbr%3E%3C%2Fstrong%3E&from=html&to=markdown&standalone=0

mb21 commented 3 years ago

but that's correct, backslash followed by newline is pandoc-flavoured markdown for <br>, see https://pandoc.org/MANUAL.html#backslash-escapes

Barsonax commented 3 years ago

> but that's correct, backslash followed by newline is pandoc-flavoured markdown for <br>, see https://pandoc.org/MANUAL.html#backslash-escapes

But it causes the closing ** to end up on the wrong line, right?

I believe it should look like this:

Wrong:

**OK zu nicht bestätigtem Fehler\
**

Correct

**OK zu nicht bestätigtem Fehler**\
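
The rewrite proposed here amounts to moving a trailing line break out of the strong element before the Markdown is emitted. Below is a minimal sketch of that idea on a simplified, hypothetical model of inline nodes (plain tuples, not pandoc's actual types or writer code); `hoist_trailing_breaks` and `words` are illustrative names.

```python
# Hypothetical inline model, loosely inspired by pandoc's AST:
#   ("Str", text), ("Space",), ("LineBreak",), ("Strong", [children])


def hoist_trailing_breaks(inlines):
    """Move trailing Space/LineBreak nodes out of each Strong node."""
    out = []
    for node in inlines:
        if node[0] != "Strong":
            out.append(node)
            continue
        children = list(node[1])
        trailing = []
        # Peel whitespace-like nodes off the end, keeping their order.
        while children and children[-1][0] in ("Space", "LineBreak"):
            trailing.insert(0, children.pop())
        if children:
            out.append(("Strong", children))
        out.extend(trailing)
    return out


def words(text):
    """Turn plain text into Str nodes separated by Space nodes."""
    nodes = []
    for word in text.split():
        if nodes:
            nodes.append(("Space",))
        nodes.append(("Str", word))
    return nodes


before = [("Strong", words("OK zu nicht bestätigtem Fehler") + [("LineBreak",)])]
after = hoist_trailing_breaks(before)
print(after[-1])  # ('LineBreak',)
```

Note that the result is a different tree: the line break ends up as a sibling of the strong element rather than a child of it, so converting back to HTML would no longer place the `<br>` inside `<strong>`.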

mb21 commented 3 years ago

Well, that would be kind of nicer for a human to read, but it's not exactly the same content anymore.

**OK zu nicht bestätigtem Fehler\
**

converts perfectly fine back to the same HTML:

<p><strong>OK zu nicht bestätigtem Fehler<br />
</strong></p>

Barsonax commented 3 years ago

OK, then it seems only the GitHub-flavored version is broken: https://pandoc.org/try/?text=**OK+zu+nicht+best%C3%A4tigtem+Fehler%0A**&from=gfm&to=html5&standalone=0
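
A likely explanation, based on my reading of the CommonMark spec rather than anything stated in this thread: a `**` run can only close strong emphasis if it is right-flanking, which requires that it is not preceded by whitespace, and a line ending counts as whitespace. The sketch below checks that rule for the two outputs discussed here. It is a simplification: punctuation is limited to ASCII, and `is_right_flanking` is a hypothetical helper, not part of any parser.

```python
import string


def _char_at(text, index):
    """Return the character at index, or None outside the string."""
    return text[index] if 0 <= index < len(text) else None


def _is_whitespace(ch):
    # The start and end of the input are treated like whitespace.
    return ch is None or ch.isspace()


def _is_punctuation(ch):
    # Simplification: ASCII punctuation only.
    return ch is not None and ch in string.punctuation


def is_right_flanking(text, start, end):
    """Check whether the delimiter run text[start:end] is right-flanking."""
    before = _char_at(text, start - 1)
    after = _char_at(text, end)
    if _is_whitespace(before):
        return False
    if not _is_punctuation(before):
        return True
    return _is_whitespace(after) or _is_punctuation(after)


broken = "**OK zu nicht bestätigtem Fehler\n**"
fixed = "**OK zu nicht bestätigtem Fehler**"

print(is_right_flanking(broken, len(broken) - 2, len(broken)))  # False
print(is_right_flanking(fixed, len(fixed) - 2, len(fixed)))     # True
```

Under this rule the closing `**` on its own line cannot end the strong span, which would explain why a CommonMark-based reader renders the asterisks literally.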