trentm / python-markdown2

markdown2: A fast and complete implementation of Markdown in Python

HTML elements are replaced by hashes #508

Closed: invisibleroads closed this issue 1 year ago

invisibleroads commented 1 year ago

Python 3.10.8, markdown2==2.4.8

Code

x = '''
<div><div></div>
</div>
<div></div>
<div></div>
<div></div>

- A
'''
from markdown2 import markdown
print(markdown(x))

Output

<div><div></div>

</div>

md5-516b9f5b6ec51f43a8ff878841e574fa

md5-516b9f5b6ec51f43a8ff878841e574fa

md5-516b9f5b6ec51f43a8ff878841e574fa

<ul>

<p><li>A</li>
</ul></p>

invisibleroads commented 1 year ago

The current workaround is to make sure that there is a newline after the first opening div tag.
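
A minimal sketch of that workaround, applied to the input from the report. The helper `split_after_first_open_tag` is hypothetical (it is not part of markdown2); it only inserts a newline after the first opening tag that is immediately followed by another opening tag on the same line. The final call to `markdown` runs only if markdown2 is installed.

```python
import re


def split_after_first_open_tag(text: str, tag: str = "div") -> str:
    """Hypothetical helper: put a newline after the first opening <tag>
    that is directly followed by another opening tag on the same line."""
    # Match "<div ...>" only when the next thing on the line is "<" but not "</".
    pattern = re.compile(rf"(<{tag}\b[^>]*>)(?=[ \t]*<(?!/))")
    return pattern.sub(r"\1\n", text, count=1)


# The input from the report, written with explicit newlines.
original = "\n<div><div></div>\n</div>\n<div></div>\n<div></div>\n<div></div>\n\n- A\n"
workaround = split_after_first_open_tag(original)
print(workaround)

# Optional: render the adjusted text if markdown2 is available.
try:
    from markdown2 import markdown
except ImportError:
    markdown = None

if markdown is not None:
    print(markdown(workaround))
```

This only rewrites the input text; whether a given markdown2 version still needs the workaround depends on whether it includes the fix discussed below.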

Crozzers commented 1 year ago

This one comes down to the first line having two opening tags but only one closing tag. The parser doesn't realise that the first tag is still open, and that is what messes it up.

I've managed to get a patch working, but I'll see if I can clean it up a bit before I submit a PR.
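
To illustrate the condition described above (two opening tags and one closing tag on the first line), here is a rough sketch of a per-line tag balance check. It is not markdown2's actual parsing logic or the patch itself; `TAG_RE` and `unclosed_tags` are names made up for this example, and the regex handles only simple tags.

```python
import re

# Simplified tag matcher: captures an optional leading "/", the tag name,
# and an optional trailing "/" for self-closing tags such as <br/>.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)\b[^>]*?(/?)>")


def unclosed_tags(line: str) -> list[str]:
    """Return the names of tags opened on this line but not closed on it."""
    stack = []
    for match in TAG_RE.finditer(line):
        closing, name, self_closing = match.group(1), match.group(2).lower(), match.group(3)
        if self_closing:
            continue  # <br/> and similar neither open nor close anything
        if closing:
            # Pop only when the closing tag matches the most recent open tag.
            if stack and stack[-1] == name:
                stack.pop()
        else:
            stack.append(name)
    return stack


print(unclosed_tags("<div><div></div>"))  # ['div']
print(unclosed_tags("<div></div>"))       # []
```

The first line of the reported input leaves one `div` open, while the later `<div></div>` lines are balanced.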

invisibleroads commented 1 year ago

Regarding the test for the fix, the unordered list at the end is important. If you omit the unordered list, the HTML elements are not replaced by hashes.

In [1]: x = '''
   ...: <div><div></div>
   ...: </div>
   ...: <div></div>
   ...: <div></div>
   ...: <div></div>
   ...: 
   ...: - A
   ...: '''
   ...: from markdown2 import markdown
   ...: print(markdown(x))
   ...: 
<div><div></div>

</div>

md5-688f5ae04dff1a2c8b9235b8be711c7e

md5-688f5ae04dff1a2c8b9235b8be711c7e

md5-688f5ae04dff1a2c8b9235b8be711c7e

<ul>

<p><li>A</li>
</ul></p>

In [2]: x = '''
   ...: <div><div></div>
   ...: </div>
   ...: <div></div>
   ...: <div></div>
   ...: <div></div>
   ...: 
   ...: '''
   ...: from markdown2 import markdown
   ...: print(markdown(x))
<div><div></div>

<p></div></p>

<div></div>

<div></div>

<div></div>

invisibleroads commented 1 year ago

Oh, never mind. I see the test has the unordered list at the end. Sorry about that.

Crozzers commented 1 year ago

No worries! I had originally omitted it, assuming it was unimportant, but then realised the mistake.