matthewwithanm / python-markdownify

Convert HTML to Markdown
MIT License

Space before `<ol>` or `<ul>` element incorrectly indents the first list item #144

Open chrispy-snps opened 1 week ago

chrispy-snps commented 1 week ago

If an `<ol>` or `<ul>` element follows another element on the same line, with a space between them:

from markdownify import markdownify as md

html = """
<p>Follow these steps:</p> <ol>
   <li><p>Do this.</p></li>
   <li><p>Do that.</p></li>
</ol>"""

print(md(html))

then that space incorrectly indents the first list item:

Follow these steps:

 1. Do this.
2. Do that.

As a workaround, we preprocess our Beautiful Soup object as follows:

# node_markdown is our parsed BeautifulSoup object.
for n in node_markdown.find_all(["ol", "ul"]):
    n.insert_before("\n")

jsm28 commented 1 week ago

Should be fixed by my PR #120.