mixmark-io / turndown

🛏 An HTML to Markdown converter written in JavaScript
https://mixmark-io.github.io/turndown
MIT License

Lists over 9 items long break with indented content #410

Open FossPrime opened 2 years ago

FossPrime commented 2 years ago

The following HTML:

<ol start="9">
<li>
    <p><strong>I came in like a wrecking ball</strong></p>
    <p><img src="https://upload.wikimedia.org/wikipedia/commons/b/bd/Test.svg" alt=""></p>
</li>
<li>
    <p><strong>I never hit so hard in love</strong></p>
    <p><img src="https://upload.wikimedia.org/wikipedia/commons/b/bd/Test.svg" alt=""></p>
</li>
</ol>

Should be converted to:

 9. **I came in like a wrecking ball**

    ![](https://upload.wikimedia.org/wikipedia/commons/b/bd/Test.svg)

10. **I never hit so hard in love**

    ![](https://upload.wikimedia.org/wikipedia/commons/b/bd/Test.svg)

Instead, it is converted to this output, which is not CommonMark-compliant:

9.  **I came in like a wrecking ball**

    ![](https://upload.wikimedia.org/wikipedia/commons/b/bd/Test.svg)

10.  **I never hit so hard in love**

    ![](https://upload.wikimedia.org/wikipedia/commons/b/bd/Test.svg)

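This creates a code_block instead of regular text and images because of the CommonMark rule for list item indentation: continuation content belongs to a list item only if it is indented at least as far as the item's content, that is, by the width of the list marker plus the spaces that follow it. For `10.  ` that is five columns, so the image line indented by four spaces falls outside the list item and is parsed as an indented code block.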
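To make the indentation arithmetic concrete, here is a minimal sketch of the check described above. It is a simplified reading of the CommonMark list item rule, not the spec's full algorithm, and `contentColumn` and `staysInItem` are hypothetical helper names, not part of turndown or any CommonMark parser.

```javascript
// Simplified sketch: returns the column at which an ordered list item's
// content starts, or null if the line does not start with an ordered marker.
// Assumption: only handles markers followed by 1 to 4 spaces and some text.
function contentColumn(line) {
  const match = /^( {0,3})(\d{1,9})([.)])( {1,4})(?=\S)/.exec(line);
  if (!match) return null;
  const [, indent, digits, delimiter, spaces] = match;
  return indent.length + digits.length + delimiter.length + spaces.length;
}

// True if a continuation line is indented far enough to stay inside the item.
function staysInItem(markerLine, continuationLine) {
  const column = contentColumn(markerLine);
  if (column === null) return false;
  const indent = continuationLine.length - continuationLine.trimStart().length;
  return indent >= column;
}

const image = '    ![](https://upload.wikimedia.org/wikipedia/commons/b/bd/Test.svg)';
console.log(staysInItem('9.  **I came in like a wrecking ball**', image));
console.log(staysInItem('10.  **I never hit so hard in love**', image));
console.log(staysInItem('10. **I never hit so hard in love**', image));
```

Under these assumptions, the four-space continuation line fits the `9.  ` item (content column 4) but not the `10.  ` item (content column 5), while the expected `10. ` form brings the content column back to 4.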

Proposed solution

Remove whatever is adding the unwanted extra space between the list number and the item text. If the padding is intentional, move it to the beginning of the line, before the list number, rather than placing it between the list marker and the item text.
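As a rough illustration of that proposal, the sketch below left-pads the marker instead of padding after it, so every item's content starts at the same column. It is a standalone approximation and not turndown's actual list item rule; `formatOrderedItem` and `formatOrderedList` are hypothetical names, and the content strings stand in for already-converted Markdown.

```javascript
// Hypothetical helper: formats one ordered list item. The marker is padded on
// the left to `width` characters, followed by exactly one space, and
// continuation lines are indented to match the marker's total length.
function formatOrderedItem(number, content, width) {
  const marker = `${number}.`.padStart(width) + ' ';
  const indent = ' '.repeat(marker.length);
  const body = content
    .replace(/^\n+/, '') // drop leading newlines
    .replace(/\n+$/, '') // drop trailing newlines
    .replace(/\n(?=[^\n])/g, '\n' + indent); // indent non-blank continuation lines
  return marker + body;
}

// Hypothetical helper: formats a whole list, sizing markers to the widest number.
function formatOrderedList(items, start = 1) {
  const last = start + items.length - 1;
  const width = String(last).length + 1; // digits plus the "." delimiter
  return items
    .map((content, i) => formatOrderedItem(start + i, content, width))
    .join('\n\n');
}

const url = 'https://upload.wikimedia.org/wikipedia/commons/b/bd/Test.svg';
console.log(formatOrderedList([
  `**I came in like a wrecking ball**\n\n![](${url})`,
  `**I never hit so hard in love**\n\n![](${url})`,
], 9));
```

With `start` set to 9, this should print the ` 9. ` and `10. ` layout shown in the expected output above. In turndown itself, similar logic could presumably be supplied as a custom rule for `li` elements via `addRule`, though that wiring is an assumption and is not shown here.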

Making that change currently breaks 14 tests:

not ok 103 should be equal
not ok 104 should be equal
not ok 105 should be equal
not ok 106 should be equal
not ok 107 should be equal
not ok 108 should be equal
not ok 115 should be equal
not ok 116 should be equal
not ok 119 should be equal
not ok 120 should be equal
not ok 127 should be equal
not ok 128 should be equal
not ok 139 should be equal
not ok 140 should be equal

Additional notes: