Alir3z4 / html2text

Convert HTML to Markdown-formatted text.
alir3z4.github.io/html2text/
GNU General Public License v3.0

<ul> nested inside <ol> needs three space indent, not two #344

Closed: snarfed closed this issue 3 years ago

snarfed commented 3 years ago

First off, thank you for maintaining html2text, it's great!

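Right now, list items are always indented two spaces per enclosing list. This is generally right, but it doesn't work for <ul>s nested inside <ol>s. Those need at least three spaces instead, because nested content has to line up with the text that follows the "1. " marker, and that marker is three characters wide. Details are in the CommonMark spec.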

I have a PR ready for this; I'll submit it in a minute.

For example, this HTML:

<ol>
  <li>ordered</li>
  <ul>
    <li>unordered</li>
  </ul>
</ol>

needs to convert to this, with five spaces before * unordered instead of four:

  1. ordered
     * unordered
  3. end

Tested against HEAD with multiple Python 3 versions.
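
To make the indentation rule concrete, here is a minimal sketch in Python. It is not the code from the PR or from html2text itself: the function name `item_indent` and the `list_stack` argument are hypothetical, and the sketch assumes only what this issue describes, namely two spaces per enclosing list, with three spaces for a <ul> whose direct parent is an <ol>.

```python
def item_indent(list_stack):
    """Return the number of leading spaces for a list item.

    list_stack is a hypothetical representation of the open lists,
    outermost first, e.g. ["ol", "ul"] for an item in a <ul> that is
    nested inside an <ol>.
    """
    indent = 0
    parent = None
    for tag in list_stack:
        # A <ul> directly inside an <ol> gets three spaces so its marker
        # lines up with the text after "1. "; every other level gets two.
        indent += 3 if (parent == "ol" and tag == "ul") else 2
        parent = tag
    return indent


# Rebuild the first two lines of the expected output from this issue.
lines = [
    " " * item_indent(["ol"]) + "1. ordered",
    " " * item_indent(["ol", "ul"]) + "* unordered",
]
print("\n".join(lines))
```

With these assumptions the nested item gets five leading spaces (2 + 3) rather than four (2 + 2). Other nestings, such as an <ol> inside an <ol>, stay at two spaces per level here, since this issue only covers <ul> inside <ol>.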