Open karolzlot opened 2 years ago
This is how it looks in the debugger:
Hello,
I reproduced the issue:
>>> from html2text import HTML2Text
>>> html2text = HTML2Text()
>>> html2text.handle("hello")
'hello\n\n'
>>> html2text.handle("hello")
'hello\n\n' # Consistent without HTML tags.
>>>
>>> html2text = HTML2Text()
>>> html2text.handle("<h2>hello</h2>")
'## hello\n\n'
>>> html2text.handle("<h2>hello</h2>")
'\n\n## hello\n\n' # We have an extra '\n\n' at the beginning
>>>
>>> html2text = HTML2Text()
>>> html2text.handle("<strong>hello</strong>")
'**hello**\n\n'
>>> html2text.handle("<strong>hello</strong>")
' **hello**\n\n' # We have an extra ' ' at the beginning
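The session above could be condensed into a small repeatability check. This is only a sketch: `check_repeatable` and `LeakyConverter` are hypothetical names introduced here, and `LeakyConverter` is a stand-in that imitates the symptom, not html2text's actual code. The real `HTML2Text` is exercised only if the library happens to be installed.

```python
def check_repeatable(handle, html, calls=2):
    """Call handle(html) `calls` times; return (all_equal, outputs)."""
    outputs = [handle(html) for _ in range(calls)]
    return all(out == outputs[0] for out in outputs), outputs


class LeakyConverter:
    """Hypothetical stand-in, NOT html2text: it keeps state between calls,
    so the second output gains a leading blank line."""

    def __init__(self):
        self._pending = ""

    def handle(self, html):
        out = self._pending + html + "\n\n"
        self._pending = "\n\n"  # leftover state that leaks into the next call
        return out


try:
    from html2text import HTML2Text  # the real library, if installed
except ImportError:
    HTML2Text = None

if HTML2Text is not None:
    # Same three inputs as in the session above, one fresh instance per input.
    for sample in ("hello", "<h2>hello</h2>", "<strong>hello</strong>"):
        same, outputs = check_repeatable(HTML2Text().handle, sample)
        print(repr(sample), same, outputs)
```

Going by the session above, the two tagged inputs should report `False` on 2020.1.16 and `True` once the instance state is reset between calls. The same helper could back a unit test with one assertion per input.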
Version by html2text --version:
html2text==2020.1.16
Python version by python --version:
3.8.8 on Ubuntu WSL2
Test script:
Results:
(Should be 3 x True)
(I suggest adding a unit test for this issue after it is fixed.)
index.html file (encoded in UTF-8):