aaronsw / html2text

Convert HTML to Markdown-formatted text.
http://www.aaronsw.com/2002/html2text/
GNU General Public License v3.0

wrapwrite doesn't encode output #4

Closed stefanor closed 13 years ago

stefanor commented 13 years ago
$ python html2text.py http://google.com/
Traceback (most recent call last):
  File "html2text.py", line 473, in <module>
    wrapwrite(html2text(data, baseurl))
  File "html2text.py", line 436, in wrapwrite
    def wrapwrite(text): sys.stdout.write(text)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xbb' in position 85: ordinal not in range(128)

This is not an issue with Python 3.

aaronsw commented 13 years ago

This is an issue for me in Python 3 as well, but I couldn't figure out how to implement a solution that works in both Python 2 and 3.

stefanor commented 13 years ago

Aah, if I set LANG to a non-UTF-8 locale (such as C), then it fails with Python 3 too.

I've got a proposed solution in my encode_output branch. Pull request incoming...

aaronsw commented 13 years ago

Fixed in 6c72d90.
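
For anyone landing here with the same traceback: the following is a minimal sketch of one way to make a wrapwrite-style function encode its output on both Python 2 and 3. It is not the code from 6c72d90 or from the encode_output branch, which are not shown in this thread. The `stream` parameter and the hard-coded UTF-8 encoding are assumptions made for illustration.

```python
import sys

def wrapwrite(text, stream=None):
    """Write unicode text as UTF-8 bytes, bypassing the locale's codec.

    Hypothetical sketch: the `stream` argument and the fixed UTF-8
    encoding are assumptions, not html2text's actual behaviour.
    """
    if stream is None:
        stream = sys.stdout
    # Encode explicitly so a non-UTF-8 locale (e.g. LANG=C) cannot
    # trigger UnicodeEncodeError inside write().
    data = text.encode("utf-8")
    # Python 3 text streams expose their binary layer as `.buffer`;
    # Python 2's sys.stdout (and raw byte streams) accept bytes directly.
    out = getattr(stream, "buffer", stream)
    # Flush pending text first so output ordering is preserved.
    stream.flush()
    out.write(data)
    out.flush()
```

Called as `wrapwrite(u"caf\xe9\n")`, this writes UTF-8 bytes regardless of LANG. The trade-off is that the output is always UTF-8, even on terminals configured for a different encoding.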