CenterForOpenScience / pydocx

An extendable docx file format parser and converter

illegal char problem #244

Closed DrJian closed 6 years ago

DrJian commented 6 years ago

Hi pydocx team, when I use pydocx to convert a docx file to HTML, it reports an error. The error info is as follows:

Traceback (most recent call last):
  File "docx2htmlwithpydocx.py", line 18, in <module>
    output.write(html)
UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position 1499: illegal multibyte sequence

Doesn't the pydocx library use UTF-8? Why does it report a gbk codec error? This confuses me. I'd appreciate a reply soon. Thanks!

DrJian commented 6 years ago

When I run it on my server, it reports a different error:

Traceback (most recent call last):
  File "docx2htmlwithpydocx.py", line 18, in <module>
    output.write(html)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1338-1341: ordinal not in range(128)

DrJian commented 6 years ago

It's my fault. Setting the system encoding to UTF-8 fixes it.
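
For anyone hitting the same traceback: both errors are raised by `output.write(html)`, not by the conversion itself. In Python 3, `open()` in text mode falls back to the locale's preferred encoding when none is given, which would explain gbk on one machine and ascii on another. An alternative to changing the system encoding is to pass the encoding explicitly when opening the output file. Below is a minimal sketch, assuming Python 3; the `html` string is a stand-in for the converter's output, and `write_html` is a hypothetical helper name, not part of pydocx.

```python
import os
import tempfile


def write_html(html, path):
    """Write HTML text to path as UTF-8, independent of the locale default.

    Hypothetical helper for illustration; not part of pydocx.
    """
    with open(path, "w", encoding="utf-8") as output:
        output.write(html)


# Stand-in for converter output: contains U+00A0 (non-breaking space),
# the character named in the first traceback.
html = "<p>a\xa0b</p>"

# Show why locale-dependent codecs fail on this text.
for codec in ("gbk", "ascii"):
    try:
        html.encode(codec)
    except UnicodeEncodeError as exc:
        print(codec, "cannot encode this text:", exc.reason)

# Writing with an explicit encoding succeeds on any system.
path = os.path.join(tempfile.mkdtemp(), "out.html")
write_html(html, path)
```

If the HTML is served or opened in a browser later, it should also be read back or declared as UTF-8 so the encoding stays consistent end to end.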