tesseract-ocr / tesstrain

Train Tesseract LSTM with make
Apache License 2.0

unicode error python #18

Closed orangebacked closed 6 years ago

orangebacked commented 6 years ago

I modified the .py file because I kept getting a Unicode error.

I just inserted three more lines of code at the beginning of the file:

import io
import argparse
import unicodedata
from PIL import Image
import sys
reload(sys)
sys.setdefaultencoding('utf-8')

cheers

kba commented 6 years ago

You are presumably using Python 2.7; this should no longer be an issue in Python 3.

Feel free to post the actual error message so we can see how to encode properly for py2.

Also, if it happens to be an issue with printing Unicode to your console, setting PYTHONIOENCODING=utf8 in the shell is always a handy fix.
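
For reference, here is a minimal sketch of what "encoding properly" could look like without touching sys.setdefaultencoding. It assumes the error comes from text being implicitly encoded as ASCII when it is written to a file. The helper names, the sample string and the file name are hypothetical and not taken from the script in this repository.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def write_utf8(path, text):
    # io.open behaves the same on Python 2.7 and 3: it takes unicode text
    # and encodes it with the encoding named here, not the interpreter default.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(text)


def read_utf8(path):
    # Decode explicitly as UTF-8 so the result is unicode text on both versions.
    with io.open(path, "r", encoding="utf-8") as f:
        return f.read()


# Hypothetical sample text and file name, for illustration only.
sample = u"Gr\u00fc\u00dfe, \u00e9t\u00e9"
tmp_path = os.path.join(tempfile.mkdtemp(), "sample.gt.txt")

write_utf8(tmp_path, sample)
roundtrip = read_utf8(tmp_path)
```

Passing the encoding at each open call keeps the fix local to the file I/O, whereas reload(sys) plus sys.setdefaultencoding changes behaviour for the whole interpreter and does not exist in Python 3.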