Status: Open. egalion opened this issue 10 years ago.
This is not only a problem with Cyrillic text but with any text that contains non-ASCII characters, i.e. anything beyond plain English or classical Latin. It would be enough to replace `open(args.source, "r")` with `codecs.open(args.source, "r", encoding="UTF-8")`, or even add an encoding parameter. This is a little less hacky than `sys.setdefaultencoding('utf8')`.
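For illustration, here is a minimal sketch of the "add an encoding parameter" idea. It is not the actual code from criticParser_CLI.py: `read_source`, `build_parser` and the `--encoding` flag are hypothetical names, and only `source` comes from the `args.source` mentioned above. It also swaps in `io.open` for `codecs.open`, since `io.open` takes the same `encoding` argument on both Python 2 and Python 3.

```python
import argparse
import io


def read_source(path, encoding="utf-8"):
    """Read a file and return its contents decoded to text.

    Hypothetical helper: it stands in for the bare open(args.source, "r")
    call discussed above, with the encoding made explicit.
    """
    with io.open(path, "r", encoding=encoding) as handle:
        return handle.read()


def build_parser():
    """Build a command-line parser with an optional --encoding flag.

    Assumption: only the `source` argument is known from the thread;
    `--encoding` is an illustrative addition, not an existing option.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("source", help="path to the input file")
    parser.add_argument("--encoding", default="utf-8",
                        help="text encoding of the input file")
    return parser
```

With something like this, the tool would call `args = build_parser().parse_args()` and then `read_source(args.source, args.encoding)`, so UTF-8 is the default and other encodings can still be selected without touching the interpreter-wide default.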
The current version doesn't work with Cyrillic text; it raises a Unicode error.
More specifically:
I found a workaround after some googling. It may not be very elegant, but it does the job. It applies to the command line tool criticParser_CLI.py. I am not a programmer, so maybe there is a better way to do it.
First, this section
should become
Then this section
should become