njsmith / pysrilm

An extremely simple Python wrapper for the SRI Language Modeling toolkit
BSD 2-Clause "Simplified" License

TypeError: expected bytes, str found when loading a model #10

Open ttpro1995 opened 6 years ago

ttpro1995 commented 6 years ago

I generated a model with SRILM:

ngram-count -text VNESEcorpus.txt -order 3 -unk -lm vnese.lm

When I load it with pysrilm, it needs bytes, not str:

>>> path = "/home/cpu11453local/workspace/SRILM_model/vnese.lm"
>>> lm = LM(path)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "srilm.pyx", line 110, in srilm.LM.__cinit__
    fp = new_File(path, "r")
TypeError: expected bytes, str found

njsmith commented 6 years ago

Yes, the examples in the README are written for Python 2.

kushalarora commented 6 years ago

Is there a way to make this work with Python 3?

kushalarora commented 6 years ago

I fixed this by calling path.encode(). The issue arises because Python 2 and Python 3 handle strings differently: in Python 3, str is Unicode text by default, whereas the wrapper expects bytes. Encoding the path converts it to bytes, so the code works with Python 3.
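For anyone hitting the same error, here is a minimal sketch of that workaround. The helper name `to_bytes_path` is hypothetical and not part of pysrilm. It uses the standard library's `os.fsencode` rather than a bare `path.encode()`, so a path that is already bytes passes through unchanged. The commented-out import assumes the module is named `srilm`, as the traceback suggests.

```python
import os


def to_bytes_path(path):
    """Return `path` as bytes (hypothetical helper, not part of pysrilm)."""
    # os.fsencode accepts str, bytes, or os.PathLike and encodes str with
    # the filesystem encoding; bytes input is returned unchanged.
    return os.fsencode(path)


lm_path = to_bytes_path("/home/cpu11453local/workspace/SRILM_model/vnese.lm")
print(type(lm_path).__name__)

# With pysrilm installed, the model would then be loaded with the bytes path.
# The import below is an assumption based on the traceback (srilm.LM):
# from srilm import LM
# lm = LM(lm_path)
```

Calling `path.encode()` directly, as described above, should work just as well for a plain str path; the helper only adds tolerance for input that is already bytes.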