Closed: ashnamp closed this issue 6 years ago
Which language are you using the unsupervised morph for?
python3.5
I haven't tested on Python 3.5, but my question was about which language's morph analyzer (Hindi, Malayalam, etc.) you were trying to use.
Malayalam
I checked on Python 2.7 and the code works fine. On Python 3.5 I was able to reproduce the error you mention. I will have to check what needs to be done to support Python 3.5. As of now, the library has been tested on Python 2 versions only.
I happened to fix another bug while debugging this one, so pull the latest check-in for that fix.
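In Python 3, generators and the lazy iterators returned by built-ins such as map() and filter() have to be converted with a 'list()' call before you can index them. That change alone may be enough to make this work.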
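A minimal sketch of what that looks like, assuming the failing code indexes the result of something like map(). The word list here is made up for illustration and is not taken from the library:

```python
# Hypothetical input; any iterable behaves the same way.
words = ["a", "bb", "ccc"]

# In Python 3, map() returns a lazy iterator, which does not support indexing.
lengths = map(len, words)
try:
    lengths[0]
    indexing_failed = False
except TypeError:
    indexing_failed = True

# Wrapping the call in list() materializes the values so they can be indexed.
lengths = list(map(len, words))
first = lengths[0]
```

The list() wrapper is harmless on Python 2, where map() already returns a list.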
Anyone interested in running the code on Python 3 can check out this forked repo: https://github.com/erzaliator/indic_nlp_library/. Do note that the fork won't run on Python 2.
Basically, Python 3 treats every string as Unicode, so there's no need to declare a string as Unicode explicitly (e.g. str1=ur'hello' changes to str1=r'hello'); the ur prefix is in fact a syntax error in Python 3. Plus, unichr has been removed and is replaced by chr.
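For reference, here is roughly how the failing line would look on Python 3 after those two changes. This is only a sketch: SCRIPT_RANGES below is a hypothetical stand-in for langinfo.SCRIPT_RANGES, filled in with the Unicode Malayalam block (U+0D00 to U+0D7F), which may not match what the library actually stores.

```python
import re

# Hypothetical stand-in for langinfo.SCRIPT_RANGES (assumed values:
# the Unicode Malayalam block).
SCRIPT_RANGES = {"ml": (0x0D00, 0x0D7F)}

start, end = SCRIPT_RANGES["ml"]

# Python 3: plain r'' instead of ur'', and chr() instead of unichr().
script_range_pat = r'^[{}-{}]+$'.format(chr(start), chr(end))
script_range_re = re.compile(script_range_pat)

# A string made only of Malayalam characters matches; ASCII text does not.
is_malayalam = script_range_re.match("\u0d2e\u0d32") is not None
is_ascii_match = script_range_re.match("hello") is not None
```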
@erzaliator - can you use a library like 'six' to make it straddle the 2/3 divide? That way you could even commit the code back to this project.
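As a rough idea of what straddling both versions could look like, here is a sketch that uses a hand-rolled shim instead of six, so it has no extra dependency. The names unichr_compat and script_range_pattern are made up for this example; six offers six.unichr for the same purpose.

```python
import sys

PY2 = sys.version_info[0] == 2

# Pick the right "code point to character" function for the running interpreter.
if PY2:
    unichr_compat = unichr  # noqa: F821 (only defined on Python 2)
else:
    unichr_compat = chr


def script_range_pattern(start, end):
    """Build the script-range regex from two code points (hypothetical helper)."""
    # u'' is accepted by Python 2 and by Python 3.3+; the raw prefix is not
    # needed because the pattern contains no backslashes.
    return u'^[{}-{}]+$'.format(unichr_compat(start), unichr_compat(end))


pattern = script_range_pattern(0x0D00, 0x0D7F)
```

The same function body then runs unchanged on Python 2.7 and Python 3.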
self._script_range_pat=ur'^[{}-{}]+$'.format(unichr(langinfo.SCRIPT_RANGES[lang][0]),unichr(langinfo.SCRIPT_RANGES[lang][1]))
^
SyntaxError: invalid syntax