anoopkunchukuttan / indic_nlp_library

Resources and tools for Indian language Natural Language Processing
http://anoopkunchukuttan.github.io/indic_nlp_library/
MIT License
549 stars, 160 forks

Morphological analyser #13

Closed: ashnamp closed 6 years ago

ashnamp commented 7 years ago

self._script_range_pat=ur'^[{}-{}]+$'.format(unichr(langinfo.SCRIPT_RANGES[lang][0]),unichr(langinfo.SCRIPT_RANGES[lang][1]))
^
SyntaxError: invalid syntax

anoopkunchukuttan commented 7 years ago

Which language are you using the unsupervised morph analyzer for?

ashnamp commented 7 years ago

Python 3.5

anoopkunchukuttan commented 7 years ago

I haven't tested on Python 3.5, but my question was about which language's morph analyzer (Hindi, Malayalam, etc.) you were trying to use.

ashnamp commented 7 years ago

Malayalam

anoopkunchukuttan commented 7 years ago

I checked on Python 2.7 and the code works fine. On Python 3.5 I was able to reproduce the error you mention. I will have to check what needs to be done to support Python 3.5. As of now, the library has been tested on Python 2 versions only.

anoopkunchukuttan commented 7 years ago

I happened to fix another bug while debugging this one, so pull the latest check-in to get that fix.

arcturusannamalai commented 6 years ago

In Python 3, generators and the lazy iterators returned by built-ins such as map() and filter() must be converted with a 'list()' call before you can index them. This may just work.
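A minimal sketch of the behaviour described above, using made-up data rather than code from the library: on Python 3, `map()` returns a lazy iterator, so indexing it raises `TypeError` until it is wrapped in `list()`. The names `words` and `lengths` are purely illustrative.

```python
# Illustrative data only; not taken from indic_nlp_library.
words = ["a", "bb", "ccc"]

# On Python 3, map() returns a lazy iterator, not a list.
lengths = map(len, words)

try:
    lengths[0]  # iterators do not support indexing
    indexing_failed = False
except TypeError:
    indexing_failed = True

# Forcing the iterator into a list makes indexing work on Python 2 and 3 alike.
lengths = list(map(len, words))
first = lengths[0]
```

Wrapping the call in `list()` is harmless on Python 2, where `map()` already returns a list, so this kind of change is safe for both versions.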

erzaliator commented 5 years ago

Anyone interested in running the code on Python 3 can check out this forked repo: https://github.com/erzaliator/indic_nlp_library/. Note that the fork won't run on Python 2.

Basically, Python 3 treats every string as Unicode, so there's no need to declare a string as Unicode explicitly (e.g. str1=ur'hello' changes to str1=r'hello'; the ur prefix is a syntax error in Python 3). In addition, unichr has been removed and replaced by chr.
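Applied to the line from the original report, the change might look like the sketch below. It is Python 3 only. `SCRIPT_RANGES` here is a hypothetical stand-in for `langinfo.SCRIPT_RANGES`, filled in with the Malayalam Unicode block (U+0D00 to U+0D7F) as an assumption; the helper name `script_range_pattern` and the `"ml"` key are also made up for illustration.

```python
import re

# Hypothetical stand-in for langinfo.SCRIPT_RANGES; the real table lives in
# the library. The values are the Malayalam Unicode block (assumed here).
SCRIPT_RANGES = {"ml": (0x0D00, 0x0D7F)}


def script_range_pattern(lang):
    """Build a regex that matches strings made only of the script's characters."""
    start, end = SCRIPT_RANGES[lang]
    # Python 3: plain r'' instead of ur'', and chr() instead of unichr().
    return re.compile(r'^[{}-{}]+$'.format(chr(start), chr(end)))


pattern = script_range_pattern("ml")
is_malayalam = pattern.match("മലയാളം") is not None
is_latin = pattern.match("hello") is not None
```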

arcturusannamalai commented 5 years ago

@erzaliator - can you use a library like 'six' to make it straddle the Python 2/3 divide? That way you could even contribute the code back to this project.
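One way this suggestion could look, as a sketch rather than the project's actual approach: `six` exports a `unichr` name that works on both major versions, and the `u''` prefix (unlike `ur''`) is accepted by Python 2 and by Python 3.3 and later. The fallback branch is an assumption added so the snippet also runs where `six` is not installed, and `script_range_pattern` is a hypothetical helper name.

```python
try:
    # six maps unichr to the right built-in on Python 2 and Python 3.
    from six import unichr
except ImportError:
    # Fallback shim (an assumption, for environments without six).
    try:
        unichr = unichr  # Python 2: the built-in already exists
    except NameError:
        unichr = chr     # Python 3: chr() covers all code points


def script_range_pattern(start, end):
    """Return a version-neutral pattern string for a Unicode code point range."""
    # u'' is valid on Python 2 and on Python 3.3+; the pattern contains no
    # backslashes, so the raw-string prefix is not needed.
    return u'^[{}-{}]+$'.format(unichr(start), unichr(end))


malayalam_pattern = script_range_pattern(0x0D00, 0x0D7F)
```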