anoopkunchukuttan / indic_nlp_library

Resources and tools for Indian language Natural Language Processing
http://anoopkunchukuttan.github.io/indic_nlp_library/
MIT License
549 stars, 160 forks

Normalizer Not working with other Options #21

Closed bsaid5654 closed 4 years ago

bsaid5654 commented 5 years ago

This is the error thrown when I try any option other than "do_nothing". Can you please check why this is happening?

Traceback (most recent call last):
  File "indic_nlp_library/src/indicnlp/normalize/indic_normalize.py", line 720, in <module>
    normalizer=factory.get_normalizer(language,remove_nuktas,normalize_nasals)
  File "indic_nlp_library/src/indicnlp/normalize/indic_normalize.py", line 680, in get_normalizer
    normalizer=TeluguNormalizer(lang=language, remove_nuktas=remove_nuktas, nasals_mode=nasals_mode)
  File "indic_nlp_library/src/indicnlp/normalize/indic_normalize.py", line 542, in __init__
    super(TeluguNormalizer,self).__init__(lang,remove_nuktas,nasals_mode)
  File "indic_nlp_library/src/indicnlp/normalize/indic_normalize.py", line 69, in __init__
    self._init_normalize_nasals()
  File "indic_nlp_library/src/indicnlp/normalize/indic_normalize.py", line 183, in _init_normalize_nasals
    self._init_to_anusvaara_strict()
  File "indic_nlp_library/src/indicnlp/normalize/indic_normalize.py", line 93, in _init_to_anusvaara_strict
    nasal=langinfo.offset_to_char(pat_signature[0],self.lang),
  File "indic_nlp_library/src/indicnlp/langinfo.py", line 91, in offset_to_char
    return chr(c+SCRIPT_RANGES[lang][0])
ValueError: chr() arg not in range(256)

anoopkunchukuttan commented 5 years ago

Can you share the call and arguments? Are you using Python 2 or 3?

anoopkunchukuttan commented 4 years ago

Not reproducible. Most likely caused by a wrong file encoding or by running under Python 2.
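
For anyone landing here with the same traceback: the message "chr() arg not in range(256)" is the wording Python 2 uses, since its chr() only accepts values 0 to 255. Python 3 accepts any Unicode code point and reports "chr() arg not in range(0x110000)" instead. The sketch below is a minimal illustration of that difference, not the library's code. The function is a hypothetical stand-in for `langinfo.offset_to_char`, and the Telugu block start (U+0C00) is taken from the Unicode charts and is assumed to match what `SCRIPT_RANGES` holds.

```python
import sys

# Assumption: start of the Unicode Telugu block (U+0C00). The library's own
# SCRIPT_RANGES table is not reproduced here and may be laid out differently.
TELUGU_BLOCK_START = 0x0C00


def offset_to_char(offset, block_start=TELUGU_BLOCK_START):
    """Hypothetical stand-in: map an offset inside a script block to a character."""
    if sys.version_info[0] < 3:
        # Python 2's chr() rejects anything above 255, which is what the
        # traceback in this issue shows.
        raise RuntimeError("Run this under Python 3; Python 2 chr() only accepts 0-255")
    return chr(block_start + offset)


if __name__ == "__main__":
    # Confirm which interpreter is actually running the script.
    print("Python major version:", sys.version_info[0])
    # Offset 0x02 in the Telugu block is U+0C02 (TELUGU SIGN ANUSVARA).
    print(repr(offset_to_char(0x02)))
```

If the first line printed is 2, the script is being run by a Python 2 interpreter, and switching to `python3` is the likely fix.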