anoopkunchukuttan / indic_nlp_library

Resources and tools for Indian language Natural Language Processing
http://anoopkunchukuttan.github.io/indic_nlp_library/
MIT License

Placement of Anuswara #43

Open shantanuo opened 3 years ago

shantanuo commented 3 years ago

When using the syllabifier class, the anuswara is carried over to the next syllable.

'जगदीशचंद्र' becomes ज ग दी श च ंद्र

This is technically correct, but there are times when someone may need a different representation, such as 'ज ग दी श चं द्र', where the anuswara stays with the preceding syllable. There should be an option for this as well.
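Until such an option exists, the requested representation could be approximated by post-processing the syllable list. The following is only a minimal sketch: `reattach_anusvara` is a hypothetical helper, not part of indic_nlp_library, and it assumes the syllabifier output is available as a plain list of strings in which a carried-over anuswara (U+0902) appears as the first code point of a syllable.

```python
ANUSVARA = "\u0902"  # DEVANAGARI SIGN ANUSVARA


def reattach_anusvara(syllables):
    """Move a leading anuswara from a syllable to the end of the previous one.

    Hypothetical post-processing helper (not a library function).
    Syllables that do not start with an anuswara are left unchanged.
    """
    result = []
    for syl in syllables:
        # Only move the sign if there is a previous syllable to attach it to.
        if syl.startswith(ANUSVARA) and result:
            result[-1] += ANUSVARA
            syl = syl[len(ANUSVARA):]
        # Skip a syllable that consisted solely of the anuswara.
        if syl:
            result.append(syl)
    return result


# Syllables as reported in this issue for 'जगदीशचंद्र'
print(" ".join(reattach_anusvara(["ज", "ग", "दी", "श", "च", "ंद्र"])))
# prints: ज ग दी श चं द्र
```

This only covers the case where the anuswara begins a syllable; it does not split a syllable that contains the anuswara in the middle.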

shantanuo commented 3 years ago

The word 'इंपाला' is returned as इ ंपा ला (3 syllables). But the word 'इंसान' is not returned as इ ंसा न or इं सा न. It is returned as इंसा न, which is inconsistent with the earlier example because it yields 2 syllables instead of 3.

anoopkunchukuttan commented 3 years ago

Thanks for your input. I will take a look at the issue you mention in a couple of days.