anoopkunchukuttan / indic_nlp_library

Resources and tools for Indian language Natural Language Processing
http://anoopkunchukuttan.github.io/indic_nlp_library/
MIT License
546 stars, 158 forks

Inappropriate Hindi English Transliteration #52

Closed. Sonali210 closed this issue 1 year ago.

Sonali210 commented 2 years ago

Code:

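from indicnlp.transliterate.unicode_transliterate import ItransTransliterator
from indicnlp import loader
from indicnlp import common

# INDIC_RESOURCES_PATH must already be set to the local path of the Indic NLP resources directory.
common.set_resources_path(INDIC_RESOURCES_PATH)
loader.load()

# Romanize the Hindi ('hi') sentence using the ITRANS scheme.
print(ItransTransliterator.to_itrans('मैं आज आपकी किस प्रकार सहायता कर सकता हूँ?', 'hi'))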

Output:

mai.m aaja aapakii kisa prakaara sahaayataa kara sakataa huuँ?

Output using Google Translate:

Main aaj aapakee kis prakaar sahaayata kar sakata hoon?

The romanization contains an unnecessary '.' (as in 'mai.m') and an untransliterated 'uँ' (as in 'huuँ'). What is the best way to get appropriate, presentable transliterated output?

anoopkunchukuttan commented 1 year ago

What the IndicNLP library supports is deterministic transliteration to a defined scheme (ITRANS, in this case). If you want natural transliteration, please use the IndicXlit model we developed at AI4Bharat.

https://github.com/AI4Bharat/IndicXlit
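
To illustrate what "deterministic transliteration to a defined scheme" means, here is a toy sketch. It is not the indicnlp implementation: the function name `toy_romanize` and the small mapping tables are hypothetical and cover only a handful of Devanagari characters. It assumes a fixed per-character mapping in which every consonant carries its inherent 'a' unless a vowel sign or virama follows, and the anusvara always becomes '.m', which is consistent with the 'mai.m' and 'aaja' seen in the output above.

```python
# Toy illustration only; NOT how indicnlp implements ITRANS.
# Tables are deliberately tiny and hypothetical.
CONSONANTS = {
    "\u092e": "m",  # ma
    "\u091c": "j",  # ja
    "\u0915": "k",  # ka
    "\u0930": "r",  # ra
    "\u0938": "s",  # sa
    "\u0924": "t",  # ta
    "\u092a": "p",  # pa
    "\u0939": "h",  # ha
}
INDEPENDENT_VOWELS = {"\u0905": "a", "\u0906": "aa", "\u0907": "i"}
VOWEL_SIGNS = {
    "\u093e": "aa", "\u093f": "i", "\u0940": "ii",
    "\u0941": "u", "\u0942": "uu", "\u0948": "ai",
}
ANUSVARA = "\u0902"
VIRAMA = "\u094d"


def toy_romanize(text):
    """Map each character by table lookup, with no language-specific rules."""
    out = []
    pending_schwa = False  # True when the last consonant still owes its inherent 'a'
    for ch in text:
        if ch in CONSONANTS:
            if pending_schwa:
                out.append("a")
            out.append(CONSONANTS[ch])
            pending_schwa = True
        elif ch in VOWEL_SIGNS:
            # A vowel sign replaces the inherent 'a'.
            out.append(VOWEL_SIGNS[ch])
            pending_schwa = False
        elif ch == VIRAMA:
            # Virama suppresses the inherent 'a'.
            pending_schwa = False
        else:
            if pending_schwa:
                out.append("a")
                pending_schwa = False
            if ch in INDEPENDENT_VOWELS:
                out.append(INDEPENDENT_VOWELS[ch])
            elif ch == ANUSVARA:
                out.append(".m")
            else:
                # Characters missing from the toy tables pass through unchanged.
                out.append(ch)
    if pending_schwa:
        out.append("a")
    return "".join(out)


# The first two words of the sentence from the question.
print(toy_romanize("\u092e\u0948\u0902 \u0906\u091c"))  # prints: mai.m aaja
```

Because such a mapping never looks at the word as a whole, it cannot drop the word-final 'a' or pick a conventional spelling like "main" or "aaj". Producing that kind of output is what a learned model such as IndicXlit is for.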