goru001 / inltk

Natural Language Toolkit for Indic Languages aims to provide out of the box support for various NLP tasks that an application developer might need
https://inltk.readthedocs.io
MIT License
824 stars, 163 forks

Identify languages doesn't work with code-mixed languages in latin script #54

Open bnriiitb opened 4 years ago

bnriiitb commented 4 years ago

Is there a way to detect the language of the transliterated text of an Indian language?

goru001 commented 4 years ago

With v0.9, this essentially boils down to identifying code-mixed languages correctly.

Currently, identify_languages returns English for all code-mixed languages in Latin script.
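
Until that is addressed, one possible workaround is to check whether the input is written entirely in Latin script before trusting an "English" result. The snippet below is only a rough sketch using the Python standard library; `is_latin_script_only` is a hypothetical helper name, not part of iNLTK, and it does not detect which language the text actually is.

```python
import unicodedata


def is_latin_script_only(text: str) -> bool:
    """Return True if every alphabetic character in `text` is a Latin letter.

    Hypothetical helper (not an iNLTK function). Digits, punctuation,
    whitespace, and combining marks are ignored; text with no letters
    at all returns False.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    # Unicode character names of Latin letters contain the word "LATIN",
    # e.g. "LATIN SMALL LETTER A"; Devanagari letters do not.
    return all("LATIN" in unicodedata.name(ch, "") for ch in letters)


# Transliterated (romanized) Hindi is Latin-only, so an "English" label
# from a language identifier may be unreliable for this input.
print(is_latin_script_only("aap kaise ho"))
# Native-script Hindi contains non-Latin letters.
print(is_latin_script_only("आप कैसे हो"))
```

If the check returns True and the identifier reports English, the text may still be a transliterated or code-mixed Indian language, so the result should be treated with caution.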