pemistahl / lingua-py

The most accurate natural language detection library for Python, suitable for short text and mixed-language text
Apache License 2.0

Does not detect Hindi #153

Closed vsmelov closed 1 year ago

vsmelov commented 1 year ago

from lingua import Language, LanguageDetectorBuilder

languages = [Language.ENGLISH, Language.HINDI]
detector = LanguageDetectorBuilder.from_languages(*languages).build()
confidence_values = detector.compute_language_confidence_values("Bhai aapka isme")
for language, value in confidence_values:
    print(f"{language.name}: {value:.2f}")

ENGLISH: 1.00
HINDI: 0.00

That's 100% wrong

pemistahl commented 1 year ago

Lingua currently does not support transliterations into non-standard alphabets. If you want Hindi to be correctly identified, please use the Devanagari alphabet.
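
For anyone hitting the same problem: one possible workaround is to check which script the input uses before trusting the detector's result. The sketch below is not part of Lingua. It uses only the standard library, the helper name `contains_devanagari` is hypothetical, and the Devanagari spelling in the example is my own rendering of the phrase from the report, so treat it as an assumption. It only tests for code points in the main Devanagari Unicode block (U+0900 to U+097F).

```python
# Bounds of the main Devanagari Unicode block.
DEVANAGARI_START = 0x0900
DEVANAGARI_END = 0x097F


def contains_devanagari(text: str) -> bool:
    """Return True if any character falls in the main Devanagari block.

    Hypothetical helper, not a Lingua API. It ignores the Devanagari
    Extended and Vedic Extensions blocks to keep the sketch short.
    """
    return any(DEVANAGARI_START <= ord(ch) <= DEVANAGARI_END for ch in text)


if __name__ == "__main__":
    romanized = "Bhai aapka isme"    # Latin-script input from the report
    devanagari = "भाई आपका इसमें"      # assumed Devanagari spelling of the same phrase

    for sample in (romanized, devanagari):
        if contains_devanagari(sample):
            print(f"{sample!r}: Devanagari script found")
        else:
            print(f"{sample!r}: no Devanagari, may be romanized text")
```

Input that fails this check could be routed to a transliteration step or flagged for review before calling `compute_language_confidence_values`. A check like this cannot tell romanized Hindi apart from English. It only confirms whether the text is written in the script the maintainer says is required.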