Mimino666 / langdetect

Port of Google's language-detection library to Python.

Inaccurate predictions for basic english words #70

Open grestonian opened 4 years ago

grestonian commented 4 years ago

The library fails to detect the language of basic English words and produces inaccurate results, as shown below:

detect("sunday") => 'id' (whereas 'sunday' in Indonesian is clearly 'minggu')
detect("monday") => 'tr' (whereas 'monday' in Turkish is 'pazartesi')

And surprisingly, detect('pazartesi') => 'es'.

In fact, langdetect.detect_langs("sunday") outputs confidences for 'tr' and 'id', with no mention of English whatsoever. The same goes for months and other basic English words, e.g. detect("good") => 'so'
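For anyone hitting the same thing, one possible workaround is to refuse to answer on very short inputs or low-confidence guesses instead of trusting the top result. This is only a rough sketch, not something the library provides: guarded_detect, Guess, min_chars and min_prob are names and thresholds made up here, and the probabilities in the usage lines are invented for illustration. It relies only on detect_langs returning objects with .lang and .prob attributes, and on DetectorFactory.seed for repeatable results.

```python
from collections import namedtuple

# Hypothetical stand-in for langdetect's result objects (.lang and .prob).
Guess = namedtuple("Guess", ["lang", "prob"])


def guarded_detect(text, detect_langs=None, min_chars=20, min_prob=0.9):
    """Return a language code, or None when the input is too short or the
    best guess is not confident enough. Both thresholds are assumptions."""
    if detect_langs is None:
        # Imported lazily so the sketch can be tried without langdetect.
        import langdetect
        langdetect.DetectorFactory.seed = 0  # repeatable results between runs
        detect_langs = langdetect.detect_langs

    stripped = text.strip()
    if len(stripped) < min_chars:
        return None  # single words carry too little signal to trust

    guesses = detect_langs(stripped)
    if not guesses:
        return None

    best = max(guesses, key=lambda g: g.prob)
    return best.lang if best.prob >= min_prob else None


# Usage with a stub detector; the probabilities below are invented.
stub = lambda text: [Guess("id", 0.57), Guess("tr", 0.43)]
print(guarded_detect("sunday", detect_langs=stub))  # too short, so None
```

Passing the detector in as an argument keeps the guard testable on its own. Calling guarded_detect(text) with no detector falls back to langdetect.detect_langs. This does not fix the misclassification itself, it only avoids reporting a guess the model has little evidence for.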

nightfuryyy commented 4 years ago

"son", "song",...