aboSamoor / polyglot

Multilingual text (NLP) processing toolkit
http://polyglot-nlp.com
Other
2.32k stars, 338 forks

Underscores make sentences detected as English? #264

Open ndvbd opened 2 years ago

ndvbd commented 2 years ago

This sentence is detected as French, with 98% probability:

Celles qui n'encouragent guère, emprises de jalousie.

Changing one character to an underscore:

Celles qui n'encouragent gu_re, emprises de jalousie

gives English, also with 98% probability. This looks like a bug. Any ideas?
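
For anyone who wants to reproduce this, here is a minimal sketch. It assumes detection goes through `polyglot.detect.Detector` and reads `language.code` and `language.confidence`. The helper names `polyglot_detect` and `compare` are made up for illustration and are not part of polyglot. The exact call that produced the numbers above is not shown in this report, so results may differ by version.

```python
from typing import Callable, Dict, Tuple

Detection = Tuple[str, float]  # (language code, confidence)


def polyglot_detect(text: str) -> Detection:
    """Detect the language of `text` with polyglot.

    The import is done lazily so that `compare` below can also be used
    with any other detector when polyglot is not installed.
    """
    from polyglot.detect import Detector  # requires polyglot and its dependencies

    # quiet=True avoids an exception when the detection is flagged as unreliable
    language = Detector(text, quiet=True).language
    return language.code, language.confidence


def compare(detect: Callable[[str], Detection], original: str, modified: str) -> Dict[str, object]:
    """Run `detect` on both strings and report whether the language code changed."""
    before = detect(original)
    after = detect(modified)
    return {"original": before, "modified": after, "changed": before[0] != after[0]}


if __name__ == "__main__":
    original = "Celles qui n'encouragent guère, emprises de jalousie."
    modified = "Celles qui n'encouragent gu_re, emprises de jalousie"
    print(compare(polyglot_detect, original, modified))
```

Passing the detector in as a callable makes it easy to run the same pair of sentences through a different detector and check whether the flip is specific to polyglot.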