aboSamoor / polyglot

Multilingual text (NLP) processing toolkit
http://polyglot-nlp.com

Underscores make sentences detected as English? #264

Open ndvbd opened 1 year ago

ndvbd commented 1 year ago

This sentence is detected as French, with a probability of 98:

Celles qui n'encouragent guère, emprises de jalousie.

Changing one character to an underscore:

Celles qui n'encouragent gu_re, emprises de jalousie

gives English with a probability of 98. This is clearly a bug. Any ideas?
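
For anyone who wants to reproduce this, here is a minimal sketch. It assumes polyglot and its detection dependencies (pycld2, PyICU) are installed, and that detection goes through the `Detector` class in `polyglot.detect`; the helper names `detect` and `reproduce` are only illustrative. The exact confidence values may differ between versions.

```python
# Reproduction sketch for the two sentences above.
# Assumption: polyglot is installed together with pycld2 and PyICU.

ORIGINAL = "Celles qui n'encouragent guère, emprises de jalousie."
MODIFIED = "Celles qui n'encouragent gu_re, emprises de jalousie"


def detect(text):
    """Return (language name, confidence) for polyglot's top guess."""
    # Imported lazily so this file can be loaded without polyglot present.
    from polyglot.detect import Detector

    language = Detector(text).language
    return language.name, language.confidence


def reproduce():
    """Run detection on both sentences, print and return the results."""
    results = []
    for text in (ORIGINAL, MODIFIED):
        name, confidence = detect(text)
        print(f"{text!r}: {name} ({confidence})")
        results.append((text, name, confidence))
    return results
```

Calling `reproduce()` should print the detected language and confidence for each sentence, which makes it easy to compare the two results side by side.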