these strings are recognized in ITALIAN but they are in english

pemistahl / lingua-py

The most accurate natural language detection library for Python, suitable for short text and mixed-language text

Apache License 2.0

these strings are recognized in ITALIAN but they are in english #84

Closed andreabisello closed 1 year ago

andreabisello commented 1 year ago

These strings are recognized as Italian even though they are in English (and there is no Italian word in any of them):

Invalid Request Filter: '{0}' is negative or null.
Error: {0}
Error: {0} was in Language

pemistahl commented 1 year ago

Yes, this is certainly possible. For these strings, the sum of the n-gram probabilities for Italian is evidently larger than the sum of the n-gram probabilities for English. This is not a bug in the library: a statistical approach is never 100% correct.
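
To make the point above concrete, here is a minimal toy sketch of n-gram scoring. It is not lingua's actual implementation or its trained models. The two tiny corpora, the `floor` value for unseen trigrams, and the names `trigrams`, `train`, `score`, `detect` and `MODELS` are all made up for illustration. It sums log-probabilities of character trigrams per language and picks the larger sum:

```python
import math
from collections import Counter


def trigrams(text):
    """Return all overlapping character trigrams of the lowercased text."""
    text = text.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]


def train(corpus):
    """Build a toy model: trigram counts plus the total number of trigrams."""
    counts = Counter(trigrams(corpus))
    return counts, sum(counts.values())


def score(text, model, floor=1e-6):
    """Sum of log relative frequencies; unseen trigrams get a fixed floor."""
    counts, total = model
    return sum(
        math.log(counts[g] / total) if counts[g] else math.log(floor)
        for g in trigrams(text)
    )


def detect(text, models):
    """Return the language with the highest summed score, plus all scores."""
    scores = {lang: score(text, model) for lang, model in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy corpora, far too small to be a real language model.
MODELS = {
    "english": train("the error was in the request and the filter is invalid"),
    "italian": train("errore nella richiesta il filtro non e valido"),
}

print(detect("the request", MODELS)[0])  # "english" with these toy corpora
print(detect("Error: {0}", MODELS)[0])   # "italian" with these toy corpora
```

In this toy setup, "Error: {0}" comes out as Italian only because its three known trigrams ("err", "rro", "ror") are relatively more frequent in the smaller Italian corpus, while the punctuation and placeholder trigrams are unseen in both. Short strings dominated by symbols and placeholders give the classifier very little evidence, so the highest-scoring language can easily be the wrong one.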