Alex-Kopylov closed this issue 1 year ago
Btw, I've only briefly checked whether these words exist in Italian, French and German. Still, based on how widespread these words are, I assume the English variant should come first in the prediction. Please correct me if I'm wrong.
Purely statistical approaches to language detection are never 100% correct. The letter sequences in your examples are common in English, but they are even more common in Italian or French. That's why the probabilities for Italian and French are higher than the probability for English.
Feed longer strings into the detector. Then you will get more reliable results.
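To illustrate why string length matters, here is a minimal sketch of a character-trigram scorer. It is not this library's actual model: the helper names (`char_ngrams`, `train`, `log_score`, `rank`) and the tiny training sentences are hypothetical and chosen only for illustration. The point is that a three-letter word yields a single trigram, so the ranking rests on one data point.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of the lowercased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def train(corpus, n=3):
    """Count n-grams in a training text (toy model, not a real one)."""
    return Counter(char_ngrams(corpus, n))


def log_score(text, counts, n=3):
    """Sum of add-one smoothed log relative frequencies of the text's n-grams."""
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 leaves room for unseen n-grams
    return sum(
        math.log((counts[gram] + 1) / (total + vocab))
        for gram in char_ngrams(text, n)
    )


# Hypothetical toy corpora; a real detector is trained on far more text.
TOY_MODELS = {
    "english": train("hello there, goodbye and bye for now, the loss was small"),
    "italian": train("ciao bello, arrivederci e buona notte, la perdita era piccola"),
}


def rank(text):
    """Return (language, score) pairs, best score first."""
    scores = [(lang, log_score(text, counts)) for lang, counts in TOY_MODELS.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for word in ["Hello", "Bye", "Loss"]:
        print(word, len(char_ngrams(word)), "trigram(s)", rank(word))
```

Each additional trigram adds another term to the score, so longer inputs give the ranking more evidence to separate the languages, whereas a single word gives it almost none.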
Hello
Bye
Loss (not Löss)