pemistahl / lingua

The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Apache License 2.0

Language recognition enhancement #162

Closed: opened by mohamedniyaz1996, closed 1 year ago

mohamedniyaz1996 commented 1 year ago

I noticed that a few words and phrases, such as "camera", "television", "dumbell", and "spider man guitar", are not recognised as English. I am opening this issue to ask whether detection of such inputs can be improved in the future. I have more samples and can post them on request. Thanks for reading.

pemistahl commented 1 year ago

Hi Mohamed, thanks for reaching out to me. Statistical approaches to language detection are never 100% correct. So it's no surprise that you've found words that are incorrectly identified. Of course, I will try to improve the library in the future. I'm sure you have seen the currently open issues that I will work on. But I don't need any samples from you in order to improve the library. Thanks anyway.
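To illustrate why a statistical detector can misjudge short inputs, here is a minimal toy sketch. It is not Lingua's actual implementation: the class `ToyDetector`, the two one-sentence training samples, and the bigram scoring with add-one smoothing are all assumptions made up for this example. The point is only that a word like "camera" contributes very few character n-grams, and those n-grams can be just as plausible in another language.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy detector for illustration only; not part of Lingua.
class ToyDetector {
    // Tiny made-up training samples. A real model is trained on large corpora.
    private static final Map<String, String> SAMPLES = new LinkedHashMap<>();
    static {
        SAMPLES.put("ENGLISH", "the camera is on the table and the television is in the other room");
        SAMPLES.put("ITALIAN", "la camera da letto e grande e la televisione e nella altra stanza");
    }

    // Count character bigrams in the lower-cased text.
    static Map<String, Integer> bigramCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        String t = text.toLowerCase();
        for (int i = 0; i + 2 <= t.length(); i++) {
            counts.merge(t.substring(i, i + 2), 1, Integer::sum);
        }
        return counts;
    }

    // Score the input against each language as a sum of log relative
    // bigram frequencies, using add-one smoothing for unseen bigrams.
    static Map<String, Double> scores(String input) {
        Map<String, Double> result = new LinkedHashMap<>();
        String t = input.toLowerCase();
        for (Map.Entry<String, String> e : SAMPLES.entrySet()) {
            Map<String, Integer> model = bigramCounts(e.getValue());
            int total = model.values().stream().mapToInt(Integer::intValue).sum();
            int vocab = model.size() + 1; // one extra bucket for unseen bigrams
            double logProb = 0.0;
            for (int i = 0; i + 2 <= t.length(); i++) {
                int c = model.getOrDefault(t.substring(i, i + 2), 0);
                logProb += Math.log((c + 1.0) / (total + vocab));
            }
            result.put(e.getKey(), logProb);
        }
        return result;
    }

    // Return the language with the highest score (first one wins on ties).
    static String detect(String input) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : scores(input).entrySet()) {
            if (e.getValue() > bestScore) {
                bestScore = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"camera", "television", "spider man guitar"}) {
            System.out.println(word + " -> " + scores(word) + " -> " + detect(word));
        }
    }
}
```

Running `main` prints the per-language scores for each input. With so few bigrams per word, the gap between the scores tends to be small, which is the kind of ambiguity the reply above describes.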