pemistahl / lingua

The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Apache License 2.0
689 stars 61 forks

Rewrite Lingua in Java #188

Open pemistahl opened 10 months ago

pemistahl commented 10 months ago

As I most probably won't have much time in the future to maintain Lingua due to my family life, let's rewrite Lingua in Java while I still have time. My hope is to attract more people from the Java community to use and improve this library in the future. Having a plain Java library should also simplify long-term maintenance, thanks to Java's backwards compatibility.

Kotlin is a great language but a second-class citizen on the JVM. Its use did not provide significant advantages compared to a plain Java implementation. I have doubts whether Kotlin will continue to have a bright future as both Java as a language and the JVM get significantly better, especially with the release of Java 21.

rogierslag commented 8 months ago

Hi @pemistahl!

I really like this library, and have started to work on migrating the core library itself (excluding reports etc. for now) to Java. It's still a work in progress, but I'll file a PR once the test suite is fully passing.

If you're interested, you can find the in-progress branch here.

pemistahl commented 8 months ago

Hi Rogier, wow, that's awesome that you are voluntarily taking on this tedious work. Looks very good so far. Thanks a lot. :)

Compared to my other implementations of Lingua, this one here has been stuck because I could not yet motivate myself to do the rewrite. If your work is successful, I will then add the new features from the other implementations to the Java rewrite. So please make sure to do a 1-to-1 rewrite of the current Kotlin main branch only. Again, thank you very much. :)