pemistahl / lingua

The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Apache License 2.0
706 stars 63 forks

Replace native implementation with Java bindings to WASM module #214

Open pemistahl opened 1 month ago

pemistahl commented 1 month ago

I've been maintaining four implementations of this library: Go, Rust, Python, and this one in Kotlin. Due to serious time constraints, only the Rust and Python implementations will receive updates in the future. The Go and Kotlin implementations will be replaced with bindings to a WebAssembly (WASM) module built from the Rust implementation.

Rust has turned out to be the best language for this kind of library: of the four implementations, it has the highest execution speed and the lowest memory requirements. In particular, the large amount of memory required by the native JVM implementation has been a problem for multiple users.

To create the Java bindings, I will use Chicory, a native JVM WASM runtime.