kherud / java-llama.cpp

Java Bindings for llama.cpp - A Port of Facebook's LLaMA model in C/C++
MIT License

Version 3.0 #52

Closed kherud closed 5 months ago

kherud commented 5 months ago

Version 3.0 reworks almost all of the C++ code. It now relies heavily on the llama.cpp server code, which should in theory lead to much better performance, concurrency, and long-term maintainability.

The biggest change is in how model and inference parameters are handled. Previous versions relied on properly typed Java classes, whereas the C++ server code mostly uses JSON. The JNI code that transferred the parameters from Java to C++ was complex and error-prone. The new version makes almost no changes to the API for setting parameters (apart from which parameters are available), but should be much easier to maintain in the long term.
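To illustrate the general idea, here is a minimal sketch of a typed Java builder that keeps its setters but serializes everything to a single JSON string, so that only one string would have to cross the JNI boundary. The class name `JsonParameters`, its methods, and the choice of fields are hypothetical and used only for illustration; this is not the actual java-llama.cpp API or implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: class and method names are illustrative only,
// not the java-llama.cpp API.
class JsonParameters {
    // Insertion-ordered map of parameter name -> already-encoded JSON value.
    private final Map<String, String> values = new LinkedHashMap<>();

    // Typed setters keep the Java-facing API unchanged for callers.
    JsonParameters setPrompt(String prompt) {
        values.put("prompt", quote(prompt));
        return this;
    }

    JsonParameters setTemperature(double temperature) {
        if (Double.isNaN(temperature) || Double.isInfinite(temperature)) {
            throw new IllegalArgumentException("temperature must be finite");
        }
        values.put("temperature", Double.toString(temperature));
        return this;
    }

    JsonParameters setNPredict(int nPredict) {
        values.put("n_predict", Integer.toString(nPredict));
        return this;
    }

    // Produces one JSON object; a native method could accept this as a
    // single String argument instead of reading many typed fields.
    String toJson() {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : values.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(quote(e.getKey())).append(':').append(e.getValue());
            first = false;
        }
        return sb.append('}').toString();
    }

    // Encodes a Java string as a JSON string literal.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters need a \u escape.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }
}
```

With this sketch, `new JsonParameters().setPrompt("Hello").setNPredict(32).toJson()` yields `{"prompt":"Hello","n_predict":32}`. Adding a parameter then only needs a new setter on the Java side, with no matching field-by-field JNI glue, which is the maintainability benefit described above.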