mukel / llama3.java

Practical Llama 3 inference in Java
MIT License

Bundle all standalone models in a single project. #8

Open mukel opened 3 months ago

mukel commented 3 months ago

So far I've ported the following models to Java: Llama 3 & 3.1, Mistral/Codestral/Mathstral/Nemostral (+ the Tekken tokenizer), Qwen2, Phi3, and Gemma 1 & 2. Each model is bundled as a single ~2K-line Java file with no dependencies. At this point, maintaining all of these is a burden; many components could be shared, e.g. the GGUF parser, tensors, tokenizers, samplers, chat formats, and even the inference code itself (e.g. it is the same for Mistral and Llama). It would be great to have all of them in a single project.

All behind a common, low-level inference API, e.g. a forward implementation. Note that this is not meant to be a high-level abstraction like langchain4j, but a low-level inference engine that langchain4j could use as a backend.
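To make the idea concrete, here is a minimal sketch of what such a shared low-level API could look like. All names here (`Tokenizer`, `Sampler`, `Model`, `forward`) are illustrative assumptions, not the project's actual interfaces:

```java
import java.util.List;

public class InferenceApiSketch {

    /** Maps between text and token ids; shareable across all model ports. */
    interface Tokenizer {
        List<Integer> encode(String text);
        String decode(List<Integer> tokens);
    }

    /** Picks the next token id from a vector of next-token logits. */
    interface Sampler {
        int sampleToken(float[] logits);
    }

    /** Low-level model: one forward pass for a token at a position,
     *  returning the logits over the vocabulary. */
    interface Model {
        float[] forward(int token, int position);
    }

    /** Trivial greedy (argmax) sampler as an example implementation. */
    static final Sampler GREEDY = logits -> {
        int best = 0;
        for (int i = 1; i < logits.length; i++) {
            if (logits[i] > logits[best]) best = i;
        }
        return best;
    };

    public static void main(String[] args) {
        // Hypothetical logits for a 4-token vocabulary; index 1 is largest.
        float[] logits = {0.1f, 2.5f, -1.0f, 0.7f};
        System.out.println(GREEDY.sampleToken(logits)); // prints 1
    }
}
```

With interfaces like these, the GGUF parser, tokenizers, samplers, and chat formats live in one place, and each model port only supplies its own `forward` implementation.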

geoand commented 1 week ago

> Note that this is not meant high-level abstraction like langchain4j, but a low-level inference engine that can be used as a backend by langchain4j

I am interested in building the latter based on this :). Is there anything I should be aware of?