This PR adds a circular FIFO service and a Llama service. The FIFO can be used to store short-term conversational memory for LLM assistants like ChatGPT or Llama-based local models.
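For reviewers who want a picture of the idea, here is a minimal sketch of a circular FIFO used as short-term conversational memory. It is not the service code from this PR: the class name `CircularFifo` and the methods `add`, `snapshot`, and `size` are hypothetical and chosen only for illustration. The sketch assumes the buffer has a fixed capacity and drops the oldest entry once it is full.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Objects;

/** Hypothetical illustration of a fixed-capacity FIFO; not the class added by this PR. */
final class CircularFifo<T> {
    private final Deque<T> items = new ArrayDeque<>();
    private final int capacity;

    CircularFifo(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Appends an item, evicting the oldest one first if the buffer is full. */
    void add(T item) {
        Objects.requireNonNull(item, "item");
        if (items.size() == capacity) {
            items.removeFirst();
        }
        items.addLast(item);
    }

    /** Returns a copy of the retained items, oldest first. */
    List<T> snapshot() {
        return new ArrayList<>(items);
    }

    /** Returns how many items are currently retained. */
    int size() {
        return items.size();
    }
}
```

With a capacity of 2, adding three chat turns in order leaves only the last two in `snapshot()`, so the prompt sent to the model always carries the most recent turns and never grows without bound.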
The Llama service is an experimental service I added just for proof-of-concept testing. It uses the new java-llama-cpp library to call into a compiled llama.cpp library to perform inference. Yes, the dependency uses JNA; I just quickly whipped it up so I could run some tests with the FIFO service. Eventually, it will be replaced with a Python-based service.
Currently, the Llama service requires Java 13 or higher, and the user must supply their own libllama.so library file.