This is a very simple Flask application that provides an OpenAI-compatible chat completions API (/v1/chat/completions) for other large language models.
Very useful if you run tests or have lots of Collaborative Agent Modules running :-)
It currently supports Llama2, Mistral-7b and RWKV, since these models run fairly easily on local hardware, which makes them a good fit for the agent use case.
Streaming is supported as well.
python3 -m venv venv
source venv/bin/activate    # on Windows: venv\Scripts\activate
pip install -r requirements.txt
ln -s /mnt/ssd/models/rwkv models/rwkv
python app.py
curl http://localhost:5000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer WE_DONT_NEED_NO_STINKING_TOKENS" \
-d '{
"model": "mistral-7b-instruct",
"messages": [{"role": "user", "content": "Hello!"}]
}'
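
The same request can be sent from Python. The sketch below uses only the standard library and reuses the URL, headers and body from the curl example. The streaming part is an assumption: it guesses that streaming is switched on with "stream": true and that the server replies with "data: {...}" lines ending in "data: [DONE]", as OpenAI-style APIs usually do. This README does not confirm either detail, so check app.py before relying on it. The function names are made up for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # host and port from the curl example


def build_chat_request(prompt, model="mistral-7b-instruct", stream=False):
    """Build the same POST request as the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if stream:
        # Assumption: streaming is requested with an OpenAI-style "stream" flag.
        payload["stream"] = True
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer WE_DONT_NEED_NO_STINKING_TOKENS",
        },
        method="POST",
    )


def parse_stream_line(line):
    """Decode one 'data: {...}' line; return None for other lines and for [DONE]."""
    text = line.decode("utf-8").strip()
    if not text.startswith("data:"):
        return None
    body = text[len("data:"):].strip()
    if body == "[DONE]":  # assumed end-of-stream marker
        return None
    return json.loads(body)


def send_chat(prompt):
    """Send a non-streaming request and return the parsed JSON reply."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        return json.load(resp)


def stream_chat(prompt):
    """Yield decoded chunks as they arrive (assumed 'data:' line format)."""
    with urllib.request.urlopen(build_chat_request(prompt, stream=True)) as resp:
        for raw in resp:
            chunk = parse_stream_line(raw)
            if chunk is not None:
                yield chunk
```

With the server running, print(send_chat("Hello!")) should print the full reply, and looping over stream_chat("Hello!") should print chunks one by one if the streaming assumptions hold.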