digitalfabrik / integreat-chat

Interface to self-hosted large language models and vector databases to provide improved Integreat Chat functionality
https://integreat-app.de
MIT License

Switch to vLLM #58

Open svenseeberg opened 1 month ago

svenseeberg commented 1 month ago

Ollama has limitations, for example in which models are available and in how it loads them. We may want to switch to vLLM.

svenseeberg commented 1 month ago

Almost done: https://git.verdigado.com/verdigado-Privileged/Salt/pulls/2226

However, we still need to find models that fit into the graphics card's memory (VRAM).
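
A rough way to shortlist candidates is to estimate the weight footprint from the parameter count and the numeric precision. The sketch below is only a back-of-the-envelope check: the function names, the 20% overhead margin for KV cache and activations, and the example model sizes are all assumptions for illustration, not values from this project or from vLLM.

```python
def estimate_weight_gib(num_params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params_billion * 1e9 * bytes_per_param / 2**30


def fits_in_vram(
    num_params_billion: float,
    bytes_per_param: float,
    vram_gib: float,
    overhead_fraction: float = 0.2,  # assumed margin for KV cache and activations
) -> bool:
    """Rough check: weights plus an assumed overhead margin must fit in VRAM."""
    needed = estimate_weight_gib(num_params_billion, bytes_per_param)
    needed *= 1 + overhead_fraction
    return needed <= vram_gib


if __name__ == "__main__":
    # Hypothetical candidates: (label, parameters in billions, bytes per parameter)
    candidates = [
        ("7B, 16-bit", 7, 2.0),
        ("7B, 4-bit quantized", 7, 0.5),
        ("70B, 4-bit quantized", 70, 0.5),
    ]
    vram_gib = 24  # assumed card size, adjust to the actual hardware
    for label, params, width in candidates:
        size = estimate_weight_gib(params, width)
        verdict = "fits" if fits_in_vram(params, width, vram_gib) else "too large"
        print(f"{label}: ~{size:.1f} GiB weights, {verdict} in {vram_gib} GiB")
```

Real usage will differ: vLLM preallocates KV cache up to its configured GPU memory utilization fraction, and the context length strongly affects how much memory is needed beyond the weights, so any shortlisted model should still be tested on the actual card.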