weaviate / Verba

Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
BSD 3-Clause "New" or "Revised" License

Using ollama in docker compose needs separate models for embeddings and chat #171

Closed kjeldahl closed 4 weeks ago

kjeldahl commented 4 months ago

Description

When running Verba via docker compose with OLLAMA_MODEL set to llama3, it is not possible to add documents: the Ollama server returns 404 on /api/embeddings. It is also not possible to get a chat response, because the Ollama server returns a similar error for that request. If I change OLLAMA_MODEL to mxbai-embed-large I can import documents and get chunks back as responses, but still no chat answer, as the Ollama server returns 404 on /api/chat.

Is this a bug or a feature?

Steps to Reproduce

1. Clone the project.
2. Add the following to `.env`:

   ```
   OLLAMA_URL=http://localhost:11434
   OLLAMA_MODEL=llama3
   ```

3. Run `docker compose up`.
4. Start the Ollama server with `ollama serve`.
5. Open http://localhost:8000.
6. Enter something in chat and see it fail with `'NoneType' object is not iterable`. The Ollama server log shows the corresponding error: `| 404 | 577.333µs | 127.0.0.1 | POST "/api/embeddings"`.
7. Try to import a document and see a list of 404s on /api/embeddings in the Ollama server.
8. Stop Verba, change OLLAMA_MODEL to mxbai-embed-large, and run `docker compose up` again.
9. Import a document and see a lot of 200 responses in the Ollama server.
10. Type something in chat: the "Generating response" widget loops forever and the Ollama server returns 404 on /api/chat. Chunks are, however, returned. (The same 404s can be reproduced directly against the Ollama API; see the curl sketch below.)
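The errors can be checked against Ollama directly, independent of Verba. A minimal sketch, assuming a default Ollama install listening on localhost:11434 and the model names used in the steps above:

```sh
# List the model tags Ollama actually has installed; the tag must match OLLAMA_MODEL exactly
curl http://localhost:11434/api/tags

# Request an embedding; this is the call that returned 404 in the first scenario
curl http://localhost:11434/api/embeddings \
  -d '{"model": "llama3", "prompt": "hello"}'

# Request a chat completion; this is the call that returned 404 with mxbai-embed-large
curl http://localhost:11434/api/chat \
  -d '{"model": "mxbai-embed-large", "messages": [{"role": "user", "content": "hello"}]}'
```

If both calls succeed with the configured model, the 404s point at a missing or mismatched model tag rather than at Verba itself.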

I think the solution is to be able to specify an embedding model along with the chat model.
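For illustration, a split configuration in `.env` could look roughly like this. Note that `OLLAMA_EMBED_MODEL` is only a hypothetical name for the proposed variable, not an existing Verba setting at the time of this issue:

```sh
OLLAMA_URL=http://localhost:11434
# model used for chat/generation (/api/chat)
OLLAMA_MODEL=llama3
# hypothetical separate model used for embeddings (/api/embeddings)
OLLAMA_EMBED_MODEL=mxbai-embed-large
```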

thomashacker commented 4 months ago

Thanks for the issue! Adding a new environment variable to split Generation and Embedding Model for Ollama is definitely on the list. We'll look into that 🚀

cha0s commented 4 months ago

I think this is a classic XY problem, because the `'NoneType' object is not iterable` error you saw seems to be plaguing many people. I don't think the immediate solution is to add more configuration, although that would be cool to have!

kjeldahl commented 4 months ago

@cha0s I have added a pull request which solves the problem, for me anyway.

Part of the problem, besides trying to chat with an embedding model, was that I had not installed llama3 in Ollama but llama3:instruct, which caused an error that was not reported in the UI.
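A quick way to rule out that kind of mismatch is to compare the installed tags against the value of OLLAMA_MODEL before starting Verba (standard Ollama CLI commands):

```sh
# Show exactly which model tags are installed (e.g. llama3:instruct vs. llama3)
ollama list

# Pull the exact tag that OLLAMA_MODEL points to
ollama pull llama3
```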

kjeldahl commented 4 months ago

@cha0s If I use llama3 it works for both chat and embedding without having separate models.

thomashacker commented 4 weeks ago

We separated both in the newest release!