danswer-ai / danswer

Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
https://danswer.ai

Issues with Ollama Mistral Integration: Search Queries Failing and Knowledge Chat Returning Blank Responses #2821

Open vorsyybl opened 1 week ago

vorsyybl commented 1 week ago

Hello,

I’m trying to get Mistral to answer questions based on connected and indexed Zendesk articles within Danswer, running locally through Ollama.

My setup is as follows:

- Docker containers from within the docker_compose folder, including the inference server, api-server, etc.
- An Ollama container with Mistral at local port 11434 (see the quick connectivity check below).
- I fill in the Custom fields to get past the initial dialog box when you first connect to local port 3000, where Danswer is. It seems to connect, since there's no error at this point.
- When I ask a question in Search, it pulls up the articles, but the AI throws an error. When I ask a question in Knowledge Chat, it thinks for a minute, then resets and returns a blank, nothing.
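For reference, this is roughly the kind of check I use to confirm the Ollama container responds at all. A minimal sketch assuming the default Ollama REST API and port, and that the mistral model has been pulled; adjust names for your setup:

```python
import requests

# Sanity check that the Ollama container answers at :11434.
# Assumes the stock Ollama REST API and that "mistral" is pulled.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "mistral", "prompt": "Say hello.", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```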

I've tried:

- Changing environment variables.
- Allocating more resources to the containers.
- Trying the above on a different machine.
- Checking the logs in the api-server and ollama containers for any clues.

Any ideas would be appreciated, thanks.

rkuo-danswer commented 2 days ago

Are there any logs related to your issues in the API server?

Typically, if the error is AI-related, we want the env var LOG_DANSWER_MODEL_INTERACTIONS set to True; then look in the logs for potential clues.
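On a docker compose deployment, this usually means adding the variable to the .env file next to the compose files and recreating the stack. Treat this as a sketch, since the file location can vary by version:

```
# .env in the docker_compose directory (location may vary by version)
LOG_DANSWER_MODEL_INTERACTIONS=True
```

Then run `docker compose up -d` again so the api-server picks up the new environment.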

vorsyybl commented 1 day ago

Hey man, thanks for your response! I'm determined to get this setup working.

I'll try the .env approach suggested in the guide first, adding this variable there, then inspect the api-server logs for any additional details about what's happening when I punch in a question.

Will update with results.

EDIT Additional info:

I'm going to try a different model within the Ollama container, Llama 2, and see if I get the same issues. I should also add that the dataset is ~900 articles through a Zendesk connector. Not sure if that's too much; it doesn't seem like it would be, but it could be.

Same issues: the ollama container where Mistral runs logs POST 200 when I ask Knowledge Chat a question, but the api-server then receives the data as None / non-string. I'm looking at the processed_streamed_output(self) function in answer.py and create_chat_chain in chat_utils.py, adding logger statements to see if I can figure out why (roughly as in the sketch below).
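The logging pattern itself is simple. Here's a self-contained sketch of the idea; the function below is a stand-in, not Danswer's actual processed_streamed_output, and the fake stream just mimics the None-in-stream failure mode:

```python
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("danswer_debug")

def log_streamed_output(stream):
    # Stand-in for instrumenting Danswer's streamed-output path:
    # log every packet as it arrives so None / non-string values show up.
    for packet in stream:
        logger.debug("packet type=%s value=%r", type(packet).__name__, packet)
        if not isinstance(packet, str):
            logger.warning("non-string packet in LLM stream: %r", packet)
        yield packet

# Fake stream that mimics the failure mode (a None in the middle)
for _ in log_streamed_output(["Hello", " world", None]):
    pass
```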

UPDATE: Using the logger method, I was able to see that for the 1 or 2 questions the general model would answer, each word in the stream was logged to the console. Before that, Danswer loaded the chat session and id without issue. That's when I thought it might be a resource issue: I have this set up locally on a laptop with 32 GB of RAM and a 12th-gen Intel Core i5-1235U at 1.3 GHz. I would have thought that was enough, and I allocated most of it to Docker, but Docker adds an extra layer of abstraction (it's a virtualized environment), so it's probably not getting as much as it should.

So I tried a few API calls to the Hugging Face API within its free quota, same model (Mistral), and Knowledge Chat responded really well. It's actually a great tool now that I've seen it work, answering questions based on the Zendesk articles I connected to it. In my opinion, that rules out any issues with the Danswer code that need changing, other than more print statements to track data flow. I was able to connect Danswer to a model locally (the Ollama container, even though it doesn't work) and also to the cloud via API key; doesn't that narrow it down to the resources of the local setup?
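For reference, the test calls looked roughly like this. A sketch only: the exact model id is an assumption (swap in whichever Mistral variant you use), and the free tier may rate-limit:

```python
import os
import requests

# Hypothetical test against the Hugging Face Inference API free tier.
# Requires an HF token in the HF_API_TOKEN environment variable.
API_URL = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.2"
headers = {"Authorization": f"Bearer {os.environ['HF_API_TOKEN']}"}

resp = requests.post(
    API_URL,
    headers=headers,
    json={"inputs": "What does the Zendesk connector index?"},
    timeout=120,
)
resp.raise_for_status()
print(resp.json())
```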

Next I'm going to try connecting Danswer to the Ollama desktop app, which you download and install so it accesses the laptop's resources directly, and then to try downloading a free model from Hugging Face and loading it into the inference-server container, and see if that works.
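With that setup, the containers would need to reach Ollama on the host rather than on the Docker network. A sketch of the check I'd run, assuming Docker Desktop's host.docker.internal alias (on Linux this typically needs an extra-hosts entry instead):

```python
import requests

# From inside a container, the native Ollama app on the host is usually
# reachable via host.docker.internal rather than localhost. This is an
# assumption about the target setup, not something from the Danswer docs.
resp = requests.post(
    "http://host.docker.internal:11434/api/generate",
    json={"model": "mistral", "prompt": "ping", "stream": False},
    timeout=120,
)
print(resp.status_code, resp.json().get("response"))
```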

I'm really glad, however, to finally see Danswer in action after weeks of troubleshooting the Ollama Docker setup and getting the same "NON STRING" / "RETURNED NONE" errors. Those come from main.py and answer.py.