danswer-ai / danswer

Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
https://docs.danswer.dev/

Search fails to generate AI answer using embedding w/ Ollama and Mistral as of 0.3.37 #1159

Open adut opened 6 months ago

adut commented 6 months ago

I have created multiple connectors (Web, Jira, Confluence, Local Files, GitHub), and all were working with embedding and generating answers when using search, e.g. "summarize X" or "explain Y", in v0.3.36.

Here is an example:

[screenshot: search results with an AI answer]

It would display the search results and an AI answer using embedding.

As of 0.3.37 it still shows the search results, but the AI answer only shows "Information Not Found".

[screenshot: AI answer showing "Information Not Found"]

I am running the docker-compose deploy on my local machine.

Below is the output from the api-server in my log file for the failing query; not sure if that's helpful. Let me know if there are any other logs I can pull or information I can provide that would be helpful:

2024-03-01 13:19:07 03/01/2024 09:19:07 PM              main.py 243 : Using LLM Provider: ollama
2024-03-01 13:19:07 03/01/2024 09:19:07 PM              main.py 245 : Using LLM Model Version: mistral
2024-03-01 13:19:07 03/01/2024 09:19:07 PM              main.py 249 : Using LLM Endpoint: http://host.docker.internal:11434
2024-03-01 13:19:07 03/01/2024 09:19:07 PM          chat_llm.py  49 : LLM Model Class: ChatLiteLLM, Model Config: {'model': 'ollama/mistral', 'api_base': 'http://host.docker.internal:11434', 'request_timeout': 240.0, 'model_kwargs': {'frequency_penalty': 0, 'presence_penalty': 0}, 'n': 1, 'max_tokens': 1024}
2024-03-01 13:19:07 03/01/2024 09:19:07 PM              main.py 275 : Using Embedding model: "intfloat/e5-base-v2"
2024-03-01 13:19:07 03/01/2024 09:19:07 PM              main.py 277 : Query embedding prefix: "query: "
2024-03-01 13:19:07 03/01/2024 09:19:07 PM              main.py 280 : Passage embedding prefix: "passage: "
2024-03-01 13:19:07 03/01/2024 09:19:07 PM              main.py 292 : Warming up local NLP models.
2024-03-01 13:19:07 /usr/local/lib/python3.11/site-packages/transformers/utils/hub.py:128: FutureWarning: Using `DISABLE_TELEMETRY` is deprecated and will be removed in v5 of Transformers. Use `HF_HUB_DISABLE_TELEMETRY` instead.
2024-03-01 13:19:07   warnings.warn(
2024-03-01 13:19:09 03/01/2024 09:19:09 PM search_nlp_models.py 103 : Loading intfloat/e5-base-v2
2024-03-01 13:19:13 All model checkpoint layers were used when initializing TFDistilBertForSequenceClassification.
2024-03-01 13:19:13 
2024-03-01 13:19:13 All the layers of TFDistilBertForSequenceClassification were initialized from the model checkpoint at danswer/intent-model.
2024-03-01 13:19:13 If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.
2024-03-01 13:19:13 03/01/2024 09:19:13 PM              main.py 302 : GPU is not available
2024-03-01 13:19:13 03/01/2024 09:19:13 PM              main.py 303 : Torch Threads: 10
2024-03-01 13:19:13 03/01/2024 09:19:13 PM              main.py 305 : Verifying query preprocessing (NLTK) data is downloaded
2024-03-01 13:19:13 03/01/2024 09:19:13 PM              main.py 310 : Verifying default connector/credential exist.
2024-03-01 13:19:13 03/01/2024 09:19:13 PM              main.py 315 : Loading default Prompts and Personas
2024-03-01 13:19:13 03/01/2024 09:19:13 PM              main.py 319 : Verifying Document Index(s) is/are available.
2024-03-01 13:19:14 INFO:     Application startup complete.
2024-03-01 13:19:14 INFO:     Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
2024-03-01 13:19:04 Starting Danswer Api Server
2024-03-01 13:19:19 INFO:     192.168.16.7:55858 - "GET /health HTTP/1.1" 200 OK
2024-03-01 13:19:19 INFO:     192.168.16.6:40476 - "GET /manage/me HTTP/1.1" 403 Forbidden
2024-03-01 13:19:19 INFO:     192.168.16.6:40462 - "GET /auth/type HTTP/1.1" 200 OK
2024-03-01 13:19:19 INFO:     192.168.16.6:40486 - "GET /manage/document-set HTTP/1.1" 200 OK
2024-03-01 13:19:19 INFO:     192.168.16.6:40510 - "GET /query/valid-tags HTTP/1.1" 200 OK
2024-03-01 13:19:19 INFO:     192.168.16.6:40520 - "GET /secondary-index/get-embedding-models HTTP/1.1" 200 OK
2024-03-01 13:19:19 INFO:     192.168.16.6:40502 - "GET /persona HTTP/1.1" 200 OK
2024-03-01 13:19:19 INFO:     192.168.16.6:40484 - "GET /manage/indexing-status HTTP/1.1" 200 OK
2024-03-01 13:19:19 INFO:     192.168.16.7:55884 - "GET /health HTTP/1.1" 200 OK
2024-03-01 13:19:19 INFO:     192.168.16.6:40462 - "GET /auth/type HTTP/1.1" 200 OK
2024-03-01 13:19:19 INFO:     192.168.16.6:40476 - "GET /manage/me HTTP/1.1" 403 Forbidden
2024-03-01 13:19:19 INFO:     192.168.16.6:40520 - "GET /secondary-index/get-embedding-models HTTP/1.1" 200 OK
2024-03-01 13:19:19 INFO:     192.168.16.6:40510 - "GET /query/valid-tags HTTP/1.1" 200 OK
2024-03-01 13:19:19 INFO:     192.168.16.6:40486 - "GET /manage/document-set HTTP/1.1" 200 OK
2024-03-01 13:19:19 INFO:     192.168.16.6:40502 - "GET /persona HTTP/1.1" 200 OK
2024-03-01 13:19:19 INFO:     192.168.16.6:40484 - "GET /manage/indexing-status HTTP/1.1" 200 OK
2024-03-01 13:19:34 INFO:     192.168.16.7:55886 - "POST /query/stream-query-validation HTTP/1.1" 200 OK
2024-03-01 13:19:34 INFO:     192.168.16.7:55888 - "POST /query/stream-answer-with-quote HTTP/1.1" 200 OK
2024-03-01 13:19:38 INFO:     192.168.16.7:55868 - "GET /manage/admin/genai-api-key/validate HTTP/1.1" 200 OK
2024-03-01 13:19:34 03/01/2024 09:19:34 PM     query_backend.py 154 : Received query for one shot answer with quotes: when did the maui wildfires occur
2024-03-01 13:19:34 03/01/2024 09:19:34 PM     query_backend.py 141 : Validating query: when did the maui wildfires occur
2024-03-01 13:19:34 03/01/2024 09:19:34 PM            timing.py  31 : retrieval_preprocessing took 0.0870513916015625 seconds
2024-03-01 13:19:35 03/01/2024 09:19:35 PM            timing.py  31 : doc_index_retrieval took 0.5002865791320801 seconds
2024-03-01 13:19:35 03/01/2024 09:19:35 PM     search_runner.py  55 : Top links from hybrid search: https://en.wikipedia.org/wiki/2023_Hawaii_wildfires [same link repeated dozens of times, once per retrieved chunk; truncated here]
2024-03-01 13:20:00 03/01/2024 09:20:00 PM          qa_utils.py 204 : answer=None
2024-03-01 13:20:00 03/01/2024 09:20:00 PM            timing.py  66 : stream_search_answer took 25.94691038131714 seconds
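(Editor's aside: the "Top links" log line above prints one entry per retrieved chunk, so a single page appears dozens of times. When eyeballing such logs, deduplicating while preserving order makes the result set readable. A purely illustrative sketch; the `links` list below is a stand-in, not danswer's actual data structure:)

```python
# Stand-in for the per-chunk link list seen in the log above (illustrative only)
links = ["https://en.wikipedia.org/wiki/2023_Hawaii_wildfires"] * 40

# dict.fromkeys keeps first-seen order while dropping duplicates
unique_links = list(dict.fromkeys(links))
print(unique_links)  # a single-element list
```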
yuhongsun96 commented 5 months ago

Hello, looks like some things changed since the last time I tested with Ollama, brought things up to date and it should be working now. Guide also updated: https://docs.danswer.dev/gen_ai_configs/ollama
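(Editor's note: for readers landing here, the LLM settings visible in the startup log above correspond to environment variables along these lines. This is a sketch inferred from the log's wording; verify the exact variable names against the linked guide.)

```shell
# Sketch of an Ollama .env fragment for the docker-compose deploy.
# Values mirror the startup log in the issue; names may differ by version.
GEN_AI_MODEL_PROVIDER=ollama
GEN_AI_MODEL_VERSION=mistral
GEN_AI_API_ENDPOINT=http://host.docker.internal:11434
```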

Will just mention, though, that the Ollama model options are still weaker than the OpenAI ones. If possible, we suggest using GPT-4 or GPT-4-Turbo. With the guide above, a lot of features are turned off, which impacts the quality of the experience.

Iriamu19 commented 5 months ago

Hi @yuhongsun96,

Could you explain why many features need to be disabled when using Ollama? If I understood correctly, based on the .env file mentioned in the Ollama documentation, the prompts are not properly optimized for models other than GPT-4, right? Would it be challenging to adjust these prompts for a self-hosted model? It would be great to be able to use Mixtral 8x7B, for instance, whose performance is comparable to GPT-3.5's. These improvements would let users run Danswer entirely on-premises.

Thanks for your work!

dmikulin-dwave commented 3 months ago

I'm experiencing the same Ollama problem running the latest versions of Ollama and the latest danswer code. It's frustrating because the information is clearly right there in the search results below the text box. Periodically, I also get three 500 errors in a row near the initial set of calls to Ollama:

[GIN] 2024/05/07 - 18:02:12 | 500 | 49.822208ms | 127.0.0.1 | POST "/api/generate"
[GIN] 2024/05/07 - 18:02:12 | 500 | 68.797667ms | 127.0.0.1 | POST "/api/generate"
[GIN] 2024/05/07 - 18:02:12 | 500 | 38.366917ms | 127.0.0.1 | POST "/api/generate"
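(Editor's note: one way to check whether those 500s originate in Ollama itself rather than in danswer's request handling is to hit Ollama's generate endpoint directly. This assumes Ollama is on its default port 11434 with the `mistral` model already pulled:)

```shell
# Call Ollama's generate API directly; a 500 here reproduces the failure
# outside of danswer, pointing the blame at Ollama or the model.
curl -s http://localhost:11434/api/generate \
  -d '{"model": "mistral", "prompt": "Say hello", "stream": false}'
```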