QuivrHQ / quivr

Open-source RAG framework for building GenAI second brains 🧠 Build a productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) & apps using Langchain, GPT-3.5/4-turbo, Anthropic, VertexAI, Ollama, Groq, and other LLMs, privately, and share it with users! An efficient retrieval-augmented generation framework.
https://quivr.com

[Bug]: it's still using OpenAI after setting Ollama in the .env file #2599

Open · esponiyo opened 1 month ago

esponiyo commented 1 month ago

What happened?

Modified the .env file to use Ollama (a sketch of the relevant lines follows this list):

1. Set a fake key to skip the OpenAI integration
2. Set OLLAMA_API_BASE_URL=http://host.docker.internal:11434
3. Ran docker compose again with a rebuild
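For reference, a minimal sketch of the relevant .env lines. The variable name OPENAI_API_KEY and the fake key value are assumptions based on Quivr's .env.example; OLLAMA_API_BASE_URL is the setting named above:

```env
# Placeholder key so the stack starts without a real OpenAI account
# (assumed variable name from .env.example; the value is fake)
OPENAI_API_KEY=tk-fake-key
# Point the backend at the local Ollama server from inside Docker
OLLAMA_API_BASE_URL=http://host.docker.internal:11434
```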

Checks:

1. The Ollama server is running, verified from the command line (see the check below)
2. Quivr: embeddings for uploaded docs are generated with Ollama
3. Quivr: chat still tries OpenAI, which of course errors in the frontend because of the fake API key
4. The error log (attached) shows the obvious
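One common way to run that command-line check is against Ollama's standard REST endpoint /api/tags (this is Ollama's public API, not Quivr-specific; a JSON list of models means the server is reachable):

```sh
# Lists the models available on the Ollama server
curl http://host.docker.internal:11434/api/tags
```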

Relevant log output

```
backend-core  |     | litellm.exceptions.AuthenticationError: OpenAIException - Error code: 401 - {'error': {'message': 'Incorrect API key provided: tk-aabbc***************************************qQqQ. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}
backend-core  |     | Model: gpt-3.5-turbo-0125
backend-core  |     | API Base: https://api.openai.com
```

NOTE:

This seems to be a bug and, worse, a security fault: if the API key were valid, chunks of private documents would be uploaded to OpenAI as part of the normal behavior of a RAG flow.

dosubot[bot] commented 1 month ago

The issue you're encountering, where the chat feature still attempts to use OpenAI despite OLLAMA_API_BASE_URL being set in your .env file, is likely related to how the application switches between AI services. First, ensure that the OLLAMA_API_BASE_URL line in your .env file is not commented out; this is crucial for activating the Ollama configuration.

If the OLLAMA_API_BASE_URL is correctly set and the problem persists, the issue could be with the application logic, particularly in how it decides which AI service to use for the chat feature. The application might still default to OpenAI if it doesn't recognize the model name as compatible with Ollama or if the model name specifically indicates an OpenAI model, such as gpt-3.5-turbo-0125.

The error log you've shared indicates an authentication error with OpenAI, suggesting that the application is indeed attempting to use OpenAI's services, likely due to the reasons mentioned above. To resolve this, verify that the model you're using for the chat feature is compatible with Ollama and that the application's logic correctly switches to using Ollama when the OLLAMA_API_BASE_URL is set.

If after these checks the issue remains, it might be helpful to review the application's code related to the AI service selection process, particularly in the QuivrRAG class and its methods for creating language model instances and generating answers. This could provide further insights into why the application is not switching to Ollama as expected.
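To make that failure mode concrete, here is a hypothetical sketch, not Quivr's actual code, of the kind of selection logic described above: unless the configured model name explicitly opts into Ollama, everything silently falls back to OpenAI.

```python
import os


def resolve_model(model_name: str) -> tuple[str, str | None]:
    """Hypothetical helper: return (litellm_model, api_base) for the chat model."""
    ollama_base = os.getenv("OLLAMA_API_BASE_URL")
    # Route to Ollama only when the model name explicitly opts in.
    if ollama_base and model_name.startswith("ollama/"):
        return model_name, ollama_base
    # Any other name (e.g. "gpt-3.5-turbo-0125") defaults to OpenAI,
    # which matches the 401 in the log above.
    return model_name, None
```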

Let me know if you need further assistance or if there's more information you can provide to help diagnose the issue.

#### Details

**Sources**

- [.env.example](https://github.com/QuivrHQ/quivr/blob/main/.env.example)
- [frontend/app/user/components/ApiKeyConfig/hooks/useApiKeyConfig.ts](https://github.com/QuivrHQ/quivr/blob/main/frontend/app/user/components/ApiKeyConfig/hooks/useApiKeyConfig.ts)
- [backend/modules/brain/rags/quivr_rag.py](https://github.com/QuivrHQ/quivr/blob/main/backend/modules/brain/rags/quivr_rag.py)
**About Dosu**

This response is meant to be useful and save you time. It is **not meant to be a precise solution**, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

To continue the conversation, mention @dosu.

esponiyo commented 1 month ago

Please consider this issue a possible security fault, and a serious one: a user testing with private documents could expose chunks of those documents to OpenAI while believing that all data stays on-premise.

mkhludnev commented 3 weeks ago

My understanding is that you need to configure the model as described at https://docs.quivr.app/configuring/new-model, and that the model name should be prefixed with ollama/. In that case LiteLLM routes requests to the Ollama URL.
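For illustration, LiteLLM's provider routing keys off that prefix. A minimal sketch using LiteLLM's public completion API (the model name llama3 is just an example; use whatever model is pulled in Ollama):

```python
from litellm import completion

# The "ollama/" prefix tells LiteLLM to call the Ollama provider
# instead of defaulting to api.openai.com.
response = completion(
    model="ollama/llama3",                         # example model name
    api_base="http://host.docker.internal:11434",  # value of OLLAMA_API_BASE_URL
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```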