Open esponiyo opened 1 month ago
It looks like the issue you're encountering, with the chat feature still attempting to use OpenAI despite setting `OLLAMA_API_BASE_URL` in your `.env` file, might be related to how the application is configured to switch between AI services. Ensure that the `OLLAMA_API_BASE_URL` line in your `.env` file is not commented out; this is crucial for activating the Ollama configuration.
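A minimal sketch of that check, assuming a `.env` in the project root (the URL is an example value using the Docker-internal host; adjust it to your setup):

```shell
# Minimal sketch: write an example .env and confirm the variable is active.
# The file path and URL are illustrative, not Quivr's required values.
cat > /tmp/example.env <<'EOF'
OLLAMA_API_BASE_URL=http://host.docker.internal:11434
EOF

# The variable only takes effect when the line is NOT commented out:
# '^' anchors the match at the start of the line, so '# OLLAMA_...' won't match.
if grep -qE '^OLLAMA_API_BASE_URL=' /tmp/example.env; then
    echo "OLLAMA_API_BASE_URL is active"
else
    echo "OLLAMA_API_BASE_URL is missing or commented out"
fi
```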
If `OLLAMA_API_BASE_URL` is correctly set and the problem persists, the issue could be with the application logic, particularly how it decides which AI service to use for the chat feature. The application might still default to OpenAI if it doesn't recognize the model name as compatible with Ollama, or if the model name specifically indicates an OpenAI model, such as `gpt-3.5-turbo-0125`.
The error log you've shared indicates an authentication error with OpenAI, confirming that the application is indeed attempting to use OpenAI's services, likely for the reasons mentioned above. To resolve this, verify that the model you're using for the chat feature is compatible with Ollama and that the application's logic correctly switches to Ollama when `OLLAMA_API_BASE_URL` is set.
If the issue remains after these checks, it may help to review the application's code for the AI service selection process, particularly the `QuivrRAG` class and its methods for creating language model instances and generating answers. This could provide further insight into why the application is not switching to Ollama as expected.
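The selection behavior described above can be sketched as a small routing function. This is a hypothetical illustration of the logic being described, not Quivr's actual code; the function name and prefix convention are assumptions:

```python
from typing import Optional

def select_backend(model_name: str, ollama_base_url: Optional[str]) -> str:
    """Hypothetical routing: use Ollama only when it is configured AND the
    model name carries the "ollama/" prefix; otherwise fall back to OpenAI."""
    if ollama_base_url and model_name.startswith("ollama/"):
        return "ollama"
    return "openai"

# A model named like an OpenAI model still routes to OpenAI even when the
# Ollama URL is set, matching the behavior reported in this issue.
print(select_backend("gpt-3.5-turbo-0125", "http://host.docker.internal:11434"))  # openai
print(select_backend("ollama/llama2", "http://host.docker.internal:11434"))       # ollama
```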
Let me know if you need further assistance or if there's more information you can provide to help diagnose the issue.
Users testing with private documents could unknowingly expose chunks of those documents to OpenAI while believing all their data stays on-premise.
My understanding is that you need to configure the model (https://docs.quivr.app/configuring/new-model) and prepend the model name with the `ollama/` prefix. LiteLLM then routes the request to the Ollama URL.
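A short sketch of that prefix convention. The helper function below is illustrative (not part of Quivr or LiteLLM), and the commented-out `litellm.completion` call assumes LiteLLM is installed and an Ollama server is reachable:

```python
def with_ollama_prefix(model_name: str) -> str:
    """Prepend "ollama/" so LiteLLM routes the call to an Ollama server.
    Illustrative helper; not part of Quivr or LiteLLM."""
    if model_name.startswith("ollama/"):
        return model_name
    return f"ollama/{model_name}"

print(with_ollama_prefix("llama2"))  # ollama/llama2

# With the prefix in place, LiteLLM dispatches to the Ollama api_base instead
# of OpenAI (requires `pip install litellm` and a running Ollama server):
# import litellm
# response = litellm.completion(
#     model=with_ollama_prefix("llama2"),
#     messages=[{"role": "user", "content": "hello"}],
#     api_base="http://host.docker.internal:11434",
# )
```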
What happened?
Modifying the `.env` file to use Ollama:

1. Use a fake key to skip the OpenAI integration
2. Set `OLLAMA_API_BASE_URL=http://host.docker.internal:11434`
3. Run compose again with a rebuild
Checks:

1. Ollama server is running, verified from the command line
2. Quivr: embeddings for uploaded docs are generated with Ollama
3. Quivr: chat still tries OpenAI, failing in the frontend because of the fake API key
4. Error log (attached) confirms the above
Relevant log output
NOTE:
This seems to be a bug and, worse, a security flaw: chunks of private documents could be uploaded to OpenAI (if the API key were valid) as part of the normal RAG flow.