The changes included in this commit:
Fix the incompatibility of Ollama and Groq JSON responses
Update the default model selection
Add support for third-party OpenAI-proxy servers
The incompatibility of Ollama and Groq JSON responses
This problem has been mentioned in issue 1, issue 2, and issue 3.
It is mainly caused by the function-calling support that instructor's structured (JSON) responses require by default, which does not work reliably with Groq's Llama and Ollama models (this looks like a litellm bug) when running expert search and generating related queries.
To solve this, it's suggested that the instructor response model use JSON mode rather than tools when the model is served by Groq or Ollama.
Based on my tests, this change stabilizes structured generation for Groq and Ollama.
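As a rough illustration of the fix, instructor can be switched from tool calling to JSON mode per provider. This is a minimal sketch, not the project's actual code: the helper name, the prefix check, and the Groq model id are assumptions.

```python
import instructor
import litellm
from pydantic import BaseModel


class RelatedQueries(BaseModel):
    """Structured output for the related-queries step."""
    queries: list[str]


def get_instructor_client(model: str) -> instructor.Instructor:
    # Groq and Ollama models reach litellm with a provider prefix; their
    # function-calling path is unreliable, so use plain JSON mode for them
    # and keep the default tool-calling mode for everything else.
    if model.startswith(("groq/", "ollama/", "ollama_chat/")):
        mode = instructor.Mode.JSON
    else:
        mode = instructor.Mode.TOOLS
    return instructor.from_litellm(litellm.completion, mode=mode)


# Example: structured generation against a Groq-served Llama model.
client = get_instructor_client("groq/llama-3.1-70b-versatile")
related = client.chat.completions.create(
    model="groq/llama-3.1-70b-versatile",
    response_model=RelatedQueries,
    messages=[{"role": "user", "content": "Suggest related queries for: vector databases"}],
)
```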
Update the default model selection
Change the default "fast" model from GPT-3.5-turbo to GPT-4o-mini, switching to a more capable model without increasing cost.
Update the Groq model from llama3-70b to llama3.1-70b, and the Ollama model from llama3 to llama3.1.
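For reference, the new defaults side by side with the old ones. This is a sketch: the constant names are placeholders, and the exact Groq model identifier is an assumption.

```python
# Placeholder constant names; only the model identifiers mirror the change.
DEFAULT_FAST_MODEL = "gpt-4o-mini"              # was "gpt-3.5-turbo"
DEFAULT_GROQ_MODEL = "llama-3.1-70b-versatile"  # was "llama3-70b-8192"
DEFAULT_OLLAMA_MODEL = "llama3.1"               # was "llama3"
```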
Third-party OpenAI-proxy server
Add support for third-party OpenAI-proxy servers by including the OPENAI_API_BASE environment variable in the docker-compose file.
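A minimal sketch of the docker-compose addition; the service name and the other environment entries are placeholders, and only the OPENAI_API_BASE line reflects this change.

```yaml
services:
  backend:  # placeholder service name
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      # Point the OpenAI client at a third-party proxy; falls back to the
      # official endpoint when the variable is unset.
      - OPENAI_API_BASE=${OPENAI_API_BASE:-https://api.openai.com/v1}
```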