Open nakroy opened 1 month ago
Judging by the error message, Xinference does not seem to expose an OpenAI-compatible API for 'Mistral-Nemo-Instruct-2407'. You could submit a feature request on the Xinference GitHub repository.
Alright, I may test llama-3.1 to see whether that model works fine. I will also check whether Xinference can fully support the Nemo models...
Is there an existing issue for the same bug?
Branch name
v0.12.0
Commit ID
na
Other environment information
Actual behavior
I created an agent for testing. When I run a test chat, the agent answers well (the LLM backend is Xinference and the model is Mistral-Nemo-Instruct-2407).
But when I call it through the API or start a Web App, it fails to answer the question and returns an error response: ERROR: An error occurred during streaming
Expected behavior
The API and the Web App should answer chats just as well as the basic test chat does.
Steps to reproduce
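The failure can be reproduced with a request like the sketch below. This is a minimal illustration, not the exact call I made: the endpoint path and payload shape follow Dify's standard chat-messages API, and the host and app key are placeholders. The key detail is `response_mode: "streaming"`, which is the mode that produces the error above.

```python
import json

# Placeholders -- substitute your own deployment's values.
API_URL = "http://localhost/v1/chat-messages"  # hypothetical host
API_KEY = "app-xxxxxxxx"                       # hypothetical app API key

# Payload for a streaming chat request; streaming mode triggers the error.
payload = {
    "inputs": {},
    "query": "Hello, can you answer a test question?",
    "response_mode": "streaming",
    "user": "test-user",
}
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

# Sending this payload (e.g. with requests.post(API_URL, json=payload,
# headers=headers)) returns: ERROR: An error occurred during streaming
print(json.dumps(payload))
```

The same agent answers normally in the built-in test chat, so the problem only appears on the streaming API path.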
Additional information
Xinference Worker Error log: