Closed: liuzhenghua closed this issue 2 months ago
The issue you're experiencing comes from the fact that the current xinference integration does not support streaming when tool calls are involved. This is confirmed by the comments in the test file and by the guard in the llm.py file:
"""
Function calling of xinference does not support stream mode currently
"""
if stream:
if tools and len(tools) > 0:
raise InvokeBadRequestError('xinference tool calls does not support stream mode')
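As a self-contained illustration of the failure mode this guard produces, here is a minimal sketch; the `invoke` function and its signature are hypothetical stand-ins written for this answer, not Dify's actual API:

```python
# Hypothetical stand-in for the guard above; InvokeBadRequestError and the
# invoke() signature are illustrative, not Dify's actual API.
class InvokeBadRequestError(Exception):
    pass


def invoke(stream: bool, tools: list | None = None) -> str:
    # Streaming alone is fine; streaming combined with tools raises the guard.
    if stream and tools and len(tools) > 0:
        raise InvokeBadRequestError('xinference tool calls does not support stream mode')
    return 'ok'


invoke(stream=True)                                    # fine: streaming without tools
invoke(stream=False, tools=[{'name': 'get_weather'}])  # fine: tools without streaming
try:
    invoke(stream=True, tools=[{'name': 'get_weather'}])
except InvokeBadRequestError as exc:
    print(exc)  # xinference tool calls does not support stream mode
```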
Additionally, the BaseAgentRunner class checks whether the model declares the STREAM_TOOL_CALL feature; models that do not declare it, xinference among them, have streaming tool calls disabled:
```python
# check if model supports stream tool call
llm_model = cast(LargeLanguageModel, model_instance.model_type_instance)
model_schema = llm_model.get_model_schema(model_instance.model, model_instance.credentials)
if model_schema and ModelFeature.STREAM_TOOL_CALL in (model_schema.features or []):
    self.stream_tool_call = True
else:
    self.stream_tool_call = False
```
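To make the interaction between the two checks concrete, here is a minimal runnable sketch; `resolve_stream` is a hypothetical helper written for this answer, and only the `ModelFeature` values mirror Dify's enum:

```python
from enum import Enum


class ModelFeature(Enum):
    TOOL_CALL = 'tool-call'
    STREAM_TOOL_CALL = 'stream-tool-call'


def resolve_stream(requested_stream: bool, features: list[ModelFeature], has_tools: bool) -> bool:
    # Mirrors the agent runner's logic sketched above: when tools are attached
    # and the model schema does not advertise STREAM_TOOL_CALL, fall back to
    # a blocking (non-streaming) invocation.
    if has_tools and ModelFeature.STREAM_TOOL_CALL not in features:
        return False
    return requested_stream


# An xinference-style schema lacking STREAM_TOOL_CALL: streaming is dropped
# as soon as tools are involved, but plain chat can still stream.
print(resolve_stream(True, [ModelFeature.TOOL_CALL], has_tools=True))   # False
print(resolve_stream(True, [ModelFeature.TOOL_CALL], has_tools=False))  # True
```

If xinference's model schema lacks STREAM_TOOL_CALL, as the snippets above indicate, this would also explain the observation later in this thread: a plain chat app can stream with xinference, while an agent app that uses tools cannot.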
Therefore, the inability to stream results when using xinference as the model provider is a known limitation [1][2][3].
I tested the LLM with xinference, and it can stream results.
Did you create an app of the agent type?
Self Checks
Dify version: 0.6.11
Cloud or Self Hosted: Self Hosted (Source)
Steps to reproduce
✔️ Expected Behavior
❌ Actual Behavior