Closed wuminghui-coder closed 2 months ago
To enable support for streaming requests and tool calls for LLM models under Xinference in Dify version 0.6.11, you need to set the appropriate flags in the credentials. Specifically, set `support_function_call` to `True` for tool calls, and ensure `stream` is set to `True` for streaming requests.
Here is the relevant code snippet that shows how these flags are used:

```python
support_function_call = credentials.get('support_function_call', False)
if support_function_call:
    features.append(ModelFeature.TOOL_CALL)

# ...

def _generate(self, model: str, credentials: dict, prompt_messages: list[PromptMessage],
              model_parameters: dict, extra_model_kwargs: XinferenceModelExtraParameter,
              tools: list[PromptMessageTool] | None = None,
              stop: list[str] | None = None, stream: bool = True, user: str | None = None) \
        -> LLMResult | Generator:
    # ...
    if tools and len(tools) > 0:
        generate_config['tools'] = [
            {
                'type': 'function',
                'function': helper.dump_model(tool)
            } for tool in tools
        ]
    # ...
    if stream:
        if tools and len(tools) > 0:
            raise InvokeBadRequestError('xinference tool calls does not support stream mode')
        return self._handle_chat_stream_response(model=model, credentials=credentials,
                                                 prompt_messages=prompt_messages,
                                                 tools=tools, resp=resp)
    return self._handle_chat_generate_response(model=model, credentials=credentials,
                                               prompt_messages=prompt_messages,
                                               tools=tools, resp=resp)
```
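For illustration, here is a minimal sketch of the payload shape that the `generate_config['tools']` mapping above produces. The tool definition is hypothetical (the field names follow the common OpenAI-style function schema, not necessarily what `helper.dump_model` emits for Dify's `PromptMessageTool`):

```python
# Hypothetical serialized tool; field names are assumptions based on the
# OpenAI-style function-calling schema, not Dify internals.
tool = {
    'name': 'get_weather',
    'description': 'Get the current weather for a city',
    'parameters': {
        'type': 'object',
        'properties': {'city': {'type': 'string'}},
        'required': ['city'],
    },
}

# Mirrors the list comprehension in the snippet above: each tool is
# wrapped in a {'type': 'function', 'function': ...} envelope.
generate_config = {}
generate_config['tools'] = [{'type': 'function', 'function': tool}]
```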
To enable these features, you should configure your credentials as follows:
```python
credentials = {
    'server_url': 'your_server_url',
    'model_uid': 'your_model_uid',
    'support_function_call': True,  # Enable tool calls
    'context_length': 2048,
    # other necessary credentials
}

# Ensure stream is set to True when invoking the model
result = xinference_model._generate(
    model='your_model',
    credentials=credentials,
    prompt_messages=your_prompt_messages,
    model_parameters=your_model_parameters,
    tools=your_tools,  # List of tools, if any
    stop=your_stop_conditions,
    stream=True,  # Enable streaming
    user='your_user_id'
)
```
This configuration will enable support for both streaming requests and tool calls for LLM models under Xinference in Dify version 0.6.11.
However, please note that function calling in Xinference does not currently support stream mode, as indicated in the integration tests:

```python
"""
Function calling of xinference does not support stream mode currently
"""
# def test_invoke_stream_chat_model_with_functions():
#     model = XinferenceAILargeLanguageModel()
#     ...
```
Therefore, while you can enable streaming or tool calls individually, combining them is not supported at this time [1][2].
To continue talking to Dosu, mention @dosu.
Closing.
It looks like you want to close the issue regarding the integration of LLM request streaming between the Dify and Xinference platforms. If you have any further questions or need additional assistance in the future, feel free to reach out. Thank you!
Self Checks
Dify version
0.6.11
Cloud or Self Hosted
Cloud
Steps to reproduce
Dify is not fully integrated with the Xinference platform. For LLM models under Xinference, Dify does not support streaming requests, and streaming tool calls are not supported either. I added print statements and read through the code myself; this part is not fully adapted.
✔️ Expected Behavior
No response
❌ Actual Behavior
No response