Closed cyflhn closed 1 day ago
I found some relevant information regarding timeout and retry settings for xinference in Dify. The default timeout is set to 10 seconds, and the retry setting is configured for a maximum of 3 retries. These settings are defined in the XinferenceHelper class within the xinference_helper.py file [1].
Currently, there are no documented configuration options for customizing these settings in Dify's Docker setup or through environment variables. However, you could modify the xinference_helper.py file directly to adjust them if necessary.
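If you do patch the file, one option is to read the values from the environment instead of hardcoding them. The following is only a minimal sketch: the function name and the XINFERENCE_REQUEST_TIMEOUT and XINFERENCE_MAX_RETRIES variables are hypothetical and are not read by Dify today. The defaults mirror the 10-second timeout and 3 retries mentioned above.

```python
import os

# Defaults mirror the values reported above (10 s timeout, 3 retries).
DEFAULT_TIMEOUT_SECONDS = 10.0
DEFAULT_MAX_RETRIES = 3


def load_xinference_request_settings(env=None):
    """Return (timeout_seconds, max_retries).

    The environment variable names below are hypothetical; Dify does not
    define them. Pass a dict as `env` to override os.environ (handy for tests).
    """
    env = os.environ if env is None else env
    timeout = float(env.get("XINFERENCE_REQUEST_TIMEOUT", DEFAULT_TIMEOUT_SECONDS))
    retries = int(env.get("XINFERENCE_MAX_RETRIES", DEFAULT_MAX_RETRIES))
    if timeout <= 0:
        raise ValueError("timeout must be positive")
    if retries < 0:
        raise ValueError("retries must be non-negative")
    return timeout, retries
```

The returned values would then be passed to whatever HTTP call the patched helper makes. Setting the retry count to 0 disables retries entirely, which may be preferable when the backend is already overloaded.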
If you're experiencing issues with long response times leading to timeouts and retries, you might need to explore optimizing the performance of your xinference setup or handling load pressure more effectively. Unfortunately, there isn't specific guidance available in the documentation for these scenarios.
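One common way to keep retries from adding load to an already slow backend is to wait longer between each attempt (exponential backoff). The sketch below is a generic illustration and not Dify's or xinference's actual retry logic; call_with_backoff and its parameters are hypothetical names, and it assumes the wrapped call signals a timeout by raising TimeoutError.

```python
import time


def call_with_backoff(func, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on TimeoutError, wait base_delay * 2**attempt, then retry.

    Hypothetical helper for illustration. `sleep` is injectable so the
    delays can be observed without actually waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TimeoutError:
            # Out of retries: surface the original timeout to the caller.
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)
```

With the defaults, a request that keeps timing out is retried after 1, 2, and 4 seconds rather than immediately, which gives the model server time to drain its queue.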
Self Checks
Dify version
0.10.0
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
✔️ Expected Behavior
A normal response should be returned.
❌ Actual Behavior
If xinference takes a long time to respond, Dify raises a timeout error and retries 3 times by default. This can put even more load pressure on xinference.