langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Dify Studio becomes very slow after uploading and embedding a large number of files #6998

Closed · ddnoodle closed this issue 1 week ago

ddnoodle commented 1 month ago

Self Checks

Dify version

0.6.14

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

I have run into this issue twice. Create a knowledge base and upload a large number of files at once, say 500 files. Choose the "High Quality" indexing mode so an embedding model is used. Once the upload and embedding reach 100%, Dify becomes very slow, responding sluggishly when opening an app to edit it or when running an app. The only way to recover is to restart the Dify Docker containers.

✔️ Expected Behavior

Dify responds normally after uploading and embedding a large number of files.

❌ Actual Behavior

Dify responds very slowly after uploading and embedding a large number of files.

dosubot[bot] commented 1 month ago

To resolve performance issues in Dify Studio after uploading and embedding a large number of files, you can try the following steps:

  1. Limit the max_tokens Setting: Reducing the max_tokens setting can help lower the computational cost and time. For example, setting max_tokens=500 limits the output to the first 500 tokens, which can improve efficiency (see the first sketch after this list).

  2. Check for Rate Limits: Ensure that your embedding model API key has not reached its rate limit, as this can cause delays and performance issues.

  3. Optimize Document Segmentation: Properly segment long texts to improve processing efficiency. For example, segment texts by paragraphs or sentences depending on the task (see the second sketch after this list).

  4. Adjust Document Upload Limits: If you are using a locally deployed version, you can modify the document upload size and quantity limits to better suit your needs.

  5. Use Appropriate Models: Choose models that are optimized for handling long texts and complex tasks. For example, gpt-3.5-turbo and gpt-4 are recommended for their capabilities in understanding long text and cross-document reasoning.

  6. Monitor and Manage API Usage: Regularly check your API usage and ensure you have not exceeded your quota. If you encounter errors such as "Rate limit reached" or "Validation failed: You exceeded your current quota," you may need to upgrade your plan or manage your usage more effectively [1][2].
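As a hedged illustration of step 1: a minimal sketch of capping max_tokens, assuming an OpenAI-compatible chat endpoint reached through the openai Python SDK (the model name and prompt are placeholders, not taken from this thread):

```python
from openai import OpenAI  # openai SDK v1+

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Capping max_tokens bounds the generation length, so each call
# costs less compute and returns sooner.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder model name
    messages=[{"role": "user", "content": "Summarize the uploaded document."}],
    max_tokens=500,  # limit the output to 500 tokens
)
print(response.choices[0].message.content)
```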
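And for step 3, a minimal sketch of paragraph-based segmentation with a size cap, in plain Python; the chunk size and the split rule are illustrative assumptions, not Dify's actual splitter:

```python
def segment_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks by paragraph, merging paragraphs
    until a chunk would exceed max_chars."""
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # Start a new chunk if adding this paragraph would overflow.
        if current and len(current) + len(paragraph) + 2 > max_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = f"{current}\n\n{paragraph}" if current else paragraph
    if current:
        chunks.append(current)
    return chunks

# Example: smaller, paragraph-aligned chunks embed faster per request
# and keep retrieval granular.
print(len(segment_text("First paragraph.\n\nSecond paragraph.\n\n" * 100)))
```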

Additionally, you might want to consider increasing the hardware specifications of your self-hosted instance. Dify's documented minimums are CPU >= 2 cores and RAM >= 4 GiB.

For handling a large number of files, you may need higher specifications depending on the specific workload and performance requirements. It is advisable to monitor the system's performance and scale up resources as needed [3][4].

To continue talking to Dosu, mention @dosu.

ddnoodle commented 1 month ago

This reply does not answer my question.

crazywoola commented 1 month ago

Can you provide the container logs of the api service? I'm not sure whether this is related to the frontend or the backend.

ddnoodle commented 1 month ago

OK, I will provide them once the issue is reproduced.

ddnoodle commented 2 weeks ago

Can you provide the container logs of the api service? I'm not sure whether this is related to the frontend or the backend.

I reproduced the issue and found one abnormality: gunicorn is overloaded. Once the system is stuck, the gunicorn CPU usage stays at 100% forever. The only way to recover is to restart the containers. Please see the screenshot below:

[screenshot: gunicorn worker stuck at 100% CPU]

As for the container logs, I don't see any obvious errors.
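For anyone debugging a similar hang: a hedged sketch of how one might locate the spinning gunicorn worker inside the api container using psutil, so its PID can be handed to a profiler such as py-spy (`py-spy dump --pid <PID>`) to see where it is looping. Having psutil available in the container is an assumption:

```python
import psutil

# List gunicorn processes and sample their CPU usage; the worker
# pinned near 100% is the one worth profiling.
for proc in psutil.process_iter(["pid", "cmdline"]):
    cmdline = " ".join(proc.info["cmdline"] or [])
    if "gunicorn" in cmdline:
        cpu = proc.cpu_percent(interval=1.0)  # sample over one second
        print(f"pid={proc.info['pid']} cpu={cpu:.0f}% cmd={cmdline[:60]}")
```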

crazywoola commented 1 week ago

See https://github.com/langgenius/dify/issues/7677#issue-2487894362