langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Performance Degradation after Load Testing on Dify - Low Throughput and Concurrency #7978

Open yancyuu opened 1 week ago

yancyuu commented 1 week ago


Dify version

0.6.12

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

After conducting load testing on Dify, we noticed significant performance degradation. Below are the details:

•   System Specs:
    •   8 cores, 16 GB RAM
    •   2 pods deployed
•   Load Testing Results:
    •   Without model integration:
        •   Concurrency: low
        •   Throughput: peaks at only 11 requests per second
    •   With model integration:
        •   Concurrency: even lower
        •   Throughput: only 6 requests per second

This performance seems unexpectedly low for the given system specifications. Is there an issue with the configuration, or are there optimizations that could improve throughput and concurrency?

Any guidance on this would be appreciated. Thank you!
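For reproducibility, the load pattern described above might be driven with an HTTP load tool. The sketch below uses `hey` and Dify's `/v1/chat-messages` endpoint; the tool choice, endpoint, and payload are assumptions on my part, not details from this report:

```shell
# Hypothetical load-test driver (tool, endpoint, and payload are assumptions,
# not taken from the original report).
# 200 concurrent connections for 60 seconds against the chat API:
hey -z 60s -c 200 -m POST \
  -T "application/json" \
  -H "Authorization: Bearer ${DIFY_API_KEY}" \
  -d '{"inputs": {}, "query": "hello", "response_mode": "blocking", "user": "load-test"}' \
  http://localhost/v1/chat-messages
```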


✔️ Expected Behavior

Concurrency should reach 200.

❌ Actual Behavior

Throughput only reaches 6 requests per second.

dosubot[bot] commented 1 week ago

I found a similar open issue and discussion that might be relevant to your problem:

  1. Open Issue: Uploading PDFs to the knowledge base can disrupt other users' regular usage. The suggested solution involves tweaking several configuration settings to optimize resource usage and improve server performance, such as increasing the number of server workers, adjusting timeouts, and optimizing the database connection pool [1].

  2. Discussion: How to scale up the dify deployment. This discussion includes various suggestions for scaling up Dify deployment, which might help in improving throughput and concurrency [2].

To optimize the throughput and concurrency for Dify after experiencing significant performance degradation during load testing on a self-hosted Docker setup, you can follow these steps:

  1. Increase Resource Allocation: Ensure that your Docker setup has sufficient resources. The minimum system requirements are:

    • CPU: 2 Cores
    • RAM: 4GB

    However, for better performance, you might need to allocate more resources. For example, set the Docker virtual machine (VM) to use more virtual CPUs (vCPUs) and increase the memory allocation.
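If the containers were started with CPU or memory caps, those limits can be raised in place. A sketch, assuming the default compose project naming (the container names are not taken from this thread):

```shell
# Raise per-container resource caps in place. Container names assume the
# default compose project directory "docker"; adjust to your deployment.
# Note: raising --memory may also require adjusting --memory-swap.
docker update --cpus=4 --memory=8g --memory-swap=-1 docker-api-1
docker update --cpus=2 --memory=4g --memory-swap=-1 docker-worker-1
```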

  2. Update Docker Compose Configuration: Make sure you are using the latest version of Dify and Docker Compose. Dify 0.6.12 introduced significant enhancements to Docker Compose deployment.

  3. Optimize Docker Compose File: Adjust the Docker Compose file to better handle load. You can increase the number of replicas for the services that are bottlenecks. For example, you can scale the api and worker services:

    version: '3.8'
    services:
      api:
        image: langgenius/dify-api:0.6.13
        deploy:
          replicas: 3
        ports:
          - "5001:5001"
        ...

      worker:
        image: langgenius/dify-api:0.6.13
        deploy:
          replicas: 3
        ...
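Note that plain `docker compose` may ignore the `deploy.replicas` key outside Swarm mode. As an alternative sketch, the same services can be scaled from the CLI (the `api` and `worker` service names match the default docker-compose.yaml):

```shell
# Scale the api and worker services without editing the compose file.
# A fixed host-port mapping such as "5001:5001" can only be bound by one
# replica, so the published port may need to be dropped and traffic routed
# through a reverse proxy such as Nginx instead.
docker compose up -d --scale api=3 --scale worker=3
```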
  4. Load Balancing: Use a load balancer to distribute the incoming requests evenly across the replicas. You can configure Nginx as a load balancer in your Docker Compose setup:

    nginx:
     image: nginx:latest
     restart: always
     volumes:
       - ./nginx/nginx.conf.template:/etc/nginx/nginx.conf.template
       - ./nginx/proxy.conf.template:/etc/nginx/proxy.conf.template
       - ./nginx/https.conf.template:/etc/nginx/https.conf.template
       - ./nginx/conf.d:/etc/nginx/conf.d
       - ./nginx/docker-entrypoint.sh:/docker-entrypoint-mount.sh
       - ./nginx/ssl:/etc/ssl
       - ./volumes/certbot/conf/live:/etc/letsencrypt/live
       - ./volumes/certbot/conf:/etc/letsencrypt
       - ./volumes/certbot/www:/var/www/html
     entrypoint: [ "sh", "-c", "cp /docker-entrypoint-mount.sh /docker-entrypoint.sh && sed -i 's/\r$$//' /docker-entrypoint.sh && chmod +x /docker-entrypoint.sh && /docker-entrypoint.sh" ]
     environment:
       NGINX_SERVER_NAME: ${NGINX_SERVER_NAME:-_}
       NGINX_HTTPS_ENABLED: ${NGINX_HTTPS_ENABLED:-false}
       NGINX_SSL_PORT: ${NGINX_SSL_PORT:-443}
       NGINX_PORT: ${NGINX_PORT:-80}
       NGINX_SSL_CERT_FILENAME: ${NGINX_SSL_CERT_FILENAME:-dify.crt}
       NGINX_SSL_CERT_KEY_FILENAME: ${NGINX_SSL_CERT_KEY_FILENAME:-dify.key}
       NGINX_SSL_PROTOCOLS: ${NGINX_SSL_PROTOCOLS:-TLSv1.1 TLSv1.2 TLSv1.3}
       NGINX_WORKER_PROCESSES: ${NGINX_WORKER_PROCESSES:-auto}
       NGINX_CLIENT_MAX_BODY_SIZE: ${NGINX_CLIENT_MAX_BODY_SIZE:-15M}
       NGINX_KEEPALIVE_TIMEOUT: ${NGINX_KEEPALIVE_TIMEOUT:-65}
       NGINX_PROXY_READ_TIMEOUT: ${NGINX_PROXY_READ_TIMEOUT:-3600s}
       NGINX_PROXY_SEND_TIMEOUT: ${NGINX_PROXY_SEND_TIMEOUT:-3600s}
       NGINX_ENABLE_CERTBOT_CHALLENGE: ${NGINX_ENABLE_CERTBOT_CHALLENGE:-false}
       CERTBOT_DOMAIN: ${CERTBOT_DOMAIN:-}
     depends_on:
       - api
       - web
     ports:
       - "${EXPOSE_NGINX_PORT:-80}:${NGINX_PORT:-80}"
       - "${EXPOSE_NGINX_SSL_PORT:-443}:${NGINX_SSL_PORT:-443}"
  5. Database Optimization: Ensure that your database (Postgres) is optimized for performance. This might include tuning the database configuration parameters and ensuring that the database has enough resources.
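As an illustrative starting point only (the values are sized for the 8-core/16 GB machine in this report, and the `db` service name and `postgres` user are compose defaults; none of this comes from the thread):

```shell
# Sketch: tune Postgres via ALTER SYSTEM. shared_buffers only takes
# effect after a restart, so restart the db service at the end.
docker compose exec db psql -U postgres -c "ALTER SYSTEM SET max_connections = 200;"
docker compose exec db psql -U postgres -c "ALTER SYSTEM SET shared_buffers = '4GB';"
docker compose exec db psql -U postgres -c "ALTER SYSTEM SET effective_cache_size = '12GB';"
docker compose restart db
```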

  6. Monitoring and Logging: Implement monitoring and logging to identify performance bottlenecks. Tools like Prometheus and Grafana can help you monitor the performance of your Docker containers.
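Before standing up Prometheus and Grafana, a one-off snapshot of per-container usage can already point at the bottleneck; a minimal sketch:

```shell
# Snapshot CPU and memory per container (no extra tooling required):
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"
```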

  7. Environment Variables: Ensure that the environment variables in your .env file are correctly configured and optimized for your setup. Sync the environment variable configuration as described in the documentation:

    cd dify/docker
    docker compose down
    git pull origin main
    docker compose pull
    docker compose up -d

    Make sure to update your local .env file with any new variables from .env.example. Key environment variables to consider include:

    • SERVER_WORKER_AMOUNT: Number of server workers, calculated as cpu cores x 2 + 1.
    • SERVER_WORKER_CLASS: Default is gevent, but can be changed to sync or solo for Windows.
    • GUNICORN_TIMEOUT: Request processing timeout, default is 200, recommended 360 to support longer SSE connections.
    • CELERY_WORKER_AMOUNT: Number of Celery workers, default is 1, adjust as needed.
    • SQLALCHEMY_POOL_SIZE: Database connection pool size, default is 30, can be increased.
    • SQLALCHEMY_POOL_RECYCLE: Database connection pool recycle time, default is 3600 seconds.
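To make the SERVER_WORKER_AMOUNT formula concrete, a small sketch for the 8-core machine described in this issue:

```shell
# cpu cores x 2 + 1, for the 8-core host from the report:
CORES=8
WORKERS=$((CORES * 2 + 1))
echo "SERVER_WORKER_AMOUNT=${WORKERS}"   # prints SERVER_WORKER_AMOUNT=17
```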

By following these steps and properly configuring the environment variables, you should be able to optimize the throughput and concurrency for Dify in your self-hosted Docker setup [3][4][5].


Seayon commented 1 week ago

Have you configured the SERVER_WORKER_AMOUNT parameter in your environment setting? @yancyuu