Closed — @401557122 closed this issue 5 months ago
Hello, @401557122! I'm here to help you with any bugs, questions, or contributions while you wait for a human maintainer. Let's tackle this issue together!
To ensure that document processing tasks run without affecting other service calls in Dify version 0.6.8 when using Celery and Docker, you can configure separate Celery workers for different types of tasks. This can be achieved by defining different queues and assigning specific workers to those queues.
Here is an example of how you can modify your `docker-compose.yaml` to achieve this:
```yaml
version: '3'
services:
  # API service
  api:
    image: langgenius/dify-api:0.6.11
    restart: always
    environment:
      MODE: api
      LOG_LEVEL: INFO
      SECRET_KEY: sk-9f73s3ljTXVcMT3Blb3ljTqtsKiGHXVcMT3BlbkFJLK7U
      CONSOLE_WEB_URL: ''
      INIT_PASSWORD: ''
      CONSOLE_API_URL: ''
      SERVICE_API_URL: ''
      APP_WEB_URL: ''
      FILES_URL: ''
      FILES_ACCESS_TIMEOUT: 300
      MIGRATION_ENABLED: 'true'
      DB_USERNAME: postgres
      DB_PASSWORD: difyai123456
      DB_HOST: db
      DB_PORT: 5432
      DB_DATABASE: dify
      REDIS_HOST: redis
      REDIS_PORT: 6379
      REDIS_USERNAME: ''
      REDIS_PASSWORD: difyai123456
      REDIS_USE_SSL: 'false'
      REDIS_DB: 0
      CELERY_BROKER_URL: redis://:difyai123456@redis:6379/1
      WEB_API_CORS_ALLOW_ORIGINS: '*'
      CONSOLE_CORS_ALLOW_ORIGINS: '*'
      STORAGE_TYPE: local
      STORAGE_LOCAL_PATH: storage
      S3_USE_AWS_MANAGED_IAM: 'false'
      S3_ENDPOINT: 'https://xxx.r2.cloudflarestorage.com'
      S3_BUCKET_NAME: 'difyai'
      S3_ACCESS_KEY: 'ak-difyai'
      S3_SECRET_KEY: 'sk-difyai'
      S3_REGION: 'us-east-1'
      AZURE_BLOB_ACCOUNT_NAME: 'difyai'
      AZURE_BLOB_ACCOUNT_KEY: 'difyai'
      AZURE_BLOB_CONTAINER_NAME: 'difyai-container'
      AZURE_BLOB_ACCOUNT_URL: 'https://<your_account_name>.blob.core.windows.net'
      GOOGLE_STORAGE_BUCKET_NAME: 'your-bucket-name'
      GOOGLE_STORAGE_SERVICE_ACCOUNT_JSON_BASE64: 'your-google-service-account-json-base64-string'
      ALIYUN_OSS_BUCKET_NAME: 'your-bucket-name'
      ALIYUN_OSS_ACCESS_KEY: 'your-access-key'
      ALIYUN_OSS_SECRET_KEY: 'your-secret-key'
      ALIYUN_OSS_ENDPOINT: 'https://oss-ap-southeast-1-internal.aliyuncs.com'
      ALIYUN_OSS_REGION: 'ap-southeast-1'
      ALIYUN_OSS_AUTH_VERSION: 'v4'
      TENCENT_COS_BUCKET_NAME: 'your-bucket-name'
      TENCENT_COS_SECRET_KEY: 'your-secret-key'

  # Celery worker for general tasks
  worker_general:
    image: langgenius/dify-api:0.6.11
    restart: always
    environment:
      MODE: worker
      LOG_LEVEL: INFO
      SECRET_KEY: sk-9f73s3ljTXVcMT3Blb3ljTqtsKiGHXVcMT3BlbkFJLK7U
      DB_USERNAME: postgres
      DB_PASSWORD: difyai123456
      DB_HOST: db
      DB_PORT: 5432
      DB_DATABASE: dify
      REDIS_HOST: redis
      REDIS_PORT: 6379
      REDIS_USERNAME: ''
      REDIS_PASSWORD: difyai123456
      REDIS_DB: 0
      CELERY_BROKER_URL: redis://:difyai123456@redis:6379/1
      CELERY_QUEUE: general

  # Celery worker for document processing tasks
  worker_document:
    image: langgenius/dify-api:0.6.11
    restart: always
    environment:
      MODE: worker
      LOG_LEVEL: INFO
      SECRET_KEY: sk-9f73s3ljTXVcMT3Blb3ljTqtsKiGHXVcMT3BlbkFJLK7U
      DB_USERNAME: postgres
      DB_PASSWORD: difyai123456
      DB_HOST: db
      DB_PORT: 5432
      DB_DATABASE: dify
      REDIS_HOST: redis
      REDIS_PORT: 6379
      REDIS_USERNAME: ''
      REDIS_PASSWORD: difyai123456
      REDIS_DB: 0
      CELERY_BROKER_URL: redis://:difyai123456@redis:6379/1
      CELERY_QUEUE: document

  # Other services (db, redis, etc.) remain unchanged
  db:
    image: postgres:15-alpine
    restart: always
    environment:
      PGUSER: postgres
      POSTGRES_PASSWORD: difyai123456
      POSTGRES_DB: dify
      PGDATA: /var/lib/postgresql/data/pgdata
    volumes:
      - ./volumes/db/data:/var/lib/postgresql/data
    healthcheck:
      test: [ "CMD", "pg_isready" ]
      interval: 1s
      timeout: 3s
      retries: 30

  redis:
    image: redis:6-alpine
    restart: always
    volumes:
      - ./volumes/redis/data:/data
    command: redis-server --requirepass difyai123456
    healthcheck:
      test: [ "CMD", "redis-cli", "ping" ]
```
In this configuration, two separate Celery workers are defined: `worker_general` for general tasks and `worker_document` for document processing tasks. Each worker is assigned to a different queue (`general` and `document`, respectively) using the `CELERY_QUEUE` environment variable. This separation ensures that document processing tasks do not interfere with other service calls [1].
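Under the hood this relies on Celery's standard per-queue consumption: a worker started with `-Q <name>` only pulls tasks from that queue. As a rough sketch (assuming the image's entrypoint forwards `CELERY_QUEUE` to the worker's `-Q` flag — worth verifying against your entrypoint script), the two worker containers would effectively run:

```sh
# Hypothetical expansion of the two worker services above, assuming the
# container entrypoint passes CELERY_QUEUE through as celery's -Q option.

# worker_general consumes only the "general" queue
celery -A app.celery worker -Q general --loglevel INFO

# worker_document consumes only the "document" queue
celery -A app.celery worker -Q document --loglevel INFO

# Check which queues each running worker is actually bound to:
celery -A app.celery inspect active_queues
```

If `active_queues` shows both workers consuming the same default queue, the environment variable is not being honored and the queue names need to be passed on the worker command line instead.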
You're awesome, robot bro!
Thanks for the compliment! If you have any other questions or need further help, feel free to let me know. I hope everything goes smoothly!
I configured it according to the method above, but it still doesn't work. I have unstructured set up, and after uploading a file:
1. First, api CPU hits 100% while unstructured and worker_general CPU usage stay low, and the front end is unusable.
2. Then worker_general runs at 100% CPU, worker_api CPU usage is low, and the front end becomes usable.
3. Finally, worker_general sits at about 50% CPU and worker_api at 100%.
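To pin down which container is saturating a core during each phase, one option is to watch per-container usage while reproducing the upload:

```sh
# Stream live per-container CPU/memory usage; the container names here are
# illustrative and depend on your compose project naming.
docker stats api worker_general worker_document
```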
@dosu @dosubot
Self Checks
Dify version
0.6.8
Cloud or Self Hosted
Self Hosted (Source)
Steps to reproduce
Hello, I deployed the middleware with docker-compose.middleware.yaml, run the backend from source with `flask run --host 0.0.0.0 --port=5001` and `celery -A app.celery worker -P gevent -c 1 -Q dataset,generation,mail --loglevel INFO`, and use the web image directly for the front end. I found that when I click the "Save and Process" button in the knowledge base page (i.e. the `documents` or `datasets/init` endpoint), processing keeps running whenever there are many documents or the documents are large; meanwhile, calls to all other agent- and page-related endpoints hang and never return anything. How should this be solved? At first I thought it was because the Celery worker concurrency was set to 1, but the problem still occurs after I changed it to 128.
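One thing worth noting about the command above: gevent provides cooperative concurrency inside a single process, so raising `-c` from 1 to 128 adds greenlets rather than CPU parallelism, and a CPU-bound document-parsing task can still monopolize the lone worker serving all three queues. A minimal sketch of a split (queue names taken from the command above; an illustration, not an official Dify recommendation) runs the dataset queue in its own process:

```sh
# Sketch: dedicate one worker process to CPU-heavy dataset indexing
celery -A app.celery worker -P gevent -c 1 -Q dataset --loglevel INFO

# Separate worker process for the latency-sensitive queues
celery -A app.celery worker -P gevent -c 1 -Q generation,mail --loglevel INFO
```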
✔️ Expected Behavior
Document processing tasks should run without interfering with one another, and without affecting other service calls.
❌ Actual Behavior
No response