langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

After uploading files in the knowledge base, the status is always in queue. #7747

Open PsycheJ opened 2 weeks ago

PsycheJ commented 2 weeks ago

Self Checks

Dify version

0.7.0

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

It was working normally before, but now it suddenly fails to build an index. The Docker logs show no error messages, and all of the corresponding Docker services start normally. (Screenshot attached: QQ截图20240828143607)

✔️ Expected Behavior

How can I diagnose this problem?

❌ Actual Behavior

No response

dosubot[bot] commented 2 weeks ago

I found similar issues and discussions related to files being stuck in the "Queuing" status when uploading to the knowledge base in Dify:

  1. Issue #7675 (closed): The suggested solution was to check whether the rate limit has been reached for the Embedding model API key used (a quick smoke test is sketched after this list).
  2. Issue #5184 (closed): The solution involved checking document limits, subscription constraints, and ensuring the worker service is running correctly.
  3. Issue #696 (closed): The problem was related to insufficient OpenAI API key quotas, causing constant retries and blocking queue consumption [1].
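A quick way to rule out causes 1 and 3 (a rate-limited or exhausted embedding key) is to call the embedding endpoint directly. This is a minimal smoke test, assuming an OpenAI-compatible provider; the URL, model name, and EMBEDDING_API_KEY variable are placeholders for whatever you have configured in Dify:

    curl -sS https://api.openai.com/v1/embeddings \
      -H "Authorization: Bearer $EMBEDDING_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"model": "text-embedding-3-small", "input": "connectivity check"}'

A 429 response points at a rate limit, a 401/403 or quota error points at the key itself, and a normal JSON embedding response means the provider side is healthy.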

For your specific case in a self-hosted Docker environment, you should ensure that the Celery worker service is running correctly. Here are the steps to verify and start the worker service:

  1. Check if the worker service is running: Use the following Docker command to list the status of all services defined in your docker-compose.yaml file:

    docker compose ps

    Look for the worker service in the output. It should show a status of Up if it is running correctly.

  2. Start the worker service if it is not running: If the worker service is not running, you can start it with:

    docker compose up -d worker
  3. Check the logs for the worker service: If the worker service is running but the file upload progress remains at 0, there might be an issue with the Celery worker processing the tasks. Check the logs for the worker service to diagnose any issues:

    docker compose logs worker

    Look for any errors or warnings in the logs that might indicate why the worker is not processing the file upload tasks.

  4. Ensure all necessary services are running: Make sure all other necessary services are running correctly. You can start all services with:

    cd dify/docker
    docker compose up -d

    Then, verify the status of all containers:

    docker compose ps

    The output should show all services, including api, worker, web, weaviate, db, redis, and nginx, with a status of Up. If any service is not running, it could affect the file upload process [2].

By following these steps, you should be able to determine if the worker service is running and diagnose any issues that might be causing the file upload progress to remain at 0.
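If the worker shows Up and its logs look clean, it can also help to look at the Celery backlog directly. This is a hedged sketch, assuming the default docker-compose setup where the Celery broker is Redis database 1 and the Celery application path is app.celery (both may differ in your deployment); add -a <password> to redis-cli if your Redis requires authentication:

    # Length of the 'dataset' queue in the Redis broker; a number that only grows
    # suggests tasks are being enqueued but never consumed.
    docker compose exec redis redis-cli -n 1 LLEN dataset
    # Ask the running worker what it is doing right now (standard Celery inspection).
    docker compose exec worker celery -A app.celery inspect active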

To continue talking to Dosu, mention @dosu.

crazywoola commented 2 weeks ago

Please attach the container logs.

tienldbnk commented 2 weeks ago

Me too.

PsycheJ commented 2 weeks ago

    worker-1 | worker: Warm shutdown (MainProcess)
    worker-1 | Running migrations
    worker-1 | None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
    worker-1 | Preparing database migration...
    worker-1 | Start database migration.
    worker-1 | INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
    worker-1 | INFO [alembic.runtime.migration] Will assume transactional DDL.
    worker-1 | Database migration successful!
    worker-1 | None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
    worker-1 | /app/api/.venv/lib/python3.10/site-packages/celery/platforms.py:829: SecurityWarning: You're running the worker with superuser privileges: this is
    worker-1 | absolutely not recommended!
    worker-1 |
    worker-1 | Please specify a different user using the --uid option.
    worker-1 |
    worker-1 | User information: uid=0 euid=0 gid=0 egid=0
    worker-1 |
    worker-1 | warnings.warn(SecurityWarning(ROOT_DISCOURAGED.format(
    worker-1 |
    worker-1 | -------------- celery@d217d3763a91 v5.3.6 (emerald-rush)
    worker-1 | --- * -----
    worker-1 | -- *** ---- Linux-5.15.0-116-generic-x86_64-with-glibc2.39 2024-08-28 07:49:45
    worker-1 | - --- ---
    worker-1 | - ---------- [config]
    worker-1 | - ---------- .> app: app:0x7fb66f97ab00
    worker-1 | - ---------- .> transport: redis://:@redis:6379/1
    worker-1 | - ---------- .> results: postgresql://postgres:@db:5432/dify
    worker-1 | - --- --- .> concurrency: 1 (gevent)
    worker-1 | -- *** ---- .> task events: OFF (enable -E to monitor tasks in this worker)
    worker-1 | --- * -----
    worker-1 | -------------- [queues]
    worker-1 | .> app_deletion exchange=app_deletion(direct) key=app_deletion
    worker-1 | .> dataset exchange=dataset(direct) key=dataset
    worker-1 | .> generation exchange=generation(direct) key=generation
    worker-1 | .> mail exchange=mail(direct) key=mail
    worker-1 | .> ops_trace exchange=ops_trace(direct) key=ops_trace
    worker-1 |
    worker-1 | [tasks]
    worker-1 | . schedule.clean_embedding_cache_task.clean_embedding_cache_task
    worker-1 | . schedule.clean_unused_datasets_task.clean_unused_datasets_task
    worker-1 | . tasks.add_document_to_index_task.add_document_to_index_task
    worker-1 | . tasks.annotation.add_annotation_to_index_task.add_annotation_to_index_task
    worker-1 | . tasks.annotation.batch_import_annotations_task.batch_import_annotations_task
    worker-1 | . tasks.annotation.delete_annotation_index_task.delete_annotation_index_task
    worker-1 | . tasks.annotation.disable_annotation_reply_task.disable_annotation_reply_task
    worker-1 | . tasks.annotation.enable_annotation_reply_task.enable_annotation_reply_task
    worker-1 | . tasks.annotation.update_annotation_to_index_task.update_annotation_to_index_task
    worker-1 | . tasks.batch_create_segment_to_index_task.batch_create_segment_to_index_task
    worker-1 | . tasks.clean_dataset_task.clean_dataset_task
    worker-1 | . tasks.clean_document_task.clean_document_task
    worker-1 | . tasks.clean_notion_document_task.clean_notion_document_task
    worker-1 | . tasks.deal_dataset_vector_index_task.deal_dataset_vector_index_task
    worker-1 | . tasks.delete_segment_from_index_task.delete_segment_from_index_task
    worker-1 | . tasks.disable_segment_from_index_task.disable_segment_from_index_task
    worker-1 | . tasks.document_indexing_sync_task.document_indexing_sync_task
    worker-1 | . tasks.document_indexing_task.document_indexing_task
    worker-1 | . tasks.document_indexing_update_task.document_indexing_update_task
    worker-1 | . tasks.duplicate_document_indexing_task.duplicate_document_indexing_task
    worker-1 | . tasks.enable_segment_to_index_task.enable_segment_to_index_task
    worker-1 | . tasks.mail_invite_member_task.send_invite_member_mail_task
    worker-1 | . tasks.mail_reset_password_task.send_reset_password_mail_task
    worker-1 | . tasks.ops_trace_task.process_trace_tasks
    worker-1 | . tasks.recover_document_indexing_task.recover_document_indexing_task
    worker-1 | . tasks.remove_app_and_related_data_task.remove_app_and_related_data_task
    worker-1 | . tasks.remove_document_from_index_task.remove_document_from_index_task
    worker-1 | . tasks.retry_document_indexing_task.retry_document_indexing_task
    worker-1 | . tasks.sync_website_document_indexing_task.sync_website_document_indexing_task
    worker-1 |
    worker-1 | [2024-08-28 07:49:45,427: INFO/MainProcess] Connected to redis://:@redis:6379/1
    worker-1 | [2024-08-28 07:49:45,436: INFO/MainProcess] mingle: searching for neighbors
    worker-1 | [2024-08-28 07:49:46,469: INFO/MainProcess] mingle: all alone
    worker-1 | [2024-08-28 07:49:46,518: INFO/MainProcess] celery@d217d3763a91 ready.
    worker-1 | [2024-08-28 07:49:46,523: INFO/MainProcess] pidbox: Connected to redis://:@redis:6379/1.
    worker-1 | [2024-08-28 07:49:46,529: INFO/MainProcess] Task tasks.clean_dataset_task.clean_dataset_task[26c7c8f4-0135-4fa6-8bbf-2691b4f98c40] received
    worker-1 | [2024-08-28 07:49:46,531: INFO/MainProcess] Start clean dataset when dataset deleted: a7e90578-d5dd-4343-bc94-a2d9d3f2ad59
    worker-1 | [2024-08-28 07:49:46,855: INFO/MainProcess] Cleaning documents for dataset: a7e90578-d5dd-4343-bc94-a2d9d3f2ad59
    worker-1 | [2024-08-28 07:49:46,916: INFO/MainProcess] Task tasks.document_indexing_task.document_indexing_task[bda0893b-8c5f-4dba-bce3-71552ce38219] received
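The log ends with the document_indexing_task being received but never reports it finishing or failing. A hedged follow-up (standard Celery task logging, nothing Dify-specific) is to follow the worker log and filter for the task name plus success and error markers; if the output keeps stopping at "received", the task is most likely blocked on an external call such as the embedding API or the vector store:

    docker compose logs -f --tail=200 worker | grep -iE "document_indexing|succeeded|error|traceback"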

PsycheJ commented 2 weeks ago

@crazywoola I looked at the worker service and didn't see anything wrong.

crazywoola commented 2 weeks ago

I think you can restart the containers to see if this persists.
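For a self-hosted Docker deployment, a minimal restart sketch, assuming the standard dify/docker compose setup:

    cd dify/docker
    # Restart only the services involved in indexing first:
    docker compose restart worker api
    # If documents are still stuck in "Queuing", recreate everything:
    docker compose down && docker compose up -d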

AAEE86 commented 2 weeks ago

You should check whether the models you are using are accessible.

Chunk0423 commented 2 weeks ago

I have the same problem. I've checked the logs, and it seems Redis has a problem: Temporary failure in name resolution.
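A hedged check for that error, assuming the standard compose network where service names double as DNS names: confirm that the container logging the error can actually resolve the redis service. The Python one-liner avoids depending on extra tools being installed in the image:

    # From the dify/docker directory: can the worker resolve the 'redis' service name?
    docker compose exec worker python -c "import socket; print(socket.gethostbyname('redis'))"
    # If resolution fails, make sure the redis container is actually running:
    docker compose ps redis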

PsycheJ commented 2 weeks ago

> I have the same problem. I've checked the logs, and it seems Redis has a problem: Temporary failure in name resolution.

Redis? How did you solve it?

Chunk0423 commented 2 weeks ago

> I have the same problem. I've checked the logs, and it seems Redis has a problem: Temporary failure in name resolution.
>
> Redis? How did you solve it?

I didn't solve it. I came here to search for a solution.

xuqingwei001 commented 1 week ago

(Screenshot attached.) Error: Can't locate revision identified by 'd0187d6a88dd'. How can I solve this?
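That looks like an Alembic migration error rather than a queuing problem: the alembic_version table in Postgres records a revision that the code in your running image does not contain (commonly after switching image versions). A hedged way to confirm, assuming the default postgres user and dify database from the docker-compose setup:

    docker compose exec db psql -U postgres -d dify -c "SELECT version_num FROM alembic_version;"

If the stored revision is not shipped by your api/worker image, upgrading the images to a matching version (or restoring a matching database backup) is the usual way out; editing alembic_version by hand is a last resort.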

401557122 commented 2 days ago

> I think you can restart the containers to see if this persists.

When using QA generation, indexing takes a long time and documents easily get stuck in the queue, but I don't want to restart. Restarting does indeed solve the problem. Can this be fixed at the root, so that the service doesn't hang even when a task times out?