langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

the api docker service took a long time to start #9755

Open Gaocarri opened 1 week ago

Gaocarri commented 1 week ago

Dify version

0.10.0, 0.10.1

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

When I upgraded from 0.9.x to 0.10.0 and ran docker compose up, the api service was unavailable (502) for a long time; I found it working again the next day. When I then upgraded from 0.10.0 to 0.10.1, the same thing happened, and the api service is still unavailable now. Here is the docker log:

docker logs dfe44f71c60b

Running migrations
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml
/app/api/.venv/lib/python3.10/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)

How can I solve this problem?

✔️ Expected Behavior

docker service starts normally

❌ Actual Behavior

The api service returns 502 after docker compose up and takes a very long time (an hour or more) to become available.

dosubot[bot] commented 1 week ago

I found some similar issues and discussions that might be relevant to your problem:

  1. Local Docker Setup Failure - This discussion is unsolved and might provide some insights into Docker setup issues [1].

  2. About docker-compose deployment issues - Another unsolved discussion that could be related to your deployment issues [2].

  3. docker version can't start up after update to v1.0 - This closed issue suggests that a "No space left on device" error can cause the API server to shut down. The solution involves cleaning up Docker resources, checking disk usage, and possibly increasing Docker disk space [3].

Regarding your specific issue, the logs indicate that none of PyTorch, TensorFlow, or Flax have been found, which might affect model availability. Additionally, there's a warning about missing ffmpeg, which is required for certain functionalities like OpenAI TTS. You should ensure that these dependencies are installed and properly configured.

To address the migration delay, ensure that the MIGRATION_ENABLED environment variable is set to 'true', as migrations are executed before the application starts. Delays can occur if the migrations are extensive or if there are database connection issues [4].

Make sure to install ffmpeg on your system to resolve the related warning. This can be done using package managers like apt on Ubuntu or brew on macOS.
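Before installing anything, it can help to confirm whether the binary is really missing in the environment where the api process runs. The sketch below is only an approximation of the lookup behind the pydub warning: it searches PATH for `ffmpeg` and `avconv` with the standard library. The helper name `find_audio_encoder` is made up for this example and is not part of pydub or Dify.

```python
import shutil


def find_audio_encoder(candidates=("ffmpeg", "avconv")):
    """Return the full path of the first candidate found on PATH, else None.

    Hypothetical helper for illustration; the candidate names come from
    the warning text in the log above.
    """
    for name in candidates:
        path = shutil.which(name)  # None when the binary is not on PATH
        if path:
            return path
    return None


# Usage (run it inside the same environment as the api process):
#   encoder = find_audio_encoder()
#   print(encoder or "neither ffmpeg nor avconv found on PATH")
```

If this returns None, the warning is expected. Note that the same warning appears in the logs of the MacBook run later in this thread, which started quickly.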

Gaocarri commented 1 week ago

One hour later it started successfully. What happened?

Running migrations
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml
/app/api/.venv/lib/python3.10/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
Preparing database migration...
Starting database migration.
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
Database migration successful!
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml
/app/api/.venv/lib/python3.10/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
[2024-10-24 03:01:46 +0000] [1] [INFO] Starting gunicorn 22.0.0
[2024-10-24 03:01:46 +0000] [1] [INFO] Listening at: http://0.0.0.0:5001 (1)
[2024-10-24 03:01:46 +0000] [1] [INFO] Using worker: gevent
[2024-10-24 03:01:46 +0000] [87] [INFO] Booting worker with pid: 87

wanshichenguang commented 1 week ago

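I run into this problem often too. Every update feels like an ordeal: once it finishes, I can't get in anymore. I really don't dare to update.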

Gaocarri commented 1 week ago

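I run into this problem often too. Every update feels like an ordeal: once it finishes, I can't get in anymore. I really don't dare to update.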

Did you have the problem starting from 0.10.x?

crazywoola commented 1 week ago

We have received several reports of this, with startup times ranging from 3 minutes to 1 hour. We're not sure what's happening yet.
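To make reports like "3 minutes" or "1 hour" comparable, the wait can be measured instead of estimated. The following is a minimal sketch, not part of Dify: `wait_until_ready` and `http_ok` are hypothetical helpers that poll a check until it passes and return the elapsed seconds. The URL in the usage comment assumes the api port 5001 from the gunicorn log is reachable from where the script runs, and the `/health` path is a placeholder to replace with any endpoint known to respond.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(check, timeout=3600.0, interval=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Call check() until it returns True.

    Returns the elapsed seconds, or None if `timeout` seconds pass first.
    `clock` and `sleep` are injectable so the loop can be tested quickly.
    """
    start = clock()
    while True:
        if check():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


def http_ok(url, request_timeout=3.0):
    """True if `url` answers with a 2xx status.

    Connection errors and error statuses such as 502 count as not ready.
    """
    try:
        with urllib.request.urlopen(url, timeout=request_timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        return False


# Usage (URL is an assumption, adjust to your deployment):
#   elapsed = wait_until_ready(lambda: http_ok("http://localhost:5001/health"))
#   print("never became ready" if elapsed is None else f"ready after {elapsed:.0f}s")
```

Starting the script right after docker compose up gives one number per upgrade attempt that can be posted alongside the logs.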

Gaocarri commented 1 week ago

We have received several reports of this, with startup times ranging from 3 minutes to 1 hour. We're not sure what's happening yet.

This problem happens on my server machine, but on my MacBook the service starts successfully within 2 minutes. I noticed that on the MacBook the pydub warning appears after the database migration succeeds, whereas on the server it appears before the migration succeeds.

MacBook:

2024-10-24 11:39:00 Running migrations
2024-10-24 11:39:17 sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
2024-10-24 11:39:17 sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml
2024-10-24 11:39:29 Preparing database migration...
2024-10-24 11:39:29 Starting database migration.
2024-10-24 11:39:29 Database migration successful!
2024-10-24 11:39:42 sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
2024-10-24 11:39:42 sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml
2024-10-24 11:39:48  
2024-10-24 11:39:48  -------------- celery@9546df7e67b5 v5.3.6 (emerald-rush)
2024-10-24 11:39:48 --- ***** ----- 
2024-10-24 11:39:48 -- ******* ---- Linux-6.6.31-linuxkit-aarch64-with-glibc2.40 2024-10-24 03:39:48
2024-10-24 11:39:48 - *** --- * --- 
2024-10-24 11:39:48 - ** ---------- [config]
2024-10-24 11:39:48 - ** ---------- .> app:         app_factory:0xffff24d828f0
2024-10-24 11:39:48 - ** ---------- .> transport:   redis://:**@redis:6379/1
2024-10-24 11:39:48 - ** ---------- .> results:     postgresql://postgres:**@db:5432/dify
2024-10-24 11:39:48 - *** --- * --- .> concurrency: 1 (gevent)
2024-10-24 11:39:48 -- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
2024-10-24 11:39:48 --- ***** ----- 
2024-10-24 11:39:48  -------------- [queues]
2024-10-24 11:39:48                 .> app_deletion     exchange=app_deletion(direct) key=app_deletion
2024-10-24 11:39:48                 .> dataset          exchange=dataset(direct) key=dataset
2024-10-24 11:39:48                 .> generation       exchange=generation(direct) key=generation
2024-10-24 11:39:48                 .> mail             exchange=mail(direct) key=mail
2024-10-24 11:39:48                 .> ops_trace        exchange=ops_trace(direct) key=ops_trace
2024-10-24 11:39:48 
2024-10-24 11:39:48 [tasks]
2024-10-24 11:39:48   . schedule.clean_embedding_cache_task.clean_embedding_cache_task
2024-10-24 11:39:48   . schedule.clean_unused_datasets_task.clean_unused_datasets_task
2024-10-24 11:39:48   . tasks.add_document_to_index_task.add_document_to_index_task
2024-10-24 11:39:48   . tasks.annotation.add_annotation_to_index_task.add_annotation_to_index_task
2024-10-24 11:39:48   . tasks.annotation.batch_import_annotations_task.batch_import_annotations_task
2024-10-24 11:39:48   . tasks.annotation.delete_annotation_index_task.delete_annotation_index_task
2024-10-24 11:39:48   . tasks.annotation.disable_annotation_reply_task.disable_annotation_reply_task
2024-10-24 11:39:48   . tasks.annotation.enable_annotation_reply_task.enable_annotation_reply_task
2024-10-24 11:39:48   . tasks.annotation.update_annotation_to_index_task.update_annotation_to_index_task
2024-10-24 11:39:48   . tasks.batch_create_segment_to_index_task.batch_create_segment_to_index_task
2024-10-24 11:39:48   . tasks.clean_dataset_task.clean_dataset_task
2024-10-24 11:39:48   . tasks.clean_document_task.clean_document_task
2024-10-24 11:39:04 None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
2024-10-24 11:39:23 /app/api/.venv/lib/python3.10/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
2024-10-24 11:39:23   warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
2024-10-24 11:39:29 INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
2024-10-24 11:39:29 INFO  [alembic.runtime.migration] Will assume transactional DDL.
2024-10-24 11:39:33 None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
2024-10-24 11:39:44 /app/api/.venv/lib/python3.10/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
2024-10-24 11:39:44   warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
2024-10-24 11:39:48 /app/api/.venv/lib/python3.10/site-packages/celery/platforms.py:829: SecurityWarning: You're running the worker with superuser privileges: this is
2024-10-24 11:39:48 absolutely not recommended!

I can't tell whether this is related to the issue; I'm just reporting what I observed.
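The timestamped log above is not in chronological order (for example, the 11:39:04 line comes after the 11:39:48 lines), so line order alone says little about which step is slow. A rough way to compare the two machines is to sort the lines by timestamp and look at the longest pauses between consecutive lines. The sketch below assumes every relevant line starts with a `YYYY-MM-DD HH:MM:SS` prefix as in the MacBook log; `parse_lines` and `largest_gaps` are names invented for this example. The server log earlier in the thread has no such prefix on most lines, so it would need to be captured again with `docker logs --timestamps`, whose timestamp format differs and would require adjusting the pattern.

```python
import re
from datetime import datetime

# Matches the "2024-10-24 11:39:00 message" layout of the MacBook log.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s?(.*)$")


def parse_lines(text):
    """Return (timestamp, message) pairs for lines that start with a timestamp."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.append((stamp, match.group(2).strip()))
    return events


def largest_gaps(text, top=3):
    """Sort events by time and return the `top` longest pauses.

    Each result is (seconds, message_before, message_after), longest first.
    """
    events = sorted(parse_lines(text), key=lambda event: event[0])
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1], later[1])
        for earlier, later in zip(events, events[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]


# Usage: paste the log into a file and print the slowest steps.
#   with open("api.log", encoding="utf-8") as handle:
#       for seconds, before, after in largest_gaps(handle.read()):
#           print(f"{seconds:>6.0f}s between {before[:50]!r} and {after[:50]!r}")
```

Running the same analysis on a timestamped server log would show whether the hour is spent before the migration starts, during it, or while gunicorn boots.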

wanshichenguang commented 1 week ago

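I run into this problem often too. Every update feels like an ordeal: once it finishes, I can't get in anymore. I really don't dare to update.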

Did you have the problem starting from 0.10.x?

I encountered this in previous versions as well.

Gaocarri commented 3 days ago

Hi @laipz8200, I tried 0.10.2 and the same thing happens. Could you possibly look into this issue?

Hanfee commented 2 days ago

I am also finding that after updating to 0.10.2, restarting the container gets blocked and the restart fails!

Gaocarri commented 2 days ago

I am also finding that after updating to 0.10.2, restarting the container gets blocked and the restart fails!

I can confirm that my problem started with the 0.9.2 -> 0.10.0 upgrade; 0.9.2 was fine. Did 0.10.0 and 0.10.1 work normally for you?

cjhgit commented 1 day ago

I have the same problem. How can it be resolved?

Gaocarri commented 1 day ago

I have the same problem. How can it be resolved?

@cjhgit All I could do was wait an hour...