puckel / docker-airflow

Docker Apache Airflow
Apache License 2.0

Some workers seem to have died and gunicorn did not restart them as expected #631

Open polosatyi opened 3 years ago

polosatyi commented 3 years ago

Hey guys!

I'm following the official documentation (https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html) and trying to launch Airflow locally with the latest image, 2.0.1, but the webserver fails to start.

airflow-webserver_1  | BACKEND=postgresql+psycopg2
airflow-webserver_1  | DB_HOST=postgres
airflow-webserver_1  | DB_PORT=5432
airflow-webserver_1  |
airflow-webserver_1  | [2021-03-03 08:22:02,781] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
airflow-webserver_1  | [2021-03-03 08:22:02,860] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
airflow-webserver_1  |   ____________       _____________
airflow-webserver_1  |  ____    |__( )_________  __/__  /________      __
airflow-webserver_1  | ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
airflow-webserver_1  | ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
airflow-webserver_1  |  _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
airflow-webserver_1  | [2021-03-03 08:22:02,963] {dagbag.py:448} INFO - Filling up the DagBag from /dev/null
airflow-webserver_1  | [2021-03-03 08:22:03,869] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
airflow-webserver_1  | [2021-03-03 08:22:03,931] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
airflow-webserver_1  | [2021-03-03 08:22:06 +0000] [25] [INFO] Starting gunicorn 19.10.0
airflow-webserver_1  | [2021-03-03 08:22:06 +0000] [25] [INFO] Listening at: http://0.0.0.0:8080 (25)
airflow-webserver_1  | [2021-03-03 08:22:06 +0000] [25] [INFO] Using worker: sync
airflow-webserver_1  | [2021-03-03 08:22:06 +0000] [36] [INFO] Booting worker with pid: 36
airflow-webserver_1  | [2021-03-03 08:22:06 +0000] [37] [INFO] Booting worker with pid: 37
airflow-webserver_1  | [2021-03-03 08:22:06 +0000] [38] [INFO] Booting worker with pid: 38
airflow-webserver_1  | [2021-03-03 08:22:06 +0000] [39] [INFO] Booting worker with pid: 39
airflow-webserver_1  | Running the Gunicorn Server with:
airflow-webserver_1  | Workers: 4 sync
airflow-webserver_1  | Host: 0.0.0.0:8080
airflow-webserver_1  | Timeout: 120
airflow-webserver_1  | Logfiles: - -
airflow-webserver_1  | Access Logformat:
airflow-webserver_1  | =================================================================
airflow-webserver_1  | [2021-03-03 08:22:13,289] {webserver_command.py:255} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected
[2021-03-03 08:35:19,848] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
[2021-03-03 08:35:20,081] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
BACKEND=postgresql+psycopg2
DB_HOST=postgres
DB_PORT=5432

[2021-03-03 08:35:24,956] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
[2021-03-03 08:35:25,024] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
[2021-03-03 08:35:25,075] {dagbag.py:448} INFO - Filling up the DagBag from /dev/null
[2021-03-03 08:35:25,872] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
[2021-03-03 08:35:25,937] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
[2021-03-03 08:35:28 +0000] [23] [INFO] Starting gunicorn 19.10.0
[2021-03-03 08:35:28 +0000] [23] [INFO] Listening at: http://0.0.0.0:8080 (23)
[2021-03-03 08:35:28 +0000] [23] [INFO] Using worker: sync
[2021-03-03 08:35:28 +0000] [27] [INFO] Booting worker with pid: 27
[2021-03-03 08:35:28 +0000] [28] [INFO] Booting worker with pid: 28
[2021-03-03 08:35:28 +0000] [29] [INFO] Booting worker with pid: 29
[2021-03-03 08:35:28 +0000] [30] [INFO] Booting worker with pid: 30
[2021-03-03 08:35:31,512] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
[2021-03-03 08:35:31,512] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
[2021-03-03 08:35:31,511] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
[2021-03-03 08:35:31,515] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
BACKEND=postgresql+psycopg2
DB_HOST=postgres
DB_PORT=5432

[2021-03-03 08:35:36,315] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
[2021-03-03 08:35:36,384] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
[2021-03-03 08:35:36,433] {dagbag.py:448} INFO - Filling up the DagBag from /dev/null
[2021-03-03 08:35:37,292] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
[2021-03-03 08:35:37,360] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.microsoft.azure.hooks.wasb.WasbHook' from 'apache-airflow-providers-microsoft-azure' package: No module named 'azure.storage.blob'
Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/__main__.py", line 40, in main
    args.func(args)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/cli/cli_parser.py", line 48, in command
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/cli.py", line 89, in wrapper
    return f(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/cli/commands/webserver_command.py", line 368, in webserver
    check_if_pidfile_process_is_running(pid_file=pid_file, process_name="webserver")
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/process_utils.py", line 266, in check_if_pidfile_process_is_running
    raise AirflowException(f"The {process_name} is already running under PID {pid}.")
airflow.exceptions.AirflowException: The webserver is already running under PID 23.
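
The last traceback points at a leftover PID file: the gunicorn master above started as PID 23, and the later start refuses to run because a PID file still names that PID. Below is a minimal Python sketch of that kind of check. It is not Airflow's implementation; the function names `pid_is_running` and `remove_stale_pidfile` are hypothetical, and it assumes a POSIX system.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_stale_pidfile(path):
    """Delete the PID file if it does not name a live process.

    Returns True if the file was removed, False otherwise.
    """
    try:
        with open(path) as handle:
            pid = int(handle.read().strip())
    except FileNotFoundError:
        return False  # nothing to clean up
    except ValueError:
        os.remove(path)  # unreadable content, treat the file as stale
        return True

    if pid_is_running(pid):
        # Caveat: inside a restarted container PIDs get reused, so an
        # unrelated process can make a stale file look live.
        return False

    os.remove(path)
    return True
```

Because of the PID-reuse caveat in the comment, a liveness check alone may not be enough inside a restarting container; deleting the leftover PID file before the webserver starts is the more direct workaround.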

spwats commented 3 years ago

Same issue here. Were you able to resolve this?

spwats commented 3 years ago

FWIW, I was able to work around this by removing the worker, the Redis instance, and Flower, since I only need the LocalExecutor. My docker-compose.yaml now looks like this:

---
version: '3'
x-airflow-common:
  &airflow-common
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.0.1}
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: LocalExecutor
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'true'
  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
  user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-50000}"
  depends_on:
    postgres:
      condition: service_healthy

services:
  postgres:
    image: postgres:13
    environment:
      POSTGRES_USER: airflow
      POSTGRES_PASSWORD: airflow
      POSTGRES_DB: airflow
    volumes:
      - postgres-db-volume:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "airflow"]
      interval: 5s
      retries: 5
    restart: always

  airflow-webserver:
    <<: *airflow-common
    command: webserver
    ports:
      - 8080:8080
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always

  airflow-scheduler:
    <<: *airflow-common
    command: scheduler
    restart: always

  airflow-init:
    <<: *airflow-common
    command: version
    environment:
      <<: *airflow-common-env
      _AIRFLOW_DB_UPGRADE: 'true'
      _AIRFLOW_WWW_USER_CREATE: 'true'
      _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
      _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}

volumes:
  postgres-db-volume:

andrkhar commented 3 years ago

I had the same issue. I increased the resource allocation in Docker's preferences (CPU 2 -> 4, memory 2GB -> 6GB, swap 1GB -> 2GB), and the issue went away.

Could it be that there was not enough memory?
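
One way to test the memory theory is to look at how much memory the container actually sees. Here is a minimal Python sketch that reads the Linux `/proc/meminfo` format. The function names are hypothetical, and the 4 GiB default threshold is an assumption used for illustration, not a value taken from this thread.

```python
import re


def mem_total_gib(meminfo_text):
    """Return the MemTotal value from /proc/meminfo-style text, in GiB."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1)) / (1024 * 1024)  # kB (KiB) -> GiB


def has_enough_memory(meminfo_text, minimum_gib=4.0):
    """Return True if MemTotal is at least minimum_gib (assumed threshold)."""
    return mem_total_gib(meminfo_text) >= minimum_gib


def read_meminfo(path="/proc/meminfo"):
    """Read the raw meminfo text; only works on Linux."""
    with open(path) as handle:
        return handle.read()
```

Running `has_enough_memory(read_meminfo())` inside the webserver container (for example via `docker-compose exec`) gives a quick yes/no. With the 2GB allocation mentioned above it would return False, and with 6GB it would return True.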