apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0
37.12k stars 14.3k forks source link

Airflow webserver not starting when provided external DB #27654

Closed IDeepakI closed 1 year ago

IDeepakI commented 2 years ago

Official Helm Chart version

1.7.0 (latest released)

Apache Airflow version

2.3.1

Kubernetes Version

1.23.10

Helm Chart configuration

webserverSecretKey: 'ad2126ad-95c5-4050-b785-218f84b4bafa'
defaultAirflowRepository: 'airflow'
defaultAirflowTag: 'test'
images:
  airflow:
   pullPolicy: Never 
webserver:
    service:
      type: NodePort

    resources:
      limits:
        cpu: 500m
        memory: 1028Mi
config:
  core:
    enable_xcom_pickling: True
pgbouncer:
  # The maximum number of connections to PgBouncer
  maxClientConn: 100
  # The maximum number of server connections to the metadata database from PgBouncer
  metadataPoolSize: 10
  # The maximum number of server connections to the result backend database from PgBouncer
  resultBackendPoolSize: 5
  enabled: true

redis:
  enabled: false
data:
  brokerUrl: redis://redis.test.org:6379/0

postgresql:
  enabled: False

enableBuiltInSecretEnvVars:
  AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: False
  AIRFLOW__CELERY__RESULT_BACKEND: False
secret:
  - envName: "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN"
    secretName: "airflow-metadata"
    secretKey: "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN"
  - envName: "AIRFLOW__CELERY__RESULT_BACKEND"
    secretName: "airflow-metadata"
    secretKey: "AIRFLOW__CELERY__RESULT_BACKEND"

extraEnvFrom: |-
  - configMapRef:
      name: 'airflow-variables

Docker Image customisations

No response

What happened

I was running simple airflow on kubernetes using helm with a configuration specified. But the webserver not getting initialized.

here are the kubernetes pod

NAME                                   READY   STATUS      RESTARTS        AGE
airflow-pgbouncer-db449d6b-98ldf       2/2     Running     0               49m
airflow-run-airflow-migrations-6bdwg   0/1     Completed   0               49m
airflow-scheduler-9f78574d6-cq8dc      2/2     Running     1 (9m42s ago)   49m
airflow-statsd-7c7584d6f8-6ldmr        1/1     Running     0               49m
airflow-triggerer-6b4c678799-m6pg5     1/1     Running     4 (4m32s ago)   49m
airflow-webserver-5575454d74-tc9zm     0/1     Running     5 (94s ago)     49m
airflow-worker-0                       2/2     Running     0               49m

here is webserver logs

[2022-11-14 07:30:49,068] {manager.py:543} INFO - Removed Permission View: menu_access on Permissions
[2022-11-14 07:30:49,376] {manager.py:543} INFO - Removed Permission View: menu_access on Permissions
[2022-11-14 07:30:49,377] {manager.py:543} INFO - Removed Permission View: menu_access on Permissions
[2022-11-14 07:30:49,377] {manager.py:543} INFO - Removed Permission View: menu_access on Permissions
[2022-11-14 07:31:01,429] {manager.py:508} INFO - Created Permission View: menu access on Permissions
[2022-11-14 07:31:01,698] {manager.py:511} ERROR - Creation of Permission View Error: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "ab_permission_view_permission_id_view_menu_id_key"
DETAIL:  Key (permission_id, view_menu_id)=(5, 15) already exists.

[SQL: INSERT INTO ab_permission_view (id, permission_id, view_menu_id) VALUES (nextval('ab_permission_view_id_seq'), %(permission_id)s, %(view_menu_id)s) RETURNING ab_permission_view.id]
[parameters: {'permission_id': 5, 'view_menu_id': 15}]
(Background on this error at: http://sqlalche.me/e/14/gkpj)
[2022-11-14 07:31:01,698] {manager.py:511} ERROR - Creation of Permission View Error: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "ab_permission_view_permission_id_view_menu_id_key"
DETAIL:  Key (permission_id, view_menu_id)=(5, 15) already exists.

[SQL: INSERT INTO ab_permission_view (id, permission_id, view_menu_id) VALUES (nextval('ab_permission_view_id_seq'), %(permission_id)s, %(view_menu_id)s) RETURNING ab_permission_view.id]
[parameters: {'permission_id': 5, 'view_menu_id': 15}]
(Background on this error at: http://sqlalche.me/e/14/gkpj)
[2022-11-14 07:31:01,698] {manager.py:511} ERROR - Creation of Permission View Error: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "ab_permission_view_permission_id_view_menu_id_key"
DETAIL:  Key (permission_id, view_menu_id)=(5, 15) already exists.

[SQL: INSERT INTO ab_permission_view (id, permission_id, view_menu_id) VALUES (nextval('ab_permission_view_id_seq'), %(permission_id)s, %(view_menu_id)s) RETURNING ab_permission_view.id]
[parameters: {'permission_id': 5, 'view_menu_id': 15}]
(Background on this error at: http://sqlalche.me/e/14/gkpj)
[2022-11-14 07:31:02,891] {manager.py:568} INFO - Added Permission menu access on Permissions to role Admin
[2022-11-14 07:31:16,535] {manager.py:570} ERROR - Add Permission to Role Error: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "ab_permission_view_role_permission_view_id_role_id_key"
DETAIL:  Key (permission_view_id, role_id)=(207, 1) already exists.

[SQL: INSERT INTO ab_permission_view_role (id, permission_view_id, role_id) VALUES (nextval('ab_permission_view_role_id_seq'), %(permission_view_id)s, %(role_id)s) RETURNING ab_permission_view_role.id]
[parameters: {'permission_view_id': 207, 'role_id': 1}]
(Background on this error at: http://sqlalche.me/e/14/gkpj)
[2022-11-14 07:31:16,535] {manager.py:570} ERROR - Add Permission to Role Error: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "ab_permission_view_role_permission_view_id_role_id_key"
DETAIL:  Key (permission_view_id, role_id)=(207, 1) already exists.

[SQL: INSERT INTO ab_permission_view_role (id, permission_view_id, role_id) VALUES (nextval('ab_permission_view_role_id_seq'), %(permission_view_id)s, %(role_id)s) RETURNING ab_permission_view_role.id]
[parameters: {'permission_view_id': 207, 'role_id': 1}]
(Background on this error at: http://sqlalche.me/e/14/gkpj)
[2022-11-14 07:31:16,535] {manager.py:570} ERROR - Add Permission to Role Error: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "ab_permission_view_role_permission_view_id_role_id_key"
DETAIL:  Key (permission_view_id, role_id)=(207, 1) already exists.

[SQL: INSERT INTO ab_permission_view_role (id, permission_view_id, role_id) VALUES (nextval('ab_permission_view_role_id_seq'), %(permission_view_id)s, %(role_id)s) RETURNING ab_permission_view_role.id]
[parameters: {'permission_view_id': 207, 'role_id': 1}]
(Background on this error at: http://sqlalche.me/e/14/gkpj)
[2022-11-14 07:32:10 +0000] [22] [CRITICAL] WORKER TIMEOUT (pid:40)
[2022-11-14 07:32:10 +0000] [22] [CRITICAL] WORKER TIMEOUT (pid:41)
[2022-11-14 07:32:10 +0000] [22] [CRITICAL] WORKER TIMEOUT (pid:42)
[2022-11-14 07:32:10 +0000] [22] [CRITICAL] WORKER TIMEOUT (pid:43)
[2022-11-14 07:32:10 +0000] [43] [INFO] Worker exiting (pid: 43)
[2022-11-14 07:32:10 +0000] [42] [INFO] Worker exiting (pid: 42)
[2022-11-14 07:32:10 +0000] [41] [INFO] Worker exiting (pid: 41)
[2022-11-14 07:32:10 +0000] [40] [INFO] Worker exiting (pid: 40)
[2022-11-14 07:32:11 +0000] [22] [WARNING] Worker with pid 42 was terminated due to signal 9
[2022-11-14 07:32:11 +0000] [22] [WARNING] Worker with pid 40 was terminated due to signal 9
[2022-11-14 07:32:11 +0000] [22] [WARNING] Worker with pid 41 was terminated due to signal 9
[2022-11-14 07:32:11 +0000] [22] [WARNING] Worker with pid 43 was terminated due to signal 9
[2022-11-14 07:32:11 +0000] [44] [INFO] Booting worker with pid: 44
[2022-11-14 07:32:12 +0000] [45] [INFO] Booting worker with pid: 45
[2022-11-14 07:32:12 +0000] [46] [INFO] Booting worker with pid: 46
[2022-11-14 07:32:12 +0000] [47] [INFO] Booting worker with pid: 47

these logs are keep on repeating. I have created postgresql externally and specified it with secrets.

What you think should happen instead

No response

How to reproduce

No response

Anything else

No response

Are you willing to submit PR?

Code of Conduct

boring-cyborg[bot] commented 2 years ago

Thanks for opening your first issue here! Be sure to follow the issue template!

ephraimbuddy commented 2 years ago

Can you try setting GUNICORN_CMD_ARGS environment variable to --preload

IDeepakI commented 2 years ago

@ephraimbuddy

[2022-11-14 14:22:23,795] {manager.py:585} INFO - Removed Permission menu access on Permissions to role Admin
[2022-11-14 14:22:27,362] {manager.py:543} INFO - Removed Permission View: menu_access on Permissions
[2022-11-14 14:22:38,215] {manager.py:508} INFO - Created Permission View: menu access on Permissions
[2022-11-14 14:22:39,752] {manager.py:568} INFO - Added Permission menu access on Permissions to role Admin
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
Running the Gunicorn Server with:
Workers: 4 sync
Host: 0.0.0.0:8080
Timeout: 120
Logfiles: - -
Access Logformat: 
=================================================================
[2022-11-14 14:23:23,887] {webserver_command.py:418} INFO - Received signal: 15. Closing gunicorn.
    Environment:
      AIRFLOW_VAR_S3_GET_BACKEND_URL:          api/v1/s3/pre_signed
      AIRFLOW_VAR_S3_UPLOAD_BACKEND_URL:       api/v1/s3/pre_signed/upload
      GUNICORN_CMD_ARGS:                       --preload --log-level DEBUG
      AIRFLOW__DATABASE__SQL_ALCHEMY_CONN:     <set to the key 'AIRFLOW__DATABASE__SQL_ALCHEMY_CONN' in secret 'airflow-metadata'>  Optional: false
      AIRFLOW__CELERY__RESULT_BACKEND:         <set to the key 'AIRFLOW__CELERY__RESULT_BACKEND' in secret 'airflow-metadata'>      Optional: false
      AIRFLOW__CORE__FERNET_KEY:               <set to the key 'fernet-key' in secret 'airflow-fernet-key'>                         Optional: false
      AIRFLOW__CORE__SQL_ALCHEMY_CONN:         <set to the key 'connection' in secret 'airflow-airflow-metadata'>                   Optional: false
      AIRFLOW_CONN_AIRFLOW_DB:                 <set to the key 'connection' in secret 'airflow-airflow-metadata'>                   Optional: false
      AIRFLOW__WEBSERVER__SECRET_KEY:          <set to the key 'webserver-secret-key' in secret 'airflow-webserver-secret-key'>     Optional: false
      AIRFLOW__CELERY__CELERY_RESULT_BACKEND:  <set to the key 'connection' in secret 'airflow-airflow-result-backend'>             Optional: false
      AIRFLOW__CELERY__BROKER_URL:             <set to the key 'connection' in secret 'airflow-broker-url'>                         Optional: false
IDeepakI commented 1 year ago

this issue got resolved when i updated livenessProbe for webserver

webserver:
    service:
      type: NodePort

    livenessProbe:
      initialDelaySeconds: 10
      timeoutSeconds: 100
      failureThreshold: 5
      periodSeconds: 600
Yogesh-1011 commented 5 months ago

Even i added above config Still I am facing same issues