lenadroid / airflow-azure


Worker node cannot write to the db #2

Closed · mshankla1 closed 3 years ago

mshankla1 commented 3 years ago

Thanks for the great guide!

I'm currently following the guide with an Airflow 2.0 image.

I'm running into an issue where my worker nodes crash with the error:

[2021-06-22 22:54:17,516] {cli_action_loggers.py:105} WARNING - Failed to log action with (sqlite3.OperationalError) no such table: log [SQL: INSERT INTO log (dttm, dag_id, task_id, event, execution_date, owner, extra) VALUES (?, ?, ?, ?, ?, ?, ?)] [parameters: ('2021-06-22 22:54:17.511978', 'airflow_tutorial_v01', 'print_hello', 'cli_task_run', '2021-06-22 22:47:52.924823', 'airflow', '{"host_name": "airflowtutorialv01printhello.34fc0661f3f845f29464738aa150b18f", "full_command": "[\'/home/airflow/.local/bin/airflow\', \'tasks\', \'r ... (28 characters truncated) ... \', \'print_hello\', \'2021-06-22T22:47:52.924823+00:00\', \'--local\', \'--pool\', \'default_pool\', \'--subdir\', \'/opt/airflow/dags/hello.py\']"}')]

The scheduler can connect to the Postgres DB, verified by running airflow db check:

[2021-06-22 22:56:15,550] {db.py:776} INFO - Connection successful.

The requisite tables are also present in the Postgres DB and have been written to.

I believe my metadata connection secret is OK; it was generated with:

echo -n "postgresql+psycopg2://airflow%40{hostname}:{pwd}@{hostname}.postgres.database.azure.com:5432/airflow" | base64
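
For reference, the secret holding this value looks roughly like the sketch below. The name airflow-metadata and key connection match what the pod template further down references; the manifest itself is a reconstruction, and the data value is a placeholder for the base64 output of the command above.

apiVersion: v1
kind: Secret
metadata:
  name: airflow-metadata
type: Opaque
data:
  # Placeholder: paste the base64 output of the echo command above
  connection: <base64-encoded connection string>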

Any advice on how to proceed would be wonderful; I'm a bit lost here.

Thanks!

mshankla1 commented 3 years ago

Update: this deployment isn't Airflow 2.0 compatible as-is. In Airflow 2.0 the KubernetesExecutor no longer builds worker pods from the [kubernetes] settings in airflow.cfg, so the worker pods were starting without AIRFLOW__CORE__SQL_ALCHEMY_CONN and falling back to the default SQLite metadata DB, which is what produced the "no such table: log" error above.

The quick fix is to add a pod_template_file to the persistent volume and reference it under the [kubernetes] section in configmap.yaml (e.g. pod_template_file = /opt/airflow/dags/pod_template_file.yaml). Example pod_template_files can be found here: https://airflow.apache.org/docs/apache-airflow/stable/executor/kubernetes.html
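
For illustration, the relevant piece of configmap.yaml would look roughly like this. This is a sketch: the config map name airflow-config matches the one mounted in the pod template below, and I'm assuming airflow.cfg is rendered from the map's data, as the subPath mount in the template suggests.

apiVersion: v1
kind: ConfigMap
metadata:
  name: airflow-config
data:
  airflow.cfg: |
    [kubernetes]
    # Workers read their pod spec from this file on the DAGs volume
    pod_template_file = /opt/airflow/dags/pod_template_file.yaml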

Here's an example pod_template_file I used:

---
apiVersion: v1
kind: Pod
metadata:
  name: dummy-name
spec:
  containers:
    - args: []
      command: []
      env:
        - name: AIRFLOW__CORE__EXECUTOR
          value: LocalExecutor
        # Hard Coded Airflow Envs
        - name: AIRFLOW__CORE__FERNET_KEY
          valueFrom:
            secretKeyRef:
              name: fernet-key
              key: fernet-key
        - name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
          valueFrom:
            secretKeyRef:
              name: airflow-metadata
              key: connection
        - name: AIRFLOW_CONN_AIRFLOW_DB
          valueFrom:
            secretKeyRef:
              name: airflow-metadata
              key: connection
      envFrom: []
      image: dummy_image
      imagePullPolicy: IfNotPresent
      name: base
      ports: []
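      # Mount the same DAGs/logs PVCs and rendered airflow.cfg the scheduler uses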
      volumeMounts:
        - name: logs-pv
          mountPath: "/opt/airflow/logs"
        - name: dags-pv
          mountPath: "/opt/airflow/dags"
        - name: config
          mountPath: "/opt/airflow/airflow.cfg"
          subPath: airflow.cfg
          readOnly: true
  hostNetwork: false
  restartPolicy: Never
  securityContext:
    runAsUser: 50000
    fsGroup: 50000
  nodeSelector: {}
  affinity: {}
  tolerations: []
  serviceAccountName: "worker-serviceaccount"
  volumes:
    - name: config
      configMap:
        name: airflow-config
    - name: dags-pv
      persistentVolumeClaim:
        claimName: dags-pvc
    - name: logs-pv
      persistentVolumeClaim:
        claimName: logs-pvc
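
For this to work, the file has to end up in the DAGs persistent volume so it resolves at /opt/airflow/dags/pod_template_file.yaml inside the pods. One way to get it there (pod and namespace names below are placeholders) is to copy it through a pod that already mounts the dags-pvc claim:

kubectl cp pod_template_file.yaml <namespace>/<scheduler-pod>:/opt/airflow/dags/pod_template_file.yaml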