stackabletech / airflow-operator

Stackable Operator for Apache Airflow
Other
22 stars 2 forks source link

Git-sync with kubernetesExecutors and using https #468

Open PaulienVa opened 4 months ago

PaulienVa commented 4 months ago

Affected Stackable version

Nightly

Affected Apache Airflow version

2.8.1

Current and expected behavior

Using this AirflowCluster:

---
apiVersion: airflow.stackable.tech/v1alpha1
kind: AirflowCluster
metadata:
  name: airflow
spec:
  image:
    productVersion: 2.8.1
  clusterConfig:
    loadExamples: false
    exposeConfig: false
    listenerClass: external-unstable
    credentialsSecret: simple-airflow-credentials
    dagsGitSync: 
    - repo: https://custom.gitlab.com/repos/airflow-dags.git
      branch: "main" 
      gitFolder: "dags" 
      depth: 10 
      wait: 20 
      credentialsSecret: gitcredentials
      gitSyncConf: 
        --rev: HEAD 
        --git-config: http.sslCAInfo:/tmp/ca-cert/ca.crt
    volumeMounts:
    - name: "cert-ca"
      mountPath: "/tmp/ca-cert"  
    volumes:
    - name: "cert-ca"
      configMap:
        name: "cert-ca"
        items: 
        - key: "ca.crt"
          path: "ca.crt"
  webservers:
    roleGroups:
      default:
        replicas: 1
    podOverrides:
      spec:
        containers:
          - name: gitsync-1
            volumeMounts:
            - name: "cert-ca"
              mountPath: "/tmp/ca-cert"  
  kubernetesExecutors:
    config:
      resources:
        cpu:
          min: 400m
          max: 800m
        memory:
          limit: 2Gi
    podOverrides:
      spec:
        containers:
          - name: gitsync-1
            volumeMounts:
              - name: "cert-ca"
                mountPath: "/tmp/ca-cert"
  schedulers:
    roleGroups:
      default:
        replicas: 1
    podOverrides:
      spec:
        containers:
          - name: gitsync-1
            volumeMounts:
            - name: "cert-ca"
              mountPath: "/tmp/ca-cert"

The gitSyncConf does not seem to get propagated for the kubernetesExecutors. Also the git-sync container has a one-time=true flag set for the git-sync command

Follow up this MR: https://github.com/stackabletech/airflow-operator/pull/456#issuecomment-2230116710

Expected behavior: the webserver, scheduler and kubernetesExecutors are configured exactly the same for the git-sync containers

Possible solution

No response

Additional context

No response

Environment

No response

Would you like to work on fixing this bug?

None

razvan commented 4 months ago

There is a ConfigMap named airflow-executor-pod-template which is used as template for the Kubernetes executors. The init container for git-sync in that template is called gitsync-0 not gitsync-1. So if you change the pod overrides for the executors to say:

  kubernetesExecutors:
...
    podOverrides:
      spec:
        containers:
          - name: gitsync-0 ### <---- this name
            volumeMounts:
              - name: "cert-ca"
                mountPath: "/tmp/ca-cert"

things should work as expected.

The naming difference seems to be an unfortunate accident as far as I can see. :disappointed:

Hope this helps.

PaulienVa commented 4 months ago

@razvan we do see an init container with the name gitsync-1 in the pods that are started off when kicking off a DAG, so I am afraid that that is not the solution :( Somehow the gitsync will end up in error for the kubernetesExecutors

razvan commented 4 months ago

Hm, can you share the contents of the airflow-executor-pod-template ConfigMap?

PaulienVa commented 4 months ago

Yes:


metadata:
  labels:
   app.kubernetes.io/component: executor
   app.kubernetes.io/instance: airflow
   app.kubernetes.io/managed-by: airflow.stackable.tech_airflowcluster
   app.kubernetes.io/name: airflow
   app.kubernetes.io/role-group: executor-template
   app.kubernetes.io/version: 2.8.1-stackable24.3.0
   stackable.tech/vendor: Stackable

spec:
  affinity: {}
  containers:
  - env:
   - name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
     valueFrom:
       secretKeyRef:
         key: connections.sqlalchemyDatabaseUri
         name: simple-airflow-credentials
   - name: AIRFLOW__CORE__EXECUTOR
     value: LocalExecutor
   - name: AIRFLOW__KUBERNETES_EXECUTOR__NAMESPACE
     value: <NAMESPACE>
   - name: AIRFLOW__CORE__DAGS_FOLDER
     value: /stackable/app/git/current/dags
   image: docker.stackable.tech/stackable/airflow:2.8.1-stackable24.3.0
   imagePullPolicy: Always
   name: base
   resources:
     limits:
       cpu: 800m
       memory: 2Gi
     requests:
       cpu: 400m
       memory: 2Gi
   volumeMounts:
   - mountPath: /tmp/ca-cert
     name: cert-ca
   - mountPath: /stackable/app/git
     name: content-from-git
   - mountPath: /stackable/app/config
     name: config
   - mountPath: /stackable/app/log_config
     name: log-config
   - mountPath: /stackable/log
     name: log
  enableServiceLinks: false
  initContainers:
  - args:
   - /stackable/git-sync --repo=https://custom.gitlab.com/repos/airflow-dags.git --ref=main --depth=10 --period=20s --link=current --root=/tmp/git --rev=HEAD --git-config='safe.directory:/tmp/git,http.sslCAInfo:/tmp/ca-cert/ca.crt' --one-time=true

   command:
   - /bin/bash
   - -x
   - -euo
   - pipefail
   - -c
   env:
   - name: GITSYNC_USERNAME
     valueFrom:
       secretKeyRef:
         key: user
         name: gitcredentials
   - name: GITSYNC_PASSWORD
     valueFrom:
       secretKeyRef:
         key: password
         name: gitcredentials
   image: docker.stackable.tech/stackable/airflow:2.8.1-stackable24.3.0
   imagePullPolicy: Always
   name: gitsync-1

   resources:
     limits:
       cpu: 200m
       memory: 64Mi
     requests:
       cpu: 100m
       memory: 64Mi

   volumeMounts:
   - mountPath: /tmp/git
     name: content-from-git
  restartPolicy: Never
  securityContext:
   fsGroup: 1000
   runAsGroup: 0
   runAsUser: 1000
  serviceAccountName: airflow-serviceaccount
  terminationGracePeriodSeconds: 300
  volumes:

  - configMap:
     items:
     - key: ca.crt
       path: ca.crt
     name: cert-ca
   name: cert-ca
  - configMap:
     name: airflow-executor-kubernetes
   name: config
  - emptyDir:
     sizeLimit: 30Mi
   name: log
  - configMap:
     name: airflow-executor-kubernetes
   name: log-config
  - emptyDir: {}
   name: content-from-git
razvan commented 4 months ago

It looks like you are using SDP 24.3.0 which doesn't contain the fix from #456

app.kubernetes.io/version: 2.8.1-stackable24.3.0

To take advantage of that fix, you need to use the development version of the Airflow operator as suggested or wait until 24.7 is out. This should happen in the next couple of weeks.

razvan commented 4 months ago

Thank you again for your report.

Having looked a bit longer at your use case I created a follow-up issue that should simplify things a lot in the future.

Regarding the upcoming operator version, the correct way to mount the CA in the kubernetes executor pods will be:

  kubernetesExecutors:
...
    podOverrides:
      spec:
        initContainers:      ###  not a side-car container in this case
          - name: gitsync-0  ###  init containers are called "gitsync-0". Side containers are called "gitsync-1"
            volumeMounts:
              - name: "cert-ca"
                mountPath: "/tmp/ca-cert"

For celery executor pods, both the init and the side-car containers must be patched like so:

  celeryExecutors:
...
    podOverrides:
      spec:
        initContainers:
          - name: gitsync-0
            volumeMounts:
              - name: "cert-ca"
                mountPath: "/tmp/ca-cert"
        containers: 
          - name: gitsync-1
            volumeMounts:
              - name: "cert-ca"
                mountPath: "/tmp/ca-cert"

Hope this helps and sorry for the trouble.

PaulienVa commented 4 months ago

Okay I will test this with this config and let you know. Thank you for diving into it