Open PaulienVa opened 4 months ago
There is a ConfigMap named airflow-executor-pod-template
which is used as template for the Kubernetes executors. The init container for git-sync in that template is called gitsync-0
not gitsync-1
. So if you change the pod overrides for the executors to say:
kubernetesExecutors:
...
podOverrides:
spec:
containers:
- name: gitsync-0 ### <---- this name
volumeMounts:
- name: "cert-ca"
mountPath: "/tmp/ca-cert"
things should work as expected.
The naming difference seems to be an unfortunate accident as far as I can see. :disappointed:
Hope this helps.
@razvan we do see an init container with the name gitsync-1 in the pods that are started off when kicking off a DAG, so I am afraid that that is not the solution :( Somehow the gitsync will end up in error for the kubernetesExecutors
Hm, can you share the contents of the airflow-executor-pod-template
ConfigMap?
Yes:
metadata:
labels:
app.kubernetes.io/component: executor
app.kubernetes.io/instance: airflow
app.kubernetes.io/managed-by: airflow.stackable.tech_airflowcluster
app.kubernetes.io/name: airflow
app.kubernetes.io/role-group: executor-template
app.kubernetes.io/version: 2.8.1-stackable24.3.0
stackable.tech/vendor: Stackable
spec:
affinity: {}
containers:
- env:
- name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
valueFrom:
secretKeyRef:
key: connections.sqlalchemyDatabaseUri
name: simple-airflow-credentials
- name: AIRFLOW__CORE__EXECUTOR
value: LocalExecutor
- name: AIRFLOW__KUBERNETES_EXECUTOR__NAMESPACE
value: <NAMESPACE>
- name: AIRFLOW__CORE__DAGS_FOLDER
value: /stackable/app/git/current/dags
image: docker.stackable.tech/stackable/airflow:2.8.1-stackable24.3.0
imagePullPolicy: Always
name: base
resources:
limits:
cpu: 800m
memory: 2Gi
requests:
cpu: 400m
memory: 2Gi
volumeMounts:
- mountPath: /tmp/ca-cert
name: cert-ca
- mountPath: /stackable/app/git
name: content-from-git
- mountPath: /stackable/app/config
name: config
- mountPath: /stackable/app/log_config
name: log-config
- mountPath: /stackable/log
name: log
enableServiceLinks: false
initContainers:
- args:
- /stackable/git-sync --repo=https://custom.gitlab.com/repos/airflow-dags.git --ref=main --depth=10 --period=20s --link=current --root=/tmp/git --rev=HEAD --git-config='safe.directory:/tmp/git,http.sslCAInfo:/tmp/ca-cert/ca.crt' --one-time=true
command:
- /bin/bash
- -x
- -euo
- pipefail
- -c
env:
- name: GITSYNC_USERNAME
valueFrom:
secretKeyRef:
key: user
name: gitcredentials
- name: GITSYNC_PASSWORD
valueFrom:
secretKeyRef:
key: password
name: gitcredentials
image: docker.stackable.tech/stackable/airflow:2.8.1-stackable24.3.0
imagePullPolicy: Always
name: gitsync-1
resources:
limits:
cpu: 200m
memory: 64Mi
requests:
cpu: 100m
memory: 64Mi
volumeMounts:
- mountPath: /tmp/git
name: content-from-git
restartPolicy: Never
securityContext:
fsGroup: 1000
runAsGroup: 0
runAsUser: 1000
serviceAccountName: airflow-serviceaccount
terminationGracePeriodSeconds: 300
volumes:
- configMap:
items:
- key: ca.crt
path: ca.crt
name: cert-ca
name: cert-ca
- configMap:
name: airflow-executor-kubernetes
name: config
- emptyDir:
sizeLimit: 30Mi
name: log
- configMap:
name: airflow-executor-kubernetes
name: log-config
- emptyDir: {}
name: content-from-git
It looks like you are using SDP 24.3.0 which doesn't contain the fix from #456
app.kubernetes.io/version: 2.8.1-stackable24.3.0
To take advantage of that fix, you need to use the development version of the Airflow operator as suggested or wait until 24.7 is out. This should happen in the next couple of weeks.
Thank you again for your report.
Having looked a bit longer at your use case I created a follow-up issue that should simplify things a lot in the future.
Regarding the upcoming operator version, the correct way to mount the CA in the kubernetes
executor pods will be:
kubernetesExecutors:
...
podOverrides:
spec:
initContainers: ### not a side-car container in this case
- name: gitsync-0 ### init containers are called "gitsync-0". Side containers are called "gitsync-1"
volumeMounts:
- name: "cert-ca"
mountPath: "/tmp/ca-cert"
For celery
executor pods, both the init and the side-car containers must be patched like so:
celeryExecutors:
...
podOverrides:
spec:
initContainers:
- name: gitsync-0
volumeMounts:
- name: "cert-ca"
mountPath: "/tmp/ca-cert"
containers:
- name: gitsync-1
volumeMounts:
- name: "cert-ca"
mountPath: "/tmp/ca-cert"
Hope this helps and sorry for the trouble.
Okay I will test this with this config and let you know. Thank you for diving into it
Affected Stackable version
Nightly
Affected Apache Airflow version
2.8.1
Current and expected behavior
Using this AirflowCluster:
The
gitSyncConf
does not seem to get propagated for the kubernetesExecutors. Also the git-sync container has aone-time=true
flag set for the git-sync commandFollow up this MR: https://github.com/stackabletech/airflow-operator/pull/456#issuecomment-2230116710
Expected behavior: the webserver, scheduler and kubernetesExecutors are configured exactly the same for the git-sync containers
Possible solution
No response
Additional context
No response
Environment
No response
Would you like to work on fixing this bug?
None