Closed macwro closed 3 years ago
I suspect this has something to do with the workaround with the empty config. As of 8.0.2, the configuration is still needed and the secret is not used. A workaround that fixes both of these issues is to decode the secret and create a ConfigMap with those values, with the name `<release-name>-env`.

You can use the following to decode the secret:

```shell
kubectl get secret name-of-secret -o go-template='
{{range $k,$v := .data}}{{printf "%s: " $k}}{{if not $v}}{{$v}}{{else}}{{$v | base64decode}}{{end}}{{"\n"}}{{end}}'
```
Creating a configmap with these contents seems to fix the issue. If https://github.com/airflow-helm/charts/pull/122 is merged, it should also fix the issue I believe.
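As a rough sketch of that workaround (the release name `airflow2` and the `to_env_file` helper below are my own placeholders, not chart names), the decoded output can be turned into an env-file and fed to `kubectl create configmap`:

```shell
# Convert the "key: value" lines produced by the go-template above into
# KEY=VALUE lines that `kubectl create configmap --from-env-file` accepts.
# (Helper name and release name are hypothetical.)
to_env_file() {
  sed 's/: /=/'
}

# Usage sketch, assuming a release called "airflow2":
#   kubectl get secret airflow2-config -o go-template='...' \
#     | to_env_file > airflow.env
#   kubectl create configmap airflow2-env --from-env-file=airflow.env
printf 'AIRFLOW__CORE__EXECUTOR: KubernetesExecutor\n' | to_env_file
# prints: AIRFLOW__CORE__EXECUTOR=KubernetesExecutor
```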
@macwro can you confirm if this issue is fixed after version `8.0.3` of the chart?
I encountered this issue with 8.0.5 of the chart. The default PodTemplate refers to a configmap `airflow-env`. This doesn't exist, so the default settings of the chart are broken. Creating an empty `airflow-env` configmap won't work, because the worker pods will require the same database config as the scheduler/web pods to communicate with the backend db. So you should refer to the secret `airflow-config` in the same way as the scheduler/web deployments.

However, that leads me to a `BACKEND: unbound variable` error in the entrypoint script.
> @macwro can you confirm if this issue is fixed after version `8.0.3` of the chart?
I can confirm that on version 8.0.5 that problem does not exist.
Thanks for fixing it!
@rolanddb can you please check again, as `8.0.5` seems to correctly reference `airflow-config`:
https://github.com/airflow-helm/charts/blob/airflow-8.0.5/charts/airflow/files/pod_template.kubernetes-helm-yaml#L44-L46
@thesuperzapper I double checked, but I'm still getting that `BACKEND: unbound variable` error in the entrypoint script (using the Dockerfile from the apache/airflow repo). The script tries to be a little too smart for my taste, e.g. it uses a regex to parse the SQLAlchemy string to get the host/port and do connectivity tests on that. I actually tried the regex and it does capture the various elements, but somewhere beyond that it fails. I've modified the entrypoint to exclude some of the checks. Airflow is up and running now on my cluster. Thanks for the help.
@rolanddb we override the entrypoint for all the pods. What are you doing that is running the `airflow/airflow` Dockerfile entrypoint?
@thesuperzapper Are we looking at the same thing? I don't see anything specifying the entrypoint (I checked the scheduler, e.g. https://github.com/airflow-helm/charts/blob/main/charts/airflow/templates/scheduler/scheduler-deployment.yaml#L98-L104, and the pod template). All that is being specified is the command and args.
@rolanddb `command` overrides the Dockerfile `ENTRYPOINT`, see the Kubernetes API docs: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#container-v1-core
@thesuperzapper Thanks for pointing that out! I wasn't aware that Kubernetes uses the same terminology (command/entrypoint) for overlapping functionality but in a different way. Super confusing but now I am aware of it.
Back to the issue: you are right that the entrypoint is overridden for the scheduler/web deployments. But as far as I can tell, the default pod template does not override the command. See https://github.com/airflow-helm/charts/blob/main/charts/airflow/files/pod_template.kubernetes-helm-yaml#L56-L57
So my example (the scheduler deployment that I linked above) was wrong, but I think the issue still stands. I have added an `echo` statement in the entrypoint script and that is being printed when a worker pod is running. So if the container image contains an entrypoint that depends on some ENV vars that are not available, things will not work. I think this is the case for the default settings of this Helm chart right now.
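To illustrate the `command`-vs-ENTRYPOINT point being made here, a small sketch (the `entrypoint_overridden` helper and the JSON snippet are mine, not part of the chart or kubectl):

```shell
# When a container spec sets `command`, Kubernetes replaces the image's
# ENTRYPOINT; when it only sets `args`, the image entrypoint still runs.
# This helper just greps a dumped spec; against a real cluster you would
# feed it e.g. `kubectl get pod <worker-pod> -o json`.
# (Helper name is hypothetical.)
entrypoint_overridden() {
  if grep -q '"command"'; then
    echo "chart command overrides image ENTRYPOINT"
  else
    echo "image ENTRYPOINT runs"
  fi
}

# If the pod template sets only args (as discussed above), then:
echo '{"containers":[{"args":["bash","-c","exec airflow ..."]}]}' \
  | entrypoint_overridden
# prints: image ENTRYPOINT runs
```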
@rolanddb can you clarify if `8.0.6` still has whatever issue you were raising in this issue?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
What is the bug? On the newest version 8.0.2 of the airflow helm chart, running on top of an AWS EKS cluster with a local Postgres DB and an S3 bucket as the logs location, I get the error "cannot use sqlite with the LocalExecutor" when a pod starts to execute a DAG task. The pod has status "Error". Log from failed pod:
The problem exists for custom and example DAGs. I see that "LocalExecutor" is set by charts/airflow/files/pod_template.kubernetes-helm-yaml. I made a test and updated the "airflow2-pod-template" ConfigMap with the "KubernetesExecutor" value for `AIRFLOW__CORE__EXECUTOR`. Unfortunately the result was negative. The error still exists, but now reports "cannot use sqlite with the KubernetesExecutor".
I also had to implement the workaround from this thread: https://github.com/airflow-helm/charts/issues/119 to allow the pods to start.
What are your Helm values?
What is your Kubernetes Version?:
What is your Helm version?: