apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Scheduler has no Kubernetes API access when disabling API token automounting #43464

Open DjVinnii opened 3 weeks ago

DjVinnii commented 3 weeks ago

Official Helm Chart version

1.15.0 (latest released)

Apache Airflow version

2.9.3

Kubernetes Version

1.29.7

Helm Chart configuration

No response

Docker Image customizations

No response

What happened

The Airflow scheduler requires Kubernetes API access. When automountServiceAccountToken is disabled, the API token is not mounted in the pod(s), resulting in a CrashLoopBackOff with the following error:

+ airflow-scheduler-5944fd4567-6jtq7 › scheduler
airflow-scheduler-5944fd4567-6jtq7 scheduler
airflow-scheduler-5944fd4567-6jtq7 scheduler /home/airflow/.local/lib/python3.12/site-packages/airflow/metrics/statsd_logger.py:184 RemovedInAirflow3Warning: The basic metric validator will be deprecated in the future in favor of pattern-matching.  You can try this now by setting config option metrics_use_pattern_match to True.
airflow-scheduler-5944fd4567-6jtq7 scheduler   ____________       _____________
airflow-scheduler-5944fd4567-6jtq7 scheduler  ____    |__( )_________  __/__  /________      __
airflow-scheduler-5944fd4567-6jtq7 scheduler ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
airflow-scheduler-5944fd4567-6jtq7 scheduler ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
airflow-scheduler-5944fd4567-6jtq7 scheduler  _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
airflow-scheduler-5944fd4567-6jtq7 scheduler [2024-10-29T09:29:38.255+0000] {task_context_logger.py:63} INFO - Task context logging is enabled
airflow-scheduler-5944fd4567-6jtq7 scheduler /home/airflow/.local/lib/python3.12/site-packages/airflow/providers/cncf/kubernetes/executors/kubernetes_executor.py:143 FutureWarning: The config section [kubernetes] has been renamed to [kubernetes_executor]. Please update your `conf.get*` call to use the new name
airflow-scheduler-5944fd4567-6jtq7 scheduler [2024-10-29T09:29:38.342+0000] {executor_loader.py:235} INFO - Loaded executor: KubernetesExecutor
airflow-scheduler-5944fd4567-6jtq7 scheduler [2024-10-29T09:29:38.675+0000] {scheduler_job_runner.py:799} INFO - Starting the scheduler
airflow-scheduler-5944fd4567-6jtq7 scheduler [2024-10-29T09:29:38.676+0000] {scheduler_job_runner.py:806} INFO - Processing each file at most -1 times
airflow-scheduler-5944fd4567-6jtq7 scheduler [2024-10-29T09:29:38.676+0000] {kubernetes_executor.py:287} INFO - Start Kubernetes executor
airflow-scheduler-5944fd4567-6jtq7 scheduler [2024-10-29T09:29:38.681+0000] {scheduler_job_runner.py:863} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
airflow-scheduler-5944fd4567-6jtq7 scheduler Traceback (most recent call last):
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/jobs/scheduler_job_runner.py", line 837, in _execute
airflow-scheduler-5944fd4567-6jtq7 scheduler     self.job.executor.start()
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/cncf/kubernetes/executors/kubernetes_executor.py", line 295, in start
airflow-scheduler-5944fd4567-6jtq7 scheduler     self.kube_client = get_kube_client()
airflow-scheduler-5944fd4567-6jtq7 scheduler                        ^^^^^^^^^^^^^^^^^
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/cncf/kubernetes/kube_client.py", line 129, in get_kube_client
airflow-scheduler-5944fd4567-6jtq7 scheduler     config.load_incluster_config(client_configuration=configuration)
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/kubernetes/config/incluster_config.py", line 121, in load_incluster_config
airflow-scheduler-5944fd4567-6jtq7 scheduler     try_refresh_token=try_refresh_token).load_and_set(client_configuration)
airflow-scheduler-5944fd4567-6jtq7 scheduler                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/kubernetes/config/incluster_config.py", line 54, in load_and_set
airflow-scheduler-5944fd4567-6jtq7 scheduler     self._load_config()
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/kubernetes/config/incluster_config.py", line 73, in _load_config
airflow-scheduler-5944fd4567-6jtq7 scheduler     raise ConfigException("Service token file does not exist.")
airflow-scheduler-5944fd4567-6jtq7 scheduler kubernetes.config.config_exception.ConfigException: Service token file does not exist.
airflow-scheduler-5944fd4567-6jtq7 scheduler [2024-10-29T09:29:38.731+0000] {kubernetes_executor.py:745} INFO - Shutting down Kubernetes executor
airflow-scheduler-5944fd4567-6jtq7 scheduler [2024-10-29T09:29:38.740+0000] {manager.py:321} WARNING - Ending without manager process.
airflow-scheduler-5944fd4567-6jtq7 scheduler [2024-10-29T09:29:38.740+0000] {scheduler_job_runner.py:875} INFO - Exited execute loop
airflow-scheduler-5944fd4567-6jtq7 scheduler Traceback (most recent call last):
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/bin/airflow", line 8, in <module>
airflow-scheduler-5944fd4567-6jtq7 scheduler     sys.exit(main())
airflow-scheduler-5944fd4567-6jtq7 scheduler              ^^^^^^
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/__main__.py", line 58, in main
airflow-scheduler-5944fd4567-6jtq7 scheduler     args.func(args)
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/cli/cli_config.py", line 49, in command
airflow-scheduler-5944fd4567-6jtq7 scheduler     return func(*args, **kwargs)
airflow-scheduler-5944fd4567-6jtq7 scheduler            ^^^^^^^^^^^^^^^^^^^^^
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/utils/cli.py", line 114, in wrapper
airflow-scheduler-5944fd4567-6jtq7 scheduler     return f(*args, **kwargs)
airflow-scheduler-5944fd4567-6jtq7 scheduler            ^^^^^^^^^^^^^^^^^^
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/utils/providers_configuration_loader.py", line 55, in wrapped_function
airflow-scheduler-5944fd4567-6jtq7 scheduler     return func(*args, **kwargs)
airflow-scheduler-5944fd4567-6jtq7 scheduler            ^^^^^^^^^^^^^^^^^^^^^
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/cli/commands/scheduler_command.py", line 58, in scheduler
airflow-scheduler-5944fd4567-6jtq7 scheduler     run_command_with_daemon_option(
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/cli/commands/daemon_utils.py", line 85, in run_command_with_daemon_option
airflow-scheduler-5944fd4567-6jtq7 scheduler     callback()
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/cli/commands/scheduler_command.py", line 61, in <lambda>
airflow-scheduler-5944fd4567-6jtq7 scheduler     callback=lambda: _run_scheduler_job(args),
airflow-scheduler-5944fd4567-6jtq7 scheduler                      ^^^^^^^^^^^^^^^^^^^^^^^^
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/cli/commands/scheduler_command.py", line 49, in _run_scheduler_job
airflow-scheduler-5944fd4567-6jtq7 scheduler     run_job(job=job_runner.job, execute_callable=job_runner._execute)
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/utils/session.py", line 79, in wrapper
airflow-scheduler-5944fd4567-6jtq7 scheduler     return func(*args, session=session, **kwargs)
airflow-scheduler-5944fd4567-6jtq7 scheduler            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/jobs/job.py", line 402, in run_job
airflow-scheduler-5944fd4567-6jtq7 scheduler     return execute_job(job, execute_callable=execute_callable)
airflow-scheduler-5944fd4567-6jtq7 scheduler            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/jobs/job.py", line 431, in execute_job
airflow-scheduler-5944fd4567-6jtq7 scheduler     ret = execute_callable()
airflow-scheduler-5944fd4567-6jtq7 scheduler           ^^^^^^^^^^^^^^^^^^
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/jobs/scheduler_job_runner.py", line 837, in _execute
airflow-scheduler-5944fd4567-6jtq7 scheduler     self.job.executor.start()
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/cncf/kubernetes/executors/kubernetes_executor.py", line 295, in start
airflow-scheduler-5944fd4567-6jtq7 scheduler     self.kube_client = get_kube_client()
airflow-scheduler-5944fd4567-6jtq7 scheduler                        ^^^^^^^^^^^^^^^^^
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/cncf/kubernetes/kube_client.py", line 129, in get_kube_client
airflow-scheduler-5944fd4567-6jtq7 scheduler     config.load_incluster_config(client_configuration=configuration)
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/kubernetes/config/incluster_config.py", line 121, in load_incluster_config
airflow-scheduler-5944fd4567-6jtq7 scheduler     try_refresh_token=try_refresh_token).load_and_set(client_configuration)
airflow-scheduler-5944fd4567-6jtq7 scheduler                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/kubernetes/config/incluster_config.py", line 54, in load_and_set
airflow-scheduler-5944fd4567-6jtq7 scheduler     self._load_config()
airflow-scheduler-5944fd4567-6jtq7 scheduler   File "/home/airflow/.local/lib/python3.12/site-packages/kubernetes/config/incluster_config.py", line 73, in _load_config
airflow-scheduler-5944fd4567-6jtq7 scheduler     raise ConfigException("Service token file does not exist.")
airflow-scheduler-5944fd4567-6jtq7 scheduler kubernetes.config.config_exception.ConfigException: Service token file does not exist.
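
The traceback ends in the Kubernetes Python client's in-cluster loader, which raises once it cannot find the service account token file. A minimal sketch for checking that condition from inside a pod is shown below. It assumes the conventional mount directory /var/run/secrets/kubernetes.io/serviceaccount; the helper name `service_account_token_status` is hypothetical and is not part of Airflow or the Kubernetes client.

```python
from pathlib import Path

# Conventional mount point for service account credentials inside a pod.
# Assumption: nothing is mounted here when token automounting is disabled.
DEFAULT_SA_DIR = Path("/var/run/secrets/kubernetes.io/serviceaccount")


def service_account_token_status(sa_dir: Path = DEFAULT_SA_DIR) -> str:
    """Return "ok", "missing", or "empty" for the token file under sa_dir."""
    token_file = sa_dir / "token"
    if not token_file.is_file():
        return "missing"
    if not token_file.read_text().strip():
        return "empty"
    return "ok"
```

Calling `service_account_token_status()` in a scheduler pod that shows the error above would be expected to return "missing", though that depends on the mount path assumption holding for the cluster in question.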

What you think should happen instead

In my opinion, the Airflow Helm Chart should provision the token so that the scheduler runs successfully on the Kubernetes cluster.

How to reproduce

Set scheduler.serviceAccount.automountServiceAccountToken: false
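
For illustration, the override can be written out as a values document. The sketch below builds it as a Python dict and renders it as JSON, which should also be readable as YAML. The `scheduler.serviceAccount.automountServiceAccountToken` path comes from the step above; the `executor` entry is an assumption taken from the later comments in this thread, and `render_values` is a hypothetical helper.

```python
import json

# Values override mirroring the reproduction step.
# Assumption: the KubernetesExecutor is in use (stated later in the thread).
values_override = {
    "executor": "KubernetesExecutor",
    "scheduler": {
        "serviceAccount": {
            "automountServiceAccountToken": False,
        },
    },
}


def render_values(values: dict) -> str:
    """Serialize the override as indented JSON text."""
    return json.dumps(values, indent=2)
```

Writing the result of `render_values(values_override)` to a file gives a values override that reflects the configuration described in this report.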

Anything else

No response

potiuk commented 3 weeks ago

I believe the Airflow scheduler does not require the token - it requires it when the K8S executor is used, but when you use the local or Celery executor it should work fine.

I am not sure, however, what the intention was - it's been added everywhere in https://github.com/apache/airflow/pull/32808 and maybe @amoghrajesh can comment on it. But maybe you already know how to change it - it seems that you want to submit a PR for it?

DjVinnii commented 3 weeks ago

> I believe the Airflow scheduler does not require the token - it requires it when the K8S executor is used, but when you use the local or Celery executor it should work fine.

Ah yes, I forgot to mention that I'm indeed using the K8S executor. I have to disable the K8S Service Account token automount due to a cluster policy, and I suspect that this might be the case for more users.

> I am not sure, however, what the intention was - it's been added everywhere in #32808 and maybe @amoghrajesh can comment on it. But maybe you already know how to change it - it seems that you want to submit a PR for it?

I am indeed willing to submit a PR, however I don't know what the best way will be to solve this. Maybe @amoghrajesh has some insights on this.

amoghrajesh commented 3 weeks ago

@potiuk @DjVinnii I checked the issue. The change in #32808 was made to complete https://github.com/apache/airflow/issues/30722. The idea was to not mount the service account tokens, to reduce the risk of a token being exposed if a pod is compromised.

On further reading, I see that the token is always needed for the scheduler. If this option is set to false, the service account token will not be automatically mounted into the pods that use this service account (the scheduler, for example). The scheduler will then be unable to authenticate to the K8s API, which it needs for tasks like creating and managing pods.

> I am indeed willing to submit a PR, however I don't know what the best way will be to solve this. Maybe @amoghrajesh has some insights on this.

@DjVinnii I think the ideal fix here would be to remove the option from the scheduler service account. It can be optional for other pods, but it should always be true for the scheduler.
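
As a rough illustration of the constraint discussed here, the sketch below flags a values combination that would leave the scheduler without API access. It is a validation check, not the removal of the option proposed above, and it is not how the chart is implemented. It follows the narrower reading from earlier in the thread, where only Kubernetes-based executors need the token. The executor list, the assumed default of `True` for an unset flag, and the function name `scheduler_token_problem` are all assumptions.

```python
from typing import Optional

# Executors assumed to launch task pods through the Kubernetes API.
KUBERNETES_EXECUTORS = {
    "KubernetesExecutor",
    "CeleryKubernetesExecutor",
    "LocalKubernetesExecutor",
}


def scheduler_token_problem(values: dict) -> Optional[str]:
    """Return a message if the scheduler would lack a token it needs, else None."""
    executor = values.get("executor", "")
    service_account = values.get("scheduler", {}).get("serviceAccount", {})
    # Assumption: an unset flag means the token is automounted.
    automount = service_account.get("automountServiceAccountToken", True)
    if executor in KUBERNETES_EXECUTORS and not automount:
        return (
            f"{executor} needs Kubernetes API access, but "
            "scheduler.serviceAccount.automountServiceAccountToken is false"
        )
    return None
```

With the configuration from this report (KubernetesExecutor and the flag set to false) the function returns a message, while the same flag combined with CeleryExecutor returns None.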

amoghrajesh commented 3 weeks ago

@DjVinnii feel free to submit a PR that implements this logic