apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Values in SparkKubernetesOperator YAML need to be defined #41562

Open amirshabanics opened 2 months ago

amirshabanics commented 2 months ago

Apache Airflow Provider(s)

cncf-kubernetes

Versions of Apache Airflow Providers

apache-airflow-providers-cncf-kubernetes==8.3.1

Apache Airflow version

2.9.2

Operating System

Ubuntu 24.04 LTS

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

What happened

I want to create a SparkKubernetesOperator task, so I build a manifest and pass it in through params. My manifest:

spark:
# spark spec
kubernetes:
  env_vars: []
# other kubernetes spec

If I don't set env_vars, it raises an error:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 465, in _execute_task
    result = _execute_callable(context=context, **execute_callable_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 432, in _execute_callable
    return execute_callable(context=context, **execute_callable_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/baseoperator.py", line 401, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 283, in execute
    self.launcher = CustomObjectLauncher(
                    ^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/custom_object_launcher.py", line 221, in __init__
    self.body: dict = self.get_body()
                      ^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/custom_object_launcher.py", line 242, in get_body
    k8s_spec: dict = KubernetesSpec(**self.template_body["kubernetes"])
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/custom_object_launcher.py", line 84, in __init__
    self.set_attribute()
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/custom_object_launcher.py", line 87, in set_attribute
    self.env_vars = convert_env_vars(self.env_vars) if self.env_vars else []
                                                       ^^^^^^^^^^^^^
AttributeError: 'KubernetesSpec' object has no attribute 'env_vars'

The same error can happen for every variable in the kubernetes spec.
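To make the failure mode concrete, here is a minimal sketch of the pattern the traceback suggests. `SpecWithoutDefaults` and `try_build` are hypothetical stand-ins written for this thread, not the provider's code, and the assumption is that the spec object only gets attributes for keys that are present in the manifest.

```python
class SpecWithoutDefaults:
    """Hypothetical stand-in for the pattern in the traceback, not the provider's class."""

    def __init__(self, **entries):
        # Assumption: only keys present in the manifest become attributes.
        self.__dict__.update(entries)
        self.set_attribute()

    def set_attribute(self):
        # Reading self.env_vars raises AttributeError when the key was omitted.
        self.env_vars = list(self.env_vars) if self.env_vars else []


def try_build(kubernetes_block):
    """Return the AttributeError message raised while building the spec, or None."""
    try:
        SpecWithoutDefaults(**kubernetes_block)
    except AttributeError as exc:
        return str(exc)
    return None


# With the key present (even as an empty list) the spec builds fine.
error_with_key = try_build({"env_vars": []})
# With the key omitted, attribute access fails as in the traceback.
error_without_key = try_build({})

print(error_with_key)
print(error_without_key)
```

If the real class behaves like this sketch, any key that `set_attribute` reads without a fallback would fail the same way when it is left out of the manifest.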

What you think should happen instead

env_vars should fall back to a default value when it is omitted from the manifest.
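One way the expected behaviour could look is sketched below. This is only an illustration under assumptions: `SpecWithDefaults` and its `DEFAULTS` mapping are hypothetical names, and only `env_vars` is listed because it is the only key confirmed in this issue.

```python
class SpecWithDefaults:
    """Hypothetical sketch: merge manifest keys over class-level defaults."""

    # Assumption: only env_vars is shown; a real spec would list every optional key.
    DEFAULTS = {"env_vars": None}

    def __init__(self, **entries):
        # Keys from the manifest override the defaults; missing keys keep them.
        merged = {**self.DEFAULTS, **entries}
        self.__dict__.update(merged)
        self.set_attribute()

    def set_attribute(self):
        # The attribute always exists now, so this lookup cannot raise.
        self.env_vars = list(self.env_vars) if self.env_vars else []


# Omitting env_vars no longer fails; it simply becomes an empty list.
spec_missing = SpecWithDefaults()
# An explicit value is kept as given.
spec_given = SpecWithDefaults(env_vars=[{"name": "A", "value": "1"}])

print(spec_missing.env_vars)
print(spec_given.env_vars)
```

Merging over a defaults mapping keeps the existing `set_attribute` logic unchanged; using `getattr(self, "env_vars", None)` at each read would be another option with the same effect.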

How to reproduce

Remove the env_vars line from the manifest above and run the task.

Anything else

No response

Are you willing to submit PR?

Code of Conduct

boring-cyborg[bot] commented 2 months ago

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

gopidesupavan commented 1 month ago

If you don't need env_vars, is the kubernetes block itself required at all? Are you passing any other variables under kubernetes besides env_vars? If you aren't using any of them, you can simply remove the block and it should work.

Flametaa commented 1 week ago

+1 on this. I also think we should add documentation on how the template_spec should be structured. It's not really user-friendly right now. I am willing to contribute to this if needed.

amirshabanics commented 1 week ago

> If you dont need env_vars, then the kubernetes block its self is not required? are you passing any other variables under kubernetes except env_vars? if none of the variables your using you can simply remove the block it should work.

You can try it yourself. If I remove any of the kubernetes variables, an exception is raised.
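Until this is handled in the provider, a user-side workaround might be to fill in the missing keys before the manifest is passed to the operator. The sketch below is an assumption-heavy illustration: `fill_kubernetes_defaults` and `ASSUMED_DEFAULTS` are hypothetical names, and only `env_vars` is listed because it is the only key this issue confirms. Other keys would need to be added as the errors reveal them.

```python
import copy

# Assumption: only env_vars is confirmed by this issue; extend as needed.
ASSUMED_DEFAULTS = {"env_vars": []}


def fill_kubernetes_defaults(manifest, defaults=ASSUMED_DEFAULTS):
    """Return a copy of the manifest whose kubernetes block has every default key set."""
    filled = copy.deepcopy(manifest)
    # An empty YAML block loads as None, so treat that like a missing block.
    block = filled.get("kubernetes") or {}
    for key, value in defaults.items():
        # Keep values the user set; copy defaults so they are never shared.
        block.setdefault(key, copy.deepcopy(value))
    filled["kubernetes"] = block
    return filled


manifest = {"spark": {}, "kubernetes": {}}
patched = fill_kubernetes_defaults(manifest)

print(patched)
```

The original manifest is left untouched, so the same dict can be reused across tasks.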