Open dmndru opened 6 months ago
Since you're running within the cluster already, could you just use a regular KubernetesPodOperator?
I'm not super familiar with the GKE-specific operators, but I believe they go through the public-internet GKE APIs rather than just talking to the cluster-local API server, hence the additional IAM role requirement.
Yes, the GKEStartPodOperator goes through the public endpoint. Unfortunately, we can't use the KubernetesPodOperator since we also need to create pods in other clusters.
Fixed by adding the https://www.googleapis.com/auth/userinfo.email scope.
{
"conn_type": "google_cloud_platform",
"extra": {
"extra__google_cloud_platform__scope": "https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/userinfo.email"
}
}
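A minimal sketch of how that connection could be generated and exported, assuming the connection is supplied through an `AIRFLOW_CONN_<CONN_ID>` environment variable in JSON form. The connection id `google_cloud_default` and the helper name `build_gcp_connection` are assumptions for illustration; the scopes and the `extra__google_cloud_platform__scope` key are the ones from the comment above.

```python
import json
import os

# Scopes taken from the connection definition shown above.
SCOPES = [
    "https://www.googleapis.com/auth/cloud-platform",
    "https://www.googleapis.com/auth/userinfo.email",
]


def build_gcp_connection(scopes):
    """Return the connection definition as a JSON-serializable dict."""
    return {
        "conn_type": "google_cloud_platform",
        "extra": {
            # The scopes are passed as a single comma-separated string.
            "extra__google_cloud_platform__scope": ",".join(scopes),
        },
    }


conn = build_gcp_connection(SCOPES)

# Assumption: the operator uses the connection id "google_cloud_default".
# Adjust the variable name if your gcp_conn_id differs.
os.environ["AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT"] = json.dumps(conn)
print(os.environ["AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT"])
```

Keeping the scopes in a list makes it harder to lose the `userinfo.email` entry when the string is edited by hand.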
Apache Airflow Provider(s)
google
Versions of Apache Airflow Providers
apache-airflow-providers-cncf-kubernetes 8.0.0
apache-airflow-providers-google 10.15.0
Apache Airflow version
2.7.3
Operating System
Debian 11
Deployment
Official Apache Airflow Helm Chart
Deployment details
GKE cluster version 1.26.13
What happened
We use the GKEStartPodOperator to run a pod in our GKE clusters, and the task fails with the following error:
trace
[2024-02-27, 10:48:33 UTC] {pod_manager.py:329} ERROR - Exception when attempting to create Namespaced Pod: {
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {
    "annotations": {},
    "labels": {
      "tier": "staging",
      "dag_id": "batch",
      "task_id": "start",
      "run_id": "manual__2024-02-27T104509.7089670000-57e30e5c0",
      "kubernetes_pod_operator": "True",
      "try_number": "1",
      "airflow_version": "2.7.3",
      "airflow_kpo_in_cluster": "False"
    },
    "name": "batch-9ef11196",
    "namespace": "staging"
  },
  "spec": {
    "affinity": {},
    "containers": [
      {
        "args": [
          "--name", "batch",
          "--batch_pg_run_id", "manual__2024-02-27T10:45:09.708967+00:00",
          "--batch_pg", "input/",
          "--batch_pg_use_proxy", "True",
          "--batch_pg_dag_execution_date", "2024-02-27",
          "--batch_pg_always_use_selenium", "False",
          "--batch_pg_selenium_on_scrapy_error", "False",
          "--batch_pg_batch_size", "1000",
          "--batch_pg", "False"
        ],
        "command": [],
        "env": [],
        "envFrom": [],
        "image": "staging_latest",
        "imagePullPolicy": "Always",
        "name": "base",
        "ports": [],
        "terminationMessagePolicy": "File",
        "volumeMounts": []
      }
    ],
    "hostNetwork": false,
    "initContainers": [],
    "restartPolicy": "Never",
    "securityContext": {},
    "volumes": []
  }
}
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py", line 324, in run_pod_async
    resp = self._client.create_namespaced_pod(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api/core_v1_api.py", line 7356, in create_namespaced_pod
    return self.create_namespaced_pod_with_http_info(namespace, body, **kwargs)  # noqa: E501
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api/core_v1_api.py", line 7455, in create_namespaced_pod_with_http_info
    return self.api_client.call_api(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
    response_data = self.request(
                    ^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 391, in request
    return self.rest_client.POST(url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/rest.py", line 279, in POST
    return self.request("POST", url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/rest.py", line 238, in request
    raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': '677bd23e-3885-4057-84cb-cbcfd8bcb4d2', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '43677b2c-e54a-4a50-ae9f-79d579f5d98c', 'X-Kubernetes-Pf-Prioritylevel-Uid': '9ac4bb3e-9b09-4aca-912b-6b01d3f002b1', 'Date': 'Tue, 27 Feb 2024 10:48:33 GMT', 'Content-Length': '337'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods is forbidden: User \"7843619910037672257843\" cannot create resource \"pods\" in API group \"\" in the namespace \"staging\": requires one of [\"container.pods.create\"] permission(s).","reason":"Forbidden","details":{"kind":"pods"},"code":403}

[2024-02-27, 10:48:33 UTC] {pod.py:1109} ERROR - 'NoneType' object has no attribute 'metadata'
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 578, in execute_sync
    self.pod = self.get_or_create_pod(  # must set `self.pod` for `on_kill`
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 538, in get_or_create_pod
    self.pod_manager.create_pod(pod=pod_request_obj)
  File "/home/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 314, in iter
    return fut.result()
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py", line 354, in create_pod
    return self.run_pod_async(pod)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py", line 332, in run_pod_async
    raise e
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py", line 324, in run_pod_async
    resp = self._client.create_namespaced_pod(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api/core_v1_api.py", line 7356, in create_namespaced_pod
    return self.create_namespaced_pod_with_http_info(namespace, body, **kwargs)  # noqa: E501
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api/core_v1_api.py", line 7455, in create_namespaced_pod_with_http_info
    return self.api_client.call_api(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
    response_data = self.request(
                    ^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 391, in request
    return self.rest_client.POST(url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/rest.py", line 279, in POST
    return self.request("POST", url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/rest.py", line 238, in request
    raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': '677bd23e-3885-4057-84cb-cbcfd8bcb4d2', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '43677b2c-e54a-4a50-ae9f-79d579f5d98c', 'X-Kubernetes-Pf-Prioritylevel-Uid': '9ac4bb3e-9b09-4aca-912b-6b01d3f002b1', 'Date': 'Tue, 27 Feb 2024 10:48:33 GMT', 'Content-Length': '337'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods is forbidden: User \"7843619910037672257843\" cannot create resource \"pods\" in API group \"\" in the namespace \"staging\": requires one of [\"container.pods.create\"] permission(s).","reason":"Forbidden","details":{"kind":"pods"},"code":403}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 937, in patch_already_checked
    name=pod.metadata.name,
         ^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'metadata'

[2024-02-27, 10:48:33 UTC] {taskinstance.py:1937} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 578, in execute_sync
    self.pod = self.get_or_create_pod(  # must set `self.pod` for `on_kill`
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 538, in get_or_create_pod
    self.pod_manager.create_pod(pod=pod_request_obj)
  File "/home/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 314, in iter
    return fut.result()
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py", line 354, in create_pod
    return self.run_pod_async(pod)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py", line 332, in run_pod_async
    raise e
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py", line 324, in run_pod_async
    resp = self._client.create_namespaced_pod(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api/core_v1_api.py", line 7356, in create_namespaced_pod
    return self.create_namespaced_pod_with_http_info(namespace, body, **kwargs)  # noqa: E501
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api/core_v1_api.py", line 7455, in create_namespaced_pod_with_http_info
    return self.api_client.call_api(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
    response_data = self.request(
                    ^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 391, in request
    return self.rest_client.POST(url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/rest.py", line 279, in POST
    return self.request("POST", url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/kubernetes/client/rest.py", line 238, in request
    raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': '677bd23e-3885-4057-84cb-cbcfd8bcb4d2', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '43677b2c-e54a-4a50-ae9f-79d579f5d98c', 'X-Kubernetes-Pf-Prioritylevel-Uid': '9ac4bb3e-9b09-4aca-912b-6b01d3f002b1', 'Date': 'Tue, 27 Feb 2024 10:48:33 GMT', 'Content-Length': '337'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods is forbidden: User \"7843619910037672257843\" cannot create resource \"pods\" in API group \"\" in the namespace \"staging\": requires one of [\"container.pods.create\"] permission(s).","reason":"Forbidden","details":{"kind":"pods"},"code":403}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/google/cloud/operators/kubernetes_engine.py", line 548, in execute
    return super().execute(context)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 570, in execute
    return self.execute_sync(context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 629, in execute_sync
    self.cleanup(
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 839, in cleanup
    raise AirflowException(
airflow.exceptions.AirflowException: Pod batch-pagegrabber-crawler-9ef11196 returned a failure. remote_pod: None
[2024-02-27, 10:48:33 UTC] {taskinstance.py:1400} INFO - Marking task as FAILED. dag_id=batch, task_id=start, execution_date=20240227T104509, start_date=20240227T104831, end_date=20240227T104833
[2024-02-27, 10:48:33 UTC] {standard_task_runner.py:104} ERROR - Failed to execute job 19 for task start (Pod batch-9ef11196 returned a failure. remote_pod: None; 24)
What you think should happen instead
No response
How to reproduce
Anything else
The error can be fixed by granting the Kubernetes Engine Developer role to the service account, but that role applies cluster-wide, and we need to grant permissions in a single namespace only.
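One possible way to limit access to a single namespace is Kubernetes RBAC (a `Role` plus a `RoleBinding`) instead of a broad IAM role. The sketch below only generates the two manifests as JSON, which `kubectl apply -f` accepts. The service account email, the object name `airflow-pod-runner`, and the list of verbs are assumptions for illustration, not values from this issue; only the `staging` namespace comes from the error message. As far as I understand, GKE can match an RBAC `User` subject by email only when the token carries the `userinfo.email` scope, which would be consistent with the numeric user ID in the 403 message, but treat that as an assumption.

```python
import json

NAMESPACE = "staging"  # namespace named in the 403 error message
# Hypothetical service account email; replace with the real one.
SERVICE_ACCOUNT = "airflow@my-project.iam.gserviceaccount.com"
ROLE_NAME = "airflow-pod-runner"  # hypothetical object name


def build_namespaced_rbac(namespace, user):
    """Build a Role and RoleBinding limited to one namespace."""
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": ROLE_NAME, "namespace": namespace},
        "rules": [
            {
                # "" is the core API group, where pods live.
                "apiGroups": [""],
                "resources": ["pods"],
                # Assumed set of verbs; trim or extend as needed.
                "verbs": ["create", "get", "list", "watch", "patch", "delete"],
            },
            {"apiGroups": [""], "resources": ["pods/log"], "verbs": ["get"]},
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": ROLE_NAME, "namespace": namespace},
        "subjects": [
            {
                "kind": "User",
                "name": user,
                "apiGroup": "rbac.authorization.k8s.io",
            }
        ],
        "roleRef": {
            "kind": "Role",
            "name": ROLE_NAME,
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }
    return [role, binding]


manifests = build_namespaced_rbac(NAMESPACE, SERVICE_ACCOUNT)
# A v1 List lets both objects be applied from a single JSON file.
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

The operator may still need IAM permission to look up the cluster itself, so this covers only the in-cluster pod permissions.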