Closed sgomezf closed 1 year ago
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
PR with the change in functionality: https://github.com/apache/airflow/pull/29266. The upgrade was from 2.5.1 to 2.6.1.
Indeed, there is a problem in the provider documentation, which should have been updated in version 9.0.0.
For the main issue, I think there is no way in the current implementation to force the operator/hook to use the private endpoint: the cluster information is fetched via the GCP Python client using the project ID and the cluster name.
I'll open a PR to fix this.
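To illustrate the behavior described above, here is a minimal sketch. The real hook uses the `google.cloud.container_v1` client to look up the cluster; the dataclasses below are hypothetical stand-ins for the Cluster resource, showing how taking the public `endpoint` field unconditionally leaves no way to reach a private cluster:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins for the GKE Cluster proto returned by the
# real google.cloud.container_v1 client (simplified for illustration).
@dataclass
class PrivateClusterConfig:
    private_endpoint: str  # internal IP of the control plane

@dataclass
class Cluster:
    endpoint: str  # public control-plane IP
    private_cluster_config: Optional[PrivateClusterConfig] = None

def resolve_api_host(cluster: Cluster) -> str:
    """Mimics the current behavior: always build the Kubernetes API URL
    from the cluster's public endpoint, even for private clusters."""
    return f"https://{cluster.endpoint}"

cluster = Cluster(
    endpoint="203.0.113.10",
    private_cluster_config=PrivateClusterConfig(private_endpoint="10.0.0.2"),
)
print(resolve_api_host(cluster))  # https://203.0.113.10
```

With no public internet access from the workers, connections to that public IP time out, which matches the error in the report.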
@sgomezf could you test whether #31391 resolves your problem? You can install the provider from the PR branch, or simply copy the operator code with the changes I made into a new module and use the copied operator instead of the provider's.
I'm happy to confirm that I can now create pods with GKEStartPodOperator again. Thanks a lot for the quick response @hussein-awala. I did some runs and everything seems normal. The documentation can be confusing though; should I open a separate issue for it?
Best is to just fix the docs. It's super easy: click "suggest a change on this page" at the bottom right and a PR will be opened for you. You can update the documentation there without leaving the GitHub UI and submit it as a PR (and you will become an Airflow contributor that way as a free bonus) :)
Apache Airflow version
2.6.1
What happened
After upgrading to 2.6.1, GKEStartPodOperator stopped creating pods. According to the release notes, we created a specific GCP connection. But the connection defaults to the GKE public endpoint (masked as XX.XX.XX.XX in the error message) instead of the private IP, which is what we need since our cluster does not have public internet access.
[2023-05-17T07:02:33.834+0000] {connectionpool.py:812} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f0e47049ba0>, 'Connection to XX.XX.XX.XX timed out. (connect timeout=None)')': /api/v1/namespaces/airflow/pods?labelSelector=dag_id%3Dmytask%2Ckubernetes_pod_operator%3DTrue%2Crun_id%3Dscheduled__2023-05-16T0700000000-8fb0e9fa9%2Ctask_id%3Dmytask%2Calready_checked%21%3DTrue%2C%21airflow-sa
It seems that with this change "use_private_ip" has been deprecated. What would the workaround be to connect using the private endpoint?
Also, the documentation has not been updated to reflect this change in behaviour: https://airflow.apache.org/docs/apache-airflow-providers-google/stable/operators/cloud/kubernetes_engine.html#using-with-private-cluster
What you think should happen instead
There should still be an option to connect using the previous method's "--private-ip" behavior, so that API calls to Kubernetes reach the private endpoint of the GKE cluster.
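The requested behavior could look something like the sketch below. This is purely illustrative: the flag name `use_private_ip` and the dataclass stand-ins for the GKE Cluster resource are assumptions, not the provider's actual API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins for the GKE Cluster resource (illustration only).
@dataclass
class PrivateClusterConfig:
    private_endpoint: str  # internal IP of the control plane

@dataclass
class Cluster:
    endpoint: str  # public control-plane IP
    private_cluster_config: Optional[PrivateClusterConfig] = None

def resolve_api_host(cluster: Cluster, use_private_ip: bool = False) -> str:
    """Pick the control-plane endpoint; fall back to the public one
    when the cluster has no private configuration."""
    if use_private_ip and cluster.private_cluster_config:
        return f"https://{cluster.private_cluster_config.private_endpoint}"
    return f"https://{cluster.endpoint}"

cluster = Cluster(
    endpoint="203.0.113.10",
    private_cluster_config=PrivateClusterConfig(private_endpoint="10.0.0.2"),
)
print(resolve_api_host(cluster, use_private_ip=True))  # https://10.0.0.2
```

An explicit switch like this would let clusters without public internet access keep working, while preserving the public-endpoint default.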
How to reproduce
Operating System
cos_containerd
Versions of Apache Airflow Providers
apache-airflow-providers-cncf-kubernetes==5.2.2 apache-airflow-providers-google==8.11.0
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
Code of Conduct