Open ionescur2 opened 6 months ago
Last log: [2024-03-01 13:31:43,814] The xcom sidecar container is not yet started.
Technically, there is no difference between the code of the two versions, since itertools.count() is an infinite generator of incrementing numbers. IMHO we need to add a timeout for the waiting time, similar to what we do for the base container, and when we reach this timeout, we should fail the task instead of waiting forever.
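To make the timeout idea concrete, here is a minimal sketch of a polling loop that gives up after a deadline. It is not the Airflow implementation and not the code from the PR: `wait_until_running`, `SidecarStartTimeout`, and the `is_running` callback are hypothetical names standing in for the real container check.

```python
import itertools
import time
from typing import Callable


class SidecarStartTimeout(Exception):
    """Hypothetical error raised when the container never reports running."""


def wait_until_running(
    is_running: Callable[[], bool],
    timeout: float,
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll is_running() until it returns True or `timeout` seconds elapse.

    Returns the number of failed checks that happened before success.
    """
    deadline = clock() + timeout
    # itertools.count() never ends, so only the return or the raise exits the loop.
    for attempt in itertools.count():
        if is_running():
            return attempt
        if clock() >= deadline:
            # Fail instead of waiting forever.
            raise SidecarStartTimeout(
                f"container not running after {timeout} seconds"
            )
        sleep(poll_interval)
```

With a check that never succeeds, for example `wait_until_running(lambda: False, timeout=30)`, the call raises after roughly 30 seconds instead of blocking the task indefinitely.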
I created PR #40909 to fix this issue. Please check it!
Apache Airflow version
Other Airflow 2 version (please specify below)
If "Other Airflow 2 version" selected, which one?
2.7.2
What happened?
Error: The xcom sidecar container sometimes does not start, which leaves the DAG running for a long time while it waits for a container that will never start.
There is a small change in the following method: await_xcom_sidecar_container_start from /airflow/providers/cncf/kubernetes/utils/pod_manager.py.
Version 2.6.0 code:
Version 2.7.2 code:
We believe that the old version's code, with the while True: loop inside, is better suited because it checks whether the container has started.
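For comparison, the two loop shapes under discussion can be sketched side by side. This is a simplified illustration, not the Airflow source: `is_running` and `log` are hypothetical stand-ins for the container check and the logger, and the sleep between checks is left out for brevity.

```python
import itertools
from typing import Callable, List


def wait_with_while(is_running: Callable[[], bool], log: List[str]) -> None:
    """Loop shape using `while True` and a flag to warn only once."""
    warned = False
    while True:
        if is_running():
            log.append("started")
            break
        if not warned:
            log.append("not yet started")
            warned = True


def wait_with_count(is_running: Callable[[], bool], log: List[str]) -> None:
    """Loop shape using itertools.count(); attempt 0 triggers the single warning."""
    for attempt in itertools.count():
        if is_running():
            log.append("started")
            break
        if not attempt:
            log.append("not yet started")
```

Both functions check the container on every iteration and both loop forever if `is_running` never returns True, so in this sketch neither shape protects against a sidecar that never starts.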
What you think should happen instead?
We believe that this code will wait for the container to start.
How to reproduce
Sometimes the XCom container is not started. To reproduce, stop the container and observe how the DAG reacts.
Operating System
Linux
Versions of Apache Airflow Providers
No response
Deployment
Amazon (AWS) MWAA
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
Code of Conduct