Open hterik opened 9 months ago
This is a good suggestion - we have the log_events_on_failure arg on the KubernetesPodOperator, which can be really useful in diagnosing pod startup issues like missing images or volumes - having similar behaviour in the KubeExecutor would be rad.
Feel free to assign this to me and I can put a fix together next week 👍
@SamWheating will you have time to work on it?
Ah sorry, I never got assigned and it fell off my radar.
I don't think I'll have time for this in the near future so feel free to assign to someone else.
I will pick it up. Can you assign it to me?
Description
We occasionally see KubernetesExecutor tasks getting lost in cyberspace, with no logs in the Airflow UI describing why.
If admins look into the scheduler logs (Airflow 2.7.1), the following can be seen:
It would be a lot easier to debug such issues if:
A) The scheduler logs mentioned the Pod failure details, Reason=Evicted and status=Failed. These can be found on the V1Pod object returned by the Kubernetes API.
B) The Airflow UI surfaced this error, instead of not showing anything at all.
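To make point A concrete, here is a rough sketch of the kind of summary line the scheduler could log. It is not how the KubernetesExecutor is implemented today. The helper name `describe_pod_failure` and the sample pod values are hypothetical; the only fields read are `metadata.name`, `status.phase`, `status.reason` and `status.message`, which exist on the Kubernetes Python client's `V1Pod`. A `SimpleNamespace` stands in for the real object so the snippet runs without a cluster.

```python
from types import SimpleNamespace
from typing import Any, Optional


def describe_pod_failure(pod: Any) -> Optional[str]:
    """Return a one-line summary for a failed pod, or None if it has not failed.

    `pod` is assumed to look like kubernetes.client.V1Pod; only
    metadata.name, status.phase, status.reason and status.message are read.
    """
    status = getattr(pod, "status", None)
    if status is None or getattr(status, "phase", None) != "Failed":
        return None
    name = getattr(getattr(pod, "metadata", None), "name", "<unknown>")
    # reason and message are optional on the pod status, so fall back to placeholders.
    reason = getattr(status, "reason", None) or "<no reason>"
    message = getattr(status, "message", None) or "<no message>"
    return f"Pod {name} status=Failed reason={reason} message={message}"


# Hypothetical stand-in for a V1Pod returned by the Kubernetes API.
evicted_pod = SimpleNamespace(
    metadata=SimpleNamespace(name="example-task-pod"),
    status=SimpleNamespace(
        phase="Failed",
        reason="Evicted",
        message="The node was low on resource: memory.",
    ),
)

summary = describe_pod_failure(evicted_pod)
print(summary)
```

The same string could also be attached to the task instance so the UI has something to show for point B, though where exactly that would hook in is left open here.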
Use case/motivation
No response
Related issues
No response
Are you willing to submit a PR?
Code of Conduct