Closed: raphaelauv closed this issue 2 years ago
Sounds like an interesting edge case and rather simple to debug and track - would you like maybe to make a PR with a fix for it, @raphaelauv?
@raphaelauv could you provide an image example?
@ulisesojeda
minimal reproducible example:

```python
# Assumes a `dag` object created earlier in the file.
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator

with dag:
    # Allocate 100M inside a pod that is only allowed 10M of ephemeral storage.
    fail_cmd = "fallocate -l 100M toto.img && sleep 60 && echo finish"
    k = KubernetesPodOperator(
        task_id="task-one",
        namespace="default",
        name="airflow-test-pod",
        image="alpine",
        resources={"limit_ephemeral_storage": "10M"},
        cmds=["sh", "-c", fail_cmd],
        is_delete_operator_pod=True,
        get_logs=True,
        in_cluster=False,
    )
```
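In this repro the fallocate call writes a 100M file while the pod is capped at 10M of ephemeral storage, so the kubelet evicts the pod partway through the sleep; that eviction is what the operator fails to handle gracefully.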
Thanks @raphaelauv. I've opened this PR https://github.com/apache/airflow/pull/19713
@potiuk could you check it please?
Apache Airflow Provider(s)
cncf-kubernetes
Versions of Apache Airflow Providers
apache-airflow-providers-cncf-kubernetes==2.0.2
Apache Airflow version
2.1.2
Operating System
GCP Container-Optimized OS
Deployment
Composer
Deployment details
No response
What happened
In the GKE logs I can see why: the pod failed because I set too low a limit for local storage. The Airflow operator should not raise an exception in this case but fail normally.
What you expected to happen
The KubernetesPodOperator should handle this kind of error and fail the task cleanly, along the lines of the sketch below.
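A minimal sketch of the kind of handling being asked for, assuming the operator can read the final pod object. This is illustrative only, not the actual change made in https://github.com/apache/airflow/pull/19713; `check_pod_outcome` and `remote_pod` are hypothetical names:

```python
from airflow.exceptions import AirflowException

def check_pod_outcome(remote_pod):
    """Turn a failed/evicted pod into a clean task failure instead of a crash."""
    if remote_pod is None or remote_pod.status is None:
        # The pod may already be gone (e.g. with is_delete_operator_pod=True).
        raise AirflowException("Pod status could not be read; treating task as failed")
    phase = remote_pod.status.phase
    reason = remote_pod.status.reason  # kubelet sets "Evicted" for storage-limit violations
    if phase != "Succeeded":
        raise AirflowException(
            f"Pod {remote_pod.metadata.name} failed: phase={phase}, reason={reason}"
        )
```

Raising AirflowException from inside the operator marks the task instance as failed, which is the "fail normally" behaviour this report asks for, whereas an uncaught attribute or type error surfaces as an operator crash.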
How to reproduce
Launch a KubernetesPodOperator with a very small local disk limit and run a container that uses more than this limit (the repro DAG above does exactly this); a snippet for confirming the eviction follows.
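To confirm the eviction while reproducing, something like the following can read the pod status via the Kubernetes Python client. This is a sketch assuming local kubeconfig access (matching in_cluster=False above); `pod_failure_reason` is a made-up helper name:

```python
from kubernetes import client, config

def pod_failure_reason(name="airflow-test-pod", namespace="default"):
    """Read phase/reason of the repro pod; an eviction shows up as ("Failed", "Evicted")."""
    config.load_kube_config()  # matches in_cluster=False in the repro DAG
    pod = client.CoreV1Api().read_namespaced_pod(name=name, namespace=namespace)
    return pod.status.phase, pod.status.reason
```

Note that with is_delete_operator_pod=True the pod is removed once the task finishes, so run this while the task is still active, or temporarily set is_delete_operator_pod=False while debugging.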
Anything else
No response
Are you willing to submit PR?
Code of Conduct