apache-spark-on-k8s / spark

Apache Spark enhanced with native Kubernetes scheduler back-end: NOTE this repository is being ARCHIVED as all new development for the kubernetes scheduler back-end is now on https://github.com/apache/spark/
https://spark.apache.org/
Apache License 2.0

handle failed executor event #602

Open ChenLingPeng opened 6 years ago

ChenLingPeng commented 6 years ago

Signed-off-by: forrestchen <forrestchen@tencent.com>

see #600

(Please fill in changes proposed in this fix)

How was this patch tested?

Manual test.

liyinan926 commented 6 years ago

BTW: this fix should also be upstreamed. Can you file a PR against upstream apache/master?

ChenLingPeng commented 6 years ago

> A general question: how does Yarn handle this case, i.e., of executors that fail to register?

I'm not very familiar with Spark on YARN, but I think that if allocateResponse.getCompletedContainersStatuses also returns this kind of executor, then YARN mode can handle this scenario the same way it handles an executor that registered and then failed.
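To make that idea concrete, here is a minimal, language-neutral sketch in Python. It is not Spark or YARN code: `CompletedContainer`, `ExecutorTracker`, and all of their fields and methods are hypothetical names made up for illustration. It only models the assumption that the cluster manager reports every completed container, so an executor that exited before registering can go through the same failure path as one that registered first.

```python
from dataclasses import dataclass, field


@dataclass
class CompletedContainer:
    """Hypothetical stand-in for one completed-container report."""
    executor_id: str
    exit_code: int


@dataclass
class ExecutorTracker:
    """Hypothetical driver-side bookkeeping; not a Spark or YARN class."""
    requested: set = field(default_factory=set)    # executors that were asked for
    registered: set = field(default_factory=set)   # executors that registered with the driver
    failures: list = field(default_factory=list)   # (executor_id, reason) pairs

    def on_completed(self, completed):
        # Every completed container goes through the same path, whether or
        # not its executor ever registered with the driver.
        for c in completed:
            if c.executor_id not in self.requested:
                continue  # unknown executor, or already handled
            was_registered = c.executor_id in self.registered
            self.requested.discard(c.executor_id)
            self.registered.discard(c.executor_id)
            if c.exit_code != 0:
                reason = ("failed after registering" if was_registered
                          else "failed before registering")
                self.failures.append((c.executor_id, reason))

    def missing(self, target):
        # Number of replacement executors needed to get back to `target`.
        return max(0, target - len(self.requested))
```

For example, with executors 1 to 3 requested and only 1 and 2 registered, reporting 2 and 3 as completed with a non-zero exit code records both as failures and leaves two executors to be re-requested. The point of the sketch is that no separate code path is needed for the unregistered case as long as the completion report includes it.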

> this fix should also be upstreamed

Will do once this PR is merged.