sql-machine-learning / elasticdl

Kubernetes-native Deep Learning Framework
https://elasticdl.org
MIT License
733 stars 113 forks

Add the PodStateFlow from PENDING to SUCCEEDED and FAILED. #2466

Closed brightcoder01 closed 3 years ago

brightcoder01 commented 3 years ago

If the program in a worker pod completes or fails very quickly, the k8s event stream contains a pending event followed directly by a succeeded or failed event, and the running event is missing. To handle this scenario, we add pod state transitions from PENDING to SUCCEEDED and from PENDING to FAILED.
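A minimal sketch of the idea, not the actual ElasticDL code: the state machine can be pictured as a table of allowed transitions keyed by the current status, the watch event type, and the pod phase. The names `PodStateFlow`, `FLOWS`, and `next_status`, and the field layout, are assumptions made for illustration. Only the event types (`ADDED`, `MODIFIED`) and pod phases (`Pending`, `Running`, `Succeeded`, `Failed`) are standard Kubernetes values.

```python
from collections import namedtuple

# Hypothetical transition record; the field names are illustrative only.
PodStateFlow = namedtuple(
    "PodStateFlow", ["from_status", "to_status", "event_type", "phase"]
)

FLOWS = [
    PodStateFlow("INITIAL", "PENDING", "ADDED", "Pending"),
    PodStateFlow("PENDING", "RUNNING", "MODIFIED", "Running"),
    PodStateFlow("RUNNING", "SUCCEEDED", "MODIFIED", "Succeeded"),
    PodStateFlow("RUNNING", "FAILED", "MODIFIED", "Failed"),
    # Transitions described in this issue: the pod finishes so quickly
    # that no event with phase "Running" is ever observed.
    PodStateFlow("PENDING", "SUCCEEDED", "MODIFIED", "Succeeded"),
    PodStateFlow("PENDING", "FAILED", "MODIFIED", "Failed"),
]


def next_status(current_status, event_type, phase):
    """Return the status to move to, or None if no transition matches."""
    for flow in FLOWS:
        if (flow.from_status, flow.event_type, flow.phase) == (
            current_status,
            event_type,
            phase,
        ):
            return flow.to_status
    return None
```

Without the last two entries, `next_status("PENDING", "MODIFIED", "Succeeded")` would return `None` in this sketch, so a fast-finishing worker would stay in PENDING from the state machine's point of view.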