sql-machine-learning / elasticdl

Kubernetes-native Deep Learning Framework
https://elasticdl.org
MIT License

Compare the process of rank 0 selection between Horovod elastic and elasticdl #2401

Closed. brightcoder01 closed this issue 3 years ago

workingloong commented 3 years ago

Can you provide more details?

brightcoder01 commented 3 years ago

The Horovod documentation says: "During rendezvous, older workers will take priority in being assigned worker-0 status to ensure that the state that is broadcast is up to date." Horovod elastic and ElasticDL use the same strategy.
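
To illustrate the rule quoted above, here is a minimal sketch of rank assignment that gives priority to older workers. It is not code from Horovod or ElasticDL. The `Worker` record, its `join_order` field, and the `assign_ranks` function are hypothetical names introduced only for this example, and it assumes each worker's age can be reduced to a single integer.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Worker:
    # Hypothetical record: neither field name comes from Horovod or ElasticDL.
    worker_id: str
    join_order: int  # smaller value means the worker joined the job earlier


def assign_ranks(workers: List[Worker]) -> Dict[str, int]:
    """Assign ranks so that the oldest surviving worker becomes rank 0."""
    # Sort by age first; worker_id only breaks ties deterministically.
    ordered = sorted(workers, key=lambda w: (w.join_order, w.worker_id))
    return {w.worker_id: rank for rank, w in enumerate(ordered)}


# The original rank 0 has failed; two survivors remain and one new worker joins.
survivors = [Worker("worker-b", 1), Worker("worker-c", 2)]
newcomers = [Worker("worker-d", 5)]
ranks = assign_ranks(newcomers + survivors)
```

In this sketch `worker-b`, the oldest survivor, receives rank 0, so the state broadcast from rank 0 comes from a worker that already holds up-to-date training state rather than from the newly joined `worker-d`.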

brightcoder01 commented 3 years ago

We have reached a conclusion, so I am closing this issue.