Better design pattern for data_weight synchronization

Bluefog-Lib / bluefog

Distributed and decentralized training framework for PyTorch over graph

https://bluefog-lib.github.io/bluefog/

Apache License 2.0

Open hanbinhu opened 3 years ago

hanbinhu commented 3 years ago

The ready event in neighbor_allreduce dst_weight ensures that the data_weight computation finishes before communication starts, because the PyTorch CUDA stream is not synchronized with our CUDA stream.
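To make the ordering problem concrete, here is a minimal pure-Python sketch of the record-then-wait handshake that a ready event provides between two independent work queues. It is only a model and not Bluefog's implementation: the `Stream` class, `weighted_send`, and the thread-based "streams" are all hypothetical stand-ins. In real CUDA code the corresponding primitives would be `cudaEventRecord` on the producing stream and `cudaStreamWaitEvent` on the consuming stream.

```python
import queue
import threading
import time


class Stream:
    """Toy stand-in for a CUDA stream: runs submitted tasks in order on its own thread."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            task = self._tasks.get()
            if task is None:  # sentinel: no more work
                break
            task()

    def submit(self, fn):
        self._tasks.put(fn)

    def record_event(self, event):
        # The event fires only after every task queued before it has finished.
        self._tasks.put(event.set)

    def wait_event(self, event):
        # Tasks queued after this point do not start until the event has fired.
        self._tasks.put(event.wait)

    def close(self):
        self._tasks.put(None)
        self._thread.join()


def weighted_send(data, dst_weight):
    """Scale data on one stream, then 'send' it on another, ordered by a ready event."""
    compute = Stream()  # plays the role of the PyTorch stream (assumption)
    comm = Stream()     # plays the role of the communication stream (assumption)
    buffer = {}
    sent = []
    ready = threading.Event()

    def scale():
        time.sleep(0.05)  # pretend the weighting kernel is slow
        buffer["value"] = [x * dst_weight for x in data]

    def send():
        # Without the wait below, this could run before scale() and read nothing.
        sent.append(buffer.get("value"))

    compute.submit(scale)
    compute.record_event(ready)  # mark "weighting done" on the compute stream
    comm.wait_event(ready)       # communication stream blocks until that mark
    comm.submit(send)

    comm.close()
    compute.close()
    return sent[0]
```

If the `comm.wait_event(ready)` line is dropped, `send` may observe an empty buffer, which mirrors the hazard described above when the two streams are not synchronized.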