Open offthewall123 opened 3 years ago
These are the "global barriers":
TF: optimizer.apply_gradients()
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/training/optimizer.py#L539
PyTorch: optimizer.step(), e.g., SGD's https://github.com/pytorch/pytorch/blob/master/torch/optim/sgd.py#L77
"Global barrier" is a conceptual name. It just means that the framework synchronizes all communication before moving on to the forward propagation of the next iteration.
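To make the concept concrete, here is a toy simulation in plain Python. It is only a sketch of the idea, not TensorFlow or PyTorch code: `fake_allreduce`, `train`, the sleep times, and the constant gradient are all hypothetical stand-ins. Each layer's gradient communication runs asynchronously, and the update step (playing the role of `optimizer.step()` / `optimizer.apply_gradients()`) blocks until every communication op has finished, so the next forward pass cannot start earlier.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def fake_allreduce(layer, grad, log):
    """Hypothetical stand-in for one gradient communication op."""
    time.sleep(0.01 * (layer + 1))  # pretend communication takes some time
    log.append(("comm_done", layer))
    return grad  # single simulated worker: the "reduced" gradient is unchanged


def train(num_iters=2, num_layers=3):
    log = []
    weights = [1.0] * num_layers
    with ThreadPoolExecutor(max_workers=num_layers) as pool:
        for it in range(num_iters):
            log.append(("forward", it))

            # Backward pass: gradients become ready from the last layer to
            # the first, and each one starts its communication right away.
            futures = {}
            for layer in reversed(range(num_layers)):
                grad = 0.1  # made-up constant gradient
                futures[layer] = pool.submit(fake_allreduce, layer, grad, log)

            # The "global barrier": wait for ALL communication to finish.
            wait(list(futures.values()))
            log.append(("barrier_passed", it))

            # Update step, the role played by optimizer.step() or
            # optimizer.apply_gradients() in the real frameworks.
            for layer, fut in futures.items():
                weights[layer] -= fut.result()
    return weights, log
```

Calling `train()` returns the updated weights and an event log. In that log, every `comm_done` entry of an iteration appears before its `barrier_passed` entry, and the next `forward` entry only appears after it.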
The paper https://i.cs.hku.hk/~cwu/papers/yhpeng-sosp19.pdf mentions the concept of a global barrier in TensorFlow between successive iterations: the global barrier waits for all communication operations to finish before moving on to the next iteration.
But I have not found any discussion of a global barrier in TensorFlow or PyTorch. I want to make sure: is there really a global barrier in TensorFlow? And is there a code reference for it?