facebookresearch / moco

PyTorch implementation of MoCo: https://arxiv.org/abs/1911.05722
MIT License

About shuffleBN with PyTorch DDP #37

Closed: boringwar closed this issue 4 years ago

boringwar commented 4 years ago

I am new to PyTorch DistributedDataParallel (DDP), and I am not clear about the shuffleBN process.

In the code, you first call _concat_allgather() and then broadcast the random indices to every device from src=0.

Here is my question: is only device 0 broadcasting? Do the other devices also run __batch_shuffleddp()?

boringwar commented 4 years ago

After some searching, I think I understand this point now.
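
For anyone who lands here with the same question, here is a rough sketch of how I understand the collective calls to behave. Under DDP every process runs the same code, so every device enters the shuffle function and every device calls the broadcast; src=0 only decides whose data is kept. Each rank can draw its own random permutation, and after the broadcast all ranks hold rank 0's permutation, so they agree on the shuffle. The code below is a single-process simulation in plain Python, not the repository's code: `simulated_broadcast` and `simulated_batch_shuffle` are hypothetical names made up for illustration, and lists stand in for tensors.

```python
import random


def simulated_broadcast(per_rank_values, src=0):
    """Stand-in for a broadcast collective (hypothetical helper).

    Every rank takes part in the call, and afterwards every rank
    holds a copy of the value that rank `src` had.
    """
    return [list(per_rank_values[src]) for _ in per_rank_values]


def simulated_batch_shuffle(per_rank_batches, seed=0):
    """Simulate a DDP batch shuffle on one process (illustrative only).

    Assumes every rank holds a batch of the same size.
    """
    world_size = len(per_rank_batches)

    # All-gather step: every rank ends up seeing the full global batch.
    gathered = [x for batch in per_rank_batches for x in batch]
    total = len(gathered)

    # Every rank draws its OWN permutation; these generally differ.
    local_perms = []
    for rank in range(world_size):
        rng = random.Random(seed + rank)
        perm = list(range(total))
        rng.shuffle(perm)
        local_perms.append(perm)

    # Broadcast from src=0: all ranks call it, rank 0's permutation wins.
    shared_perms = simulated_broadcast(local_perms, src=0)

    # Each rank keeps only its own slice of the shared permutation.
    per_rank = total // world_size
    shuffled = []
    for rank in range(world_size):
        idx = shared_perms[rank][rank * per_rank:(rank + 1) * per_rank]
        shuffled.append([gathered[i] for i in idx])
    return shuffled, shared_perms


if __name__ == "__main__":
    batches = [[0, 1], [2, 3], [4, 5], [6, 7]]  # 4 simulated devices
    shuffled, perms = simulated_batch_shuffle(batches)
    print("permutation held by every rank:", perms[0])
    print("samples per rank after shuffle:", shuffled)
```

So the answer to both questions seems to be yes: only rank 0's indices are sent, but the other devices still run the shuffle function and take part in the broadcast as receivers.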