blackfeather-wang / ISDA-for-Deep-Networks

An efficient implicit semantic augmentation method, complementary to existing non-semantic techniques.

Are means and covariances shared by all GPUs when using DDP? #15

Closed kuz44ma69 closed 3 years ago

kuz44ma69 commented 3 years ago

It looks like the means and covariances are calculated independently on each GPU in `update_CV()` (https://github.com/blackfeather-wang/ISDA-for-Deep-Networks/blob/318c30976d0c412a7dd10250b0164beac6d4fbeb/Image%20classification%20on%20ImageNet/ISDA_imagenet.py#L13), and I don't see any code that aggregates these statistics. How are these parameters shared across all GPUs when using DDP?

blackfeather-wang commented 3 years ago

Thank you for your attention.

Indeed, as you noted, these statistics are computed on each GPU separately, similar to BN, and are not shared across GPUs. In practice, this does not cause problems given a reasonably large per-GPU batch size.
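
For anyone who does want the statistics synchronized (e.g. with small per-GPU batches), a minimal sketch of one possible approach is below. It is not part of the ISDA repository; `sync_statistics` is a hypothetical helper that averages the per-GPU mean and covariance buffers (as maintained by `update_CV()`) via `all_reduce`, and falls back to a no-op when `torch.distributed` is not initialized. Note the simple average ignores differing per-class sample counts on each worker; a more careful version would also all-reduce the counts and compute a weighted average.

```python
import torch
import torch.distributed as dist


def sync_statistics(mean, covariance):
    """Average per-GPU class means and covariances across DDP workers.

    Hypothetical helper, not part of the ISDA codebase. `mean` and
    `covariance` stand in for the buffers updated in update_CV().
    Unweighted averaging is an assumption; it ignores per-class
    sample-count differences between workers.
    """
    if dist.is_available() and dist.is_initialized():
        world_size = dist.get_world_size()
        # Sum the local statistics from every worker, then average.
        dist.all_reduce(mean, op=dist.ReduceOp.SUM)
        dist.all_reduce(covariance, op=dist.ReduceOp.SUM)
        mean /= world_size
        covariance /= world_size
    # Single-process (or uninitialized) case: return buffers unchanged.
    return mean, covariance
```

Calling this once per epoch (rather than every iteration) would keep the communication overhead negligible.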

kuz44ma69 commented 3 years ago

Thank you for your reply. I understand.