Closed shaileshj2803 closed 2 years ago
What is the effective batch size on which the contrastive loss is computed when training on multiple GPUs?
If the representation vectors are gathered on each GPU from all GPUs, the effective batch size is equal to the total number of examples across all GPUs.
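To make the difference concrete, here is a minimal sketch in plain Python that simulates the two cases. It is not this repository's code: `info_nce`, `make_shard`, and `simulate` are hypothetical helpers, the "GPUs" are just lists, and the pooling step only stands in for what a collective such as `torch.distributed.all_gather` would do in a real multi-GPU run.

```python
import math
import random


def info_nce(queries, keys, temperature=0.1):
    """Mean InfoNCE loss: queries[i] should match keys[i]; other keys act as negatives."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
        top = max(logits)  # subtract the max for numerical stability
        log_z = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_z - logits[i]
    return total / len(queries)


def make_shard(rng, per_gpu_batch, dim):
    """Random (query, key) pairs standing in for one GPU's local batch (hypothetical data)."""
    queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(per_gpu_batch)]
    keys = [[x + 0.1 * rng.gauss(0, 1) for x in q] for q in queries]
    return queries, keys


def simulate(world_size, per_gpu_batch, gather, dim=4, seed=0):
    """Return (candidates_per_example, mean_loss) for the simulated setup."""
    rng = random.Random(seed)
    shards = [make_shard(rng, per_gpu_batch, dim) for _ in range(world_size)]
    if gather:
        # Stand-in for an all-gather: pool every GPU's vectors before the loss.
        all_q = [q for qs, _ in shards for q in qs]
        all_k = [k for _, ks in shards for k in ks]
        return len(all_k), info_nce(all_q, all_k)
    # No gather: each GPU only contrasts against its own local batch.
    losses = [info_nce(qs, ks) for qs, ks in shards]
    return per_gpu_batch, sum(losses) / len(losses)


gathered_size, gathered_loss = simulate(world_size=4, per_gpu_batch=8, gather=True)
local_size, local_loss = simulate(world_size=4, per_gpu_batch=8, gather=False)
print(gathered_size, local_size)  # 32 8
```

With 4 simulated GPUs and 8 examples each, every example is contrasted against 32 candidates when the vectors are pooled, and against only 8 when each GPU computes the loss on its own batch.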
Thanks a lot