Hi. I found that the MoCo model in your implementation does not call batch_shuffle_ddp() and batch_unshuffle_ddp() before and after normalizing. Could you explain the reason? Do these two functions affect performance, or do they have some other impact?