facebookresearch / moco

PyTorch implementation of MoCo: https://arxiv.org/abs/1911.05722
MIT License

one question about the implementation #87

Open shuuchen opened 3 years ago

shuuchen commented 3 years ago

Hi,

Thanks for the code.

q and k are used for positive pairs, while q and the queue are used for negative pairs. What if earlier samples of q are still contained in the queue?

Since a mini-batch is sampled randomly, an earlier version of q may already be in the queue. Because they are the same instance, they should form a positive pair. However, they are treated as negative pairs in the implementation.

howard-mahe commented 3 years ago

Hello. The mini-batch is not sampled fully at random: it is drawn without replacement from the dataset, which is shuffled at every epoch. When you call enumerate(dataloader), a DataLoader iterator is created, and it stops once it has used every sample of the dataset exactly once. As long as the queue size is smaller than dataset size - batch_size, you cannot encounter the same samples in the batch and in the queue.
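
To make the argument concrete, here is a minimal sketch in plain Python rather than the repository's code. It stands in for a shuffled DataLoader with random.shuffle, tracks sample indices instead of key embeddings, and starts from an empty queue. All of these are simplifying assumptions, and the function name count_overlaps_in_epoch is hypothetical. It counts how often an index in the current batch is already in the queue during one pass over the dataset.

```python
import random
from collections import deque


def count_overlaps_in_epoch(dataset_size, batch_size, queue_size, seed=0):
    """Count batch/queue index collisions during one shuffled epoch.

    Hypothetical helper: it models only the sampling order, not MoCo itself.
    """
    rng = random.Random(seed)

    # One epoch = one shuffled pass over all indices, without replacement
    # (a stand-in for iterating a DataLoader that shuffles each epoch).
    indices = list(range(dataset_size))
    rng.shuffle(indices)

    # FIFO queue of sample indices; the oldest entries are dropped once
    # queue_size is reached. Assumption: the queue starts empty.
    queue = deque(maxlen=queue_size)

    overlaps = 0
    for start in range(0, dataset_size, batch_size):
        batch = indices[start:start + batch_size]
        queued = set(queue)
        # A collision means an instance in the batch would also be
        # scored as a negative through the queue.
        overlaps += sum(1 for i in batch if i in queued)
        # Enqueue the current batch only after it has been compared.
        queue.extend(batch)
    return overlaps


if __name__ == "__main__":
    print(count_overlaps_in_epoch(dataset_size=1000, batch_size=32, queue_size=256))
```

Within a single pass every index appears exactly once, so the count is zero. The sketch covers one epoch only. It does not model the boundary between epochs, where the queue still holds keys from the final batches of the previous epoch while the order is reshuffled.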