Hi, thanks for your excellent work.
I have a question: can the dictionary (memory bank) size exceed the dataset size? In the original paper the training dataset is ImageNet, so the dictionary can hold 65536 keys. But suppose we have only 10000 images. Since the queue is updated one batch at a time, after the last batch of an epoch is enqueued the queue holds only 10000 keys, far fewer than 65536. In the next epoch the first batch is enqueued again, yet the copy of that same batch from the previous epoch has not been dequeued. The two copies should be treated as positives of each other, but MoCo treats the stale copy as a negative, which seems incorrect.
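To make the concern concrete, here is a minimal sketch (not MoCo's actual code) that simulates the FIFO key queue with integer sample indices and counts how often a query's own stale key is still sitting in the queue. The function name and parameters are hypothetical; it assumes a fixed enumeration order with no shuffling, just to illustrate the mechanism:

```python
from collections import deque

def count_stale_self_keys(dataset_size, queue_size, batch_size, epochs=2):
    """Simulate MoCo's FIFO key queue with sample indices and count
    how often a query's own earlier key is still in the queue
    (i.e. a key of the same image that would be used as a negative)."""
    queue = deque(maxlen=queue_size)  # oldest keys are evicted when full
    collisions = 0
    for _ in range(epochs):
        for start in range(0, dataset_size, batch_size):
            batch = range(start, min(start + batch_size, dataset_size))
            # a query collides with its own stale key if it is already queued
            collisions += sum(1 for idx in batch if idx in queue)
            queue.extend(batch)
    return collisions

# Queue larger than the dataset: from epoch 2 on, every sample's stale
# key is still in the queue, so all 10000 images collide.
print(count_stale_self_keys(10000, 65536, 256))  # -> 10000

# Queue smaller than the dataset (the intended regime): a sample's old
# key is always evicted before the sample comes around again.
print(count_stale_self_keys(10000, 4096, 256))   # -> 0
```

Under these assumptions, the problem only appears when `queue_size >= dataset_size`; with the usual setting (queue much smaller than the dataset) each key is dequeued before its image is sampled again, so no image ever serves as its own negative.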