Hi, thank you very much for sharing the code. I am doing some research using the MoCo model, and I am using this code in my experiments.
I would like to ask: is it possible to use a smaller mini-batch size (e.g. 32, rather than the 256 used in the MoCo paper)?
I really like the fact that MoCo decouples the dictionary size from the mini-batch size. Does this mean one could use a smaller batch size (while still keeping the dictionary size large) without a substantial drop in performance? To make my question concrete, here is roughly how I picture the queue update (a simplified sketch, not the code from this repo; the sizes K and N and the helper name are just illustrative):
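```python
# Simplified sketch of a MoCo-style queue update, assuming the dictionary is a
# fixed-size FIFO of encoded keys: each step enqueues only N new keys, so the
# dictionary size K can stay large (e.g. 65536) even if N is small (e.g. 32).
import torch
import torch.nn.functional as F

K, dim = 65536, 128          # dictionary (queue) size and feature dimension
N = 32                       # the smaller mini-batch size I would like to use

queue = F.normalize(torch.randn(dim, K), dim=0)  # the large dictionary
ptr = 0                                          # current write position

def dequeue_and_enqueue(keys):
    """Replace the oldest N keys in the queue with the new batch of keys."""
    global ptr
    n = keys.shape[0]
    queue[:, ptr:ptr + n] = keys.T   # assumes K is divisible by n
    ptr = (ptr + n) % K

new_keys = F.normalize(torch.randn(N, dim), dim=1)  # keys from one small batch
dequeue_and_enqueue(new_keys)                       # dictionary size is still K
```

If this picture is right, the negatives per step are still the K entries in the queue, and only the number of queries (and new keys) per step shrinks with the batch size.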