InugYoon opened 1 year ago
Hi @InugYoon,
I assume you mean the implementation with the momentum encoder trick. If so, I adopted most hyper-parameters from MoCo-v2. IIRC, the differences are: (1) I used a queue size of 8192; (2) the temperature is 0.07; (3) the batch size is 1024.
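For anyone landing here later: the "momentum encoder trick" refers to MoCo's exponential-moving-average (EMA) update of a key encoder, plus a queue of negatives. Below is a minimal sketch in plain Python of the EMA update together with the hyper-parameters listed above; the dict keys and the momentum value 0.999 (MoCo-v2's default) are illustrative, not from this repo.

```python
# Hyper-parameters mentioned in the comment above; the "momentum" entry is
# the MoCo-v2 default and is an assumption, not confirmed for this repo.
MOCO_V2_STYLE = {
    "queue_size": 8192,   # (1) queue of negative keys
    "temperature": 0.07,  # (2) softmax temperature in the contrastive loss
    "batch_size": 1024,   # (3) per-step batch size
    "momentum": 0.999,    # EMA coefficient for the key encoder (assumed)
}

def momentum_update(query_params, key_params, m=0.999):
    """MoCo-style EMA update of the key encoder's parameters:
    key <- m * key + (1 - m) * query, applied element-wise."""
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]
```

In actual PyTorch code this update is applied to the key encoder's parameters under `torch.no_grad()` each training step, so only the query encoder receives gradients.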
If you want to be faithful to the paper (not using the momentum encoder trick), please consider following the paper.
Hello @HobbitLong, thank you for the quick reply.
First of all, I wanted to re-implement the results based on the code here, without the MoCo trick, following the paper. For CIFAR-10/100, I could follow the script you uploaded (with detailed hyperparameters including learning rate, schedule (milestones vs. cosine), weight decay, etc.) and successfully reproduced the results.
However, for ImageNet, I couldn't find the hyperparameters listed either on GitHub or in the paper. I tried some hyperparameters, but got roughly 10% lower performance.
From your kind reply, I now see that you used the hyperparameters from MoCo-v2. Is there any source explaining what exactly the MoCo trick is? Is there a GitHub repo or code I could look at?
As for the MoCo-v2 hyperparameters, did you take them from here? https://github.com/facebookresearch/moco
Hello @HobbitLong,
I have incorporated most of the hyper-parameters from MoCo-v2, which align with the ones you previously mentioned. However, there appears to be a gap in accuracy, and I postulate that this may be due to the number of epochs. Thus, I would like to kindly request your assistance in providing me with the appropriate number of epochs or any other hyper-parameters that may help to improve the accuracy of my model.
Thank you for your time and consideration.
Hello @kiimmm,
Would you please release your MoCo version of the code? I'm trying to re-implement it but got stuck on the MoCo part. Thanks.
Hi, I am trying to reproduce the results. May I get the hyperparameters for the ImageNet experiment?