facebookresearch / moco

PyTorch implementation of MoCo: https://arxiv.org/abs/1911.05722
MIT License

Implementation Multiple nodes. #16

Closed: lihao0374 closed this issue 4 years ago

lihao0374 commented 4 years ago

Solved

KaimingHe commented 4 years ago

This code is a direct revision of the official PyTorch ImageNet training code, and its multi-node version can be done by following that. The speedup ratio should be reasonable, just as with the official PyTorch ImageNet training code. In general, I suggest you follow the "ImageNet in 1 Hour" paper for multi-node training: specifically the linear lr scaling recipe and the related discussion of other relevant hyper-parameters. Depending on how many nodes you use (essentially, the total batch size), there could be a slight accuracy drop, similar to what is also observed in supervised training.
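
For concreteness, the linear scaling rule from the "ImageNet in 1 Hour" paper simply grows the learning rate in proportion to the total batch size, usually combined with a short warmup. Below is a minimal sketch, assuming MoCo's defaults of lr = 0.03 at a total batch size of 256; the 4-node example and the 5-epoch warmup are illustrative assumptions, not a verified multi-node recipe.

```python
# Minimal sketch of the linear LR scaling rule with warmup.
# Assumes MoCo's defaults (base_lr = 0.03 at a total batch size of 256);
# the node count and warmup length below are illustrative assumptions.

def scaled_lr(total_batch, base_lr=0.03, base_batch=256):
    """Linear scaling rule: lr grows proportionally with the total batch size."""
    return base_lr * total_batch / base_batch

def warmup_lr(target_lr, epoch, warmup_epochs=5):
    """Gradual warmup: ramp the lr linearly over the first few epochs."""
    if epoch < warmup_epochs:
        return target_lr * (epoch + 1) / warmup_epochs
    return target_lr

# Example: 4 nodes x 8 GPUs x 32 images per GPU = 1024 total batch size.
lr = scaled_lr(total_batch=1024)      # 0.03 * 1024 / 256 = 0.12
for epoch in range(6):
    print(epoch, warmup_lr(lr, epoch))
```

The scaled value can then be fed into whatever decay schedule the training script already applies for the rest of training.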

wenyuqing commented 3 years ago

Hi, have you successfully implemented the multi-node version of MoCo? I ran into some problems when running with multiple nodes: the queue buffer does not seem to be synchronized, and the variable ptr is not 0 at the beginning. I am confused about this issue.
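
For anyone hitting the same symptom, the sketch below paraphrases the MoCo-style queue update (from memory, so treat names and shapes as assumptions rather than the repo's exact code in moco/builder.py). The queue and its pointer are plain per-process buffers; they only stay identical across processes because every process all-gathers the same keys and advances the pointer by the same amount. A nonzero ptr at the very first step would therefore suggest either a resumed checkpoint (where a nonzero pointer is expected) or processes that construct or load the model differently.

```python
# Hedged sketch of MoCo-style queue maintenance under DistributedDataParallel.
# Paraphrased from memory of moco/builder.py; names and shapes are assumptions.
import torch

@torch.no_grad()
def concat_all_gather(tensor):
    """Gather a tensor from every process and concatenate along dim 0."""
    tensors_gather = [torch.ones_like(tensor)
                      for _ in range(torch.distributed.get_world_size())]
    torch.distributed.all_gather(tensors_gather, tensor, async_op=False)
    return torch.cat(tensors_gather, dim=0)

@torch.no_grad()
def dequeue_and_enqueue(queue, queue_ptr, keys, K=65536):
    """Write the gathered keys into the queue at the current pointer.

    queue:     buffer of shape (dim, K)
    queue_ptr: one-element long tensor holding the write position
    keys:      per-process key batch of shape (local_batch, dim)

    Because every process gathers the same keys and starts from the same
    pointer value, the queue stays identical across processes.
    """
    keys = concat_all_gather(keys)            # keys from all GPUs on all nodes
    batch_size = keys.shape[0]
    ptr = int(queue_ptr)
    assert K % batch_size == 0                # simplifying assumption
    queue[:, ptr:ptr + batch_size] = keys.T
    queue_ptr[0] = (ptr + batch_size) % K
```

If the queues still drift apart across nodes, it is worth verifying that torch.distributed is initialized over all nodes (so all_gather really covers every process) and that each process builds the model identically before wrapping it in DistributedDataParallel.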