Open dengandong opened 2 years ago
We use 16 GPUs (V100 32G) with a batch size of 128 per GPU, so the total batch size is 2048, which is 4 times larger than your setting. With our setting, it takes about 35 hours to train for 100 epochs. Since we pre-train the backbone, the FPN, and the RCNN head, which is a much larger model than MoCo (which trains only the backbone), the slower speed compared to MoCo is expected.
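For reference, a rough back-of-envelope check of the per-iteration time implied by those numbers, assuming ImageNet-1k-scale pretraining data (~1.28M images); the dataset size is my assumption and is not stated above:

```python
# Rough sanity check of the reported training speed (a sketch, not the repo's code).
# Assumption: pretraining on ~1.28M images (ImageNet-1k scale); adjust if the dataset differs.

NUM_IMAGES = 1_281_167   # assumed dataset size
EPOCHS = 100

def seconds_per_iter(total_batch_size: int, total_hours: float) -> float:
    """Average seconds per iteration implied by total batch size and wall-clock time."""
    iters_per_epoch = NUM_IMAGES / total_batch_size
    total_iters = iters_per_epoch * EPOCHS
    return total_hours * 3600 / total_iters

# Setting reported above: 16 GPUs x 128 per GPU = 2048 total batch, ~35 hours for 100 epochs.
print(f"{seconds_per_iter(2048, 35):.2f} s/iter")  # ≈ 2.0 s/iter under this assumption
```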
hi, hologerry~
I'm currently running your code on 4 V100 32G GPUs. Each iteration takes about 1.3 s (batch size = 128 per GPU), so the total training time for 100 epochs comes out to about 7 days (rough projection sketched below).
Does 1.3 s per iteration sound normal to you? I ran MoCo on the same machines, and it took about 0.5 s per iteration.
I'd appreciate it if you could help me with this~ Thanks!
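A minimal sketch of the projection I used; the iterations-per-epoch value depends on dataset size and total batch size, so it is left as a parameter rather than assumed:

```python
# Minimal sketch for projecting wall-clock training time from a measured iteration speed.
# 'iters_per_epoch' depends on your dataset size and total batch size; plug in your own value.

def estimated_days(sec_per_iter: float, iters_per_epoch: int, epochs: int = 100) -> float:
    """Projected training time in days, ignoring data-loading stalls and checkpoint overhead."""
    total_seconds = sec_per_iter * iters_per_epoch * epochs
    return total_seconds / 86400

# Example: measured 1.3 s/iter as above; fill in iters_per_epoch for your dataset.
# print(estimated_days(1.3, iters_per_epoch=...))
```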