Hi,
I achieved the best pretraining performance when training with 4 GPUs and a batch size of 64, i.e. 16 images from 4 identities per GPU for each BatchNorm layer. If you want to train on a single GPU, you can try reducing the batch size from 64 to 16; you may achieve better performance that way.
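For reference, here is a minimal sketch of the arithmetic behind that suggestion (the variable names are illustrative only, not the repository's actual script arguments):

```python
# Illustrative only: shows how the per-GPU batch seen by each BatchNorm layer
# is derived under DataParallel-style splitting in PyTorch. The names
# `total_batch_size`, `num_gpus`, and `num_instances` are assumptions,
# not the actual argument names used by the training scripts.

total_batch_size = 64   # global batch size in the reported 4-GPU setup
num_gpus = 4            # GPUs used for pretraining
num_instances = 4       # images sampled per identity

per_gpu_batch = total_batch_size // num_gpus          # 16 images per GPU
identities_per_gpu = per_gpu_batch // num_instances   # 4 identities per GPU

print(per_gpu_batch, identities_per_gpu)  # -> 16, 4

# On a single GPU, setting the batch size to 16 keeps the per-BatchNorm
# batch the same as in the 4-GPU setting (16 images from 4 identities).
single_gpu_batch = per_gpu_batch
```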
On the other hand, I found that the performance of the pretrained model does not affect the final MMT model very much.
Thanks for the amazing implementation, really inspiring.
One question regarding pretraining the model in stage 1: I have tested the released models and they work amazingly well. However, I have trouble reproducing their performance using the example scripts (Duke->Market: 24.3 mAP / 52.3 top-1). I should note that I trained the model on a single GPU; can you shed some light on how much influence multi-GPU training has? Thanks! :)