yxgeee / MMT

[ICLR-2020] Mutual Mean-Teaching: Pseudo Label Refinery for Unsupervised Domain Adaptation on Person Re-identification.
https://yxgeee.github.io/projects/mmt
MIT License

Parameters for resnet50IBNa #6

Closed. yihongXU closed this issue 4 years ago.

yihongXU commented 4 years ago

Hi,

Thank you for the great work! I tried your code and things worked well; in particular, the result for ResNet-50 looked good. However, my mAP evaluation curve for ResNet-50-IBNa (with 700 clusters) saturates at around mAP=55, which is lower than the ResNet-50 result and surprising, since the reported results are better with ResNet-50-IBNa. For this reason, I would like to ask whether the training hyperparameters for ResNet-50-IBNa are the same as in train.sh; if not, would you mind sharing them, please?

PS: I used 1 GPU for the moment, so num_instances=1. Thank you again, it is indeed great work. Congrats!

yxgeee commented 4 years ago

Hi,

I used exactly the same settings for IBN-ResNet50. I have a few suggestions:

  1. Have you correctly downloaded and loaded the pre-trained model for IBN-ResNet50? (Refer to https://github.com/yxgeee/MMT#prepare-pre-trained-models.)
  2. I found that the number of GPUs and num_instances do affect the final performance. They may have more influence with the IBN-ResNet50 backbone, since the main difference between IBN-ResNet50 and the conventional ResNet50 lies in the batch normalization layers, which are sensitive to the batch size and batch structure. If you have enough GPUs, first try the same experimental settings as I proposed. Otherwise, you can decrease the batch_size (e.g. to 16) to fit your GPU memory, but keep num_instances at 4; see the sketch below for how these two settings shape the batch structure.
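
To make the second point concrete, here is a minimal sketch of the P × K batch structure that batch_size and num_instances define (the class and variable names are hypothetical, not the exact sampler in this repo): each batch holds batch_size // num_instances identities with num_instances images each, so batch_size=16 with num_instances=4 still gives 4 images per identity, just fewer identities per batch.

```python
import random
from collections import defaultdict
from torch.utils.data import Sampler

class PKBatchSampler(Sampler):
    """Hypothetical P x K sampler: each batch contains
    (batch_size // num_instances) identities, num_instances images per identity."""

    def __init__(self, pids, batch_size=64, num_instances=4):
        self.pids = pids                      # person-id label for every dataset index
        self.batch_size = batch_size
        self.num_instances = num_instances
        self.index_dic = defaultdict(list)    # pid -> list of dataset indices
        for idx, pid in enumerate(pids):
            self.index_dic[pid].append(idx)
        self.unique_pids = list(self.index_dic.keys())

    def __iter__(self):
        batch = []
        for pid in random.sample(self.unique_pids, len(self.unique_pids)):
            idxs = self.index_dic[pid]
            # draw num_instances images of this identity (with replacement if it has too few)
            if len(idxs) >= self.num_instances:
                batch.extend(random.sample(idxs, self.num_instances))
            else:
                batch.extend(random.choices(idxs, k=self.num_instances))
            if len(batch) == self.batch_size:
                yield batch
                batch = []

    def __len__(self):
        return len(self.pids) // self.batch_size
```

Such a sampler would be passed to the DataLoader as batch_sampler. With batch_size=64 and num_instances=4 each batch covers 16 identities; dropping to batch_size=16 keeps 4 images per identity but only 4 identities, which is why keeping num_instances=4 preserves the within-identity structure that the (I)BN statistics see during training.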
yihongXU commented 4 years ago

Hi, thank you for your reply.

  1. I went directly to the target-domain training, so I loaded the IBN-ResNet50 models pre-trained on the source dataset (Market or Duke); a quick sanity check for this loading step is sketched below the list.
  2. OK, I will try both of your suggestions, thank you!
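
For reference, here is a minimal sanity check for that loading step (the checkpoint path and the plain torchvision resnet50 are placeholders, not the repo's actual IBN-ResNet50 constructor):

```python
import torch
from torchvision.models import resnet50  # stand-in; the repo builds its own IBN-ResNet50

# Placeholder checkpoint path: point this at the actual source-pretrained weights.
checkpoint = torch.load('path/to/source_pretrained_ibn_resnet50.pth.tar', map_location='cpu')
state_dict = checkpoint.get('state_dict', checkpoint)  # some checkpoints wrap the weights

model = resnet50()  # replace with the repo's IBN-ResNet50 constructor
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print('missing keys:', missing)        # ideally empty, or only the classifier head
print('unexpected keys:', unexpected)  # ideally empty
```

If those lists are non-empty beyond the classifier head, the source-pretrained weights are probably not being loaded as intended.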

For now, my single-GPU results with batch_size=64, num_instances=4 are: mAP m2d: 0.7034, d2m: 0.6455.

I will report the new results and close the issue once I have finished the experiments you suggested.

Thank you again!