YyzHarry / imbalanced-semi-self

[NeurIPS 2020] Semi-Supervision (Unlabeled Data) & Self-Supervision Improve Class-Imbalanced / Long-Tailed Learning
https://arxiv.org/abs/2006.07529
MIT License

Can't achieve the reported performance: ResNet-50 + SSP + CE(Uniform) for ImageNet-LT #17

Closed · ChCh1999 closed this issue 3 years ago

ChCh1999 commented 3 years ago

I downloaded the pre-trained model from the given path (Resnet-50-rot) and trained the model with the given config imagenet_inat/config/ImageNet_LT/feat_uniform.yaml. The training command is: python imb_cls/imagenet_inat/main.py --cfg 'imb_cls/imagenet_inat/config/ImageNet_LT/feat_uniform.yaml' --model_dir workdir/pretrain/moco_ckpt_0200.pth.tar. I only get 41.1 top-1 accuracy, but the released model achieves 45.6 [CE(Uniform) + SSP].
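For what it's worth, a minimal sanity check on the downloaded checkpoint could look like this (the key layout is an assumption; adjust to whatever torch.load actually returns for this file):

```python
# Inspect the downloaded SSP checkpoint before training (path taken from the
# command above; nesting under 'state_dict' is an assumption, not guaranteed).
import torch

ckpt = torch.load('workdir/pretrain/moco_ckpt_0200.pth.tar', map_location='cpu')
print(type(ckpt))

# If it's a dict, peek at the (possibly nested) weight keys to confirm it
# really contains ResNet-50 backbone parameters.
state = ckpt.get('state_dict', ckpt) if isinstance(ckpt, dict) else ckpt
if isinstance(state, dict):
    for k in list(state.keys())[:10]:
        v = state[k]
        print(k, tuple(v.shape) if hasattr(v, 'shape') else type(v))
```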

Can you help me check where the problem is? [screenshots attached]

YyzHarry commented 3 years ago

Hi, I double-checked the config, and it seems fine to me, so currently I'm not sure what exactly causes the difference.

I would suggest first running the baseline model (without SSP) and comparing the performance with and without SSP. If SSP gives you reasonable gains, then you might want to check whether the baseline model matches the number reported in our paper. If the baseline performance is also lower than the reported number, the gap is likely due to the hyperparameters and training settings.
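If I recall the scripts correctly, the baseline run is just the same command without --model_dir; treat the exact flags as an assumption and double-check against the README:

```
# Baseline: CE(Uniform) without SSP initialization (assumed invocation)
python imb_cls/imagenet_inat/main.py --cfg 'imb_cls/imagenet_inat/config/ImageNet_LT/feat_uniform.yaml'

# SSP: same config, initialized from the self-supervised pre-trained checkpoint
python imb_cls/imagenet_inat/main.py --cfg 'imb_cls/imagenet_inat/config/ImageNet_LT/feat_uniform.yaml' \
    --model_dir workdir/pretrain/moco_ckpt_0200.pth.tar
```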

Otherwise, you may want to check whether the pre-trained weights are loaded correctly, as well as the exact training settings, such as the PyTorch version (1.4 for this repo), etc.
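One quick way to sanity-check the loading is to match the checkpoint keys against a plain ResNet-50 outside the training code. This is only a rough sketch: the prefix handling and key names are assumptions, and the repo's own loading logic may differ.

```python
# Check how many checkpoint keys actually line up with a ResNet-50 state_dict.
import torch
import torchvision

model = torchvision.models.resnet50()
ckpt = torch.load('workdir/pretrain/moco_ckpt_0200.pth.tar', map_location='cpu')
state = ckpt.get('state_dict', ckpt)  # many checkpoints nest weights under 'state_dict'

# Strip common prefixes such as 'module.' (DataParallel) before matching names;
# MoCo-style checkpoints may also need an encoder prefix stripped.
state = {k.replace('module.', ''): v for k, v in state.items()}

missing, unexpected = model.load_state_dict(state, strict=False)
print('missing keys:', len(missing))        # backbone keys listed here were NOT loaded
print('unexpected keys:', len(unexpected))  # projection-head / fc keys ending up here is normal
```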

ChCh1999 commented 3 years ago

The baseline result is [screenshot] and my running environment is [screenshot]. I think the model is loaded correctly, because I've tested the model you provided and got the correct result. Does the following output mean the ResNet-50 weights were loaded? [screenshot] Thanks for your answer.

YyzHarry commented 3 years ago

Yes, I think that indicates your loading process is correct. There does seem to be a reasonable gap, though, and I'm not sure what causes the difference. I would suggest tuning the hyper-parameters a bit; I took a quick pass over the original OLTR config, and it seems they use 0.1 as the initial LR. It has also been observed that training for more epochs can lead to better results, so you might want to increase the number of epochs and see what you get. Hope this helps!
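As a rough illustration of those two tweaks in plain PyTorch (this is not the repo's actual training loop; in practice you would edit the corresponding fields in feat_uniform.yaml):

```python
# Sketch of the suggested changes: 0.1 initial LR with step decay, longer schedule.
import torch

model = torch.nn.Linear(10, 2)  # placeholder for the ResNet-50 classifier
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)

num_epochs = 120  # e.g. extend beyond the default schedule
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 80], gamma=0.1)

for epoch in range(num_epochs):
    # ... run one training epoch here ...
    scheduler.step()
```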