Open zhifanlight opened 2 years ago
Hi, thank you for reproducing our work and the baseline methods.
When we run the baseline method on COCO, the learning rate and other configs follow previous multi-label work, e.g. ML-GCN and MS-CMA. Specifically, the learning rate for both the backbone and the fc layer is 0.01 (also mentioned in our paper), the step_size is 15, and we train for about 40 epochs. You can try this setting to reproduce the baseline (there can be some variance in the results, but it should not be large).
When we use our method, since CSRA is a special module, we enlarge the fc learning rate to 0.1 for faster convergence, and the step size and total number of epochs are shortened accordingly.
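The two settings above can be sketched as a PyTorch optimizer with separate parameter groups for the backbone and the fc head, plus a StepLR schedule. This is a minimal illustration, not the repo's actual code: the tiny `Net` is a stand-in for ResNet-101, and the momentum value and decay factor `gamma=0.1` are assumptions not stated in the reply.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    """Stand-in for ResNet-101 + classifier head (shapes are illustrative only)."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 8)  # placeholder for the ResNet-101 trunk
        self.fc = nn.Linear(8, 80)        # 80-way multi-label head for COCO

model = Net()

# Baseline: backbone lr = fc lr = 0.01. For CSRA, raise the fc group's lr to 0.1.
# momentum=0.9 is an assumed, conventional choice (not given in the reply).
optimizer = torch.optim.SGD(
    [
        {"params": model.backbone.parameters(), "lr": 0.01},
        {"params": model.fc.parameters(), "lr": 0.01},
    ],
    momentum=0.9,
)

# step_size=15 as stated; gamma=0.1 (10x decay) is assumed.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=15, gamma=0.1)

# After 15 epochs, both groups decay by a factor of gamma.
for _ in range(15):
    scheduler.step()
print([round(g["lr"], 4) for g in optimizer.param_groups])
```

For the CSRA run, only the second parameter group's initial `lr` changes (0.01 → 0.1); the grouped-optimizer structure stays the same.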
Best, Authors.
Hi, thanks for your excellent work! But I'm confused about the details of the baseline-model settings in your paper.
Take training ResNet-101 without CutMix on COCO 2014 as an example:
With the following training configuration as the baseline setting, I get 81.3 mAP after 7 epochs (30 in total, still training), which is much higher than the result in your paper (79.4 mAP).
```
python main.py --num_heads 4 --lam 0 --dataset coco --num_cls 80 --checkpoint coco14/resnet101
```
So, what are the correct settings to reproduce the baseline result in your paper? Thanks again.