timmeinhardt / trackformer

Implementation of "TrackFormer: Multi-Object Tracking with Transformers". [Conference on Computer Vision and Pattern Recognition (CVPR), 2022]
https://arxiv.org/abs/2101.02702
Apache License 2.0

I can not reproduce the results #87

Open loseevaya opened 1 year ago

loseevaya commented 1 year ago

Hey, thanks for your excellent work! I trained TrackFormer with your default settings (loading from the pretrained CrowdHuman checkpoint) on the joint set of CrowdHuman and MOT17 and got about 74.0 MOTA on MOT17, but when I submit to MOTChallenge I only get 72.7 MOTA. I changed the batch_size to 1 and kept all other parameters unchanged. Why is this happening?

timmeinhardt commented 1 year ago

Because you changed the batch size to 1. :) A different batch size means you have to find new optimal learning rates and training epochs.

quxu91 commented 1 year ago

> Because you changed the batch size to 1. :) A different batch size means you have to find new optimal learning rates and training epochs.

What should I change the learning rates and epochs to when I set batch_size = 1?

quxu91 commented 1 year ago

> Hey, thanks for your excellent work! I trained TrackFormer with your default settings (loading from the pretrained CrowdHuman checkpoint) on the joint set of CrowdHuman and MOT17 and got about 74.0 MOTA on MOT17, but when I submit to MOTChallenge I only get 72.7 MOTA. I changed the batch_size to 1 and kept all other parameters unchanged. Why is this happening?

I ran into the same problem. Have you managed to reproduce the results? And which learning rates and epochs did you set?

timmeinhardt commented 1 year ago

I do not know. You have to find new optimal LRs and epochs. A recommended starting point could be to halve the learning rates, just as you did with the batch size from 2 to 1. But there is no guarantee this will yield the same results. In fact, we tried working with batch_size=1 for a while but never achieved the same top performance as with batch_size=2.
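
A minimal sketch of that halving, using the parameter names from train.yaml mentioned in this thread; the starting values are illustrative placeholders, not confirmed defaults, so halve whatever your own checkout actually contains:

```yaml
# batch_size goes from 2 to 1, so scale every absolute learning rate by 0.5.
# The "was" values are placeholders -- use the ones from your own train.yaml.
batch_size: 1
lr: 0.0001            # was 0.0002
lr_backbone: 0.00001  # was 0.00002
lr_track: 0.00005     # was 0.0001
```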

quxu91 commented 1 year ago

Thanks for your early reply! Should I set all the LRs (i.e., lr, lr_backbone, lr_track and lr_linear_proj_mult in train.yaml) to half? And should I change the weight_decay? Why does it appear different when using different batch sizes?

timmeinhardt commented 1 year ago

Only the learning rates, not the multipliers (lr_linear_proj_mult). The weight decay can remain as it is.

What appears different with different batch sizes? You mean why do you have to set different LRs for different batch sizes?
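
To make the distinction concrete, a hedged sketch of what stays untouched (the values are again illustrative, not confirmed defaults):

```yaml
# lr_linear_proj_mult is a relative multiplier applied on top of lr,
# so halving lr already scales the resulting projection LR along with it.
lr_linear_proj_mult: 0.1
# Regularization stays as-is, per the advice above.
weight_decay: 0.0001
```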

quxu91 commented 1 year ago

> Only the learning rates, not the multipliers (lr_linear_proj_mult). The weight decay can remain as it is.
>
> What appears different with different batch sizes? You mean why do you have to set different LRs for different batch sizes?

Yes! Since I set batch_size to 1, are there any parameters other than the aforementioned learning rates that could have an impact on the results?

timmeinhardt commented 1 year ago

Explaining the relation between batch size and learning rate goes beyond the support of this repository. :)

You might have to adjust the number of epochs. But again, you will most likely not get the same results easily. This requires some potentially expensive hyperparameter tuning.
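
If you do sweep, the schedule parameters are the natural knobs. A hypothetical sketch: epochs is named in this thread, while lr_drop is an assumed (Deformable) DETR-style parameter, and both values are placeholders:

```yaml
# With batch_size=1 an epoch takes twice as many, smaller optimizer steps,
# so the schedule length and the LR decay point may need retuning.
epochs: 50    # placeholder; sweep around your config's original value
lr_drop: 40   # assumed DETR-style decay epoch; shift it together with epochs
```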

quxu91 commented 1 year ago

Got it! Thanks for your helpful answers anyway!