megvii-research / PETR

[ECCV2022] PETR: Position Embedding Transformation for Multi-View 3D Object Detection & [ICCV2023] PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

Reproduce PETR result #38

Closed: huazhenliu closed this issue 2 years ago

huazhenliu commented 2 years ago

Thanks for sharing such wonderful and interesting work! I'm trying to reproduce the result of "petr_r50dcn_gridmask_p4.py". The results reported at the end of that config file are as follows:

mAP: 0.3174
mATE: 0.8397
mASE: 0.2796
mAOE: 0.6158
mAVE: 0.9543
mAAE: 0.2326
NDS: 0.3665

I trained with this config file, but since I only have 2 V100 cards, I changed the batch size to "samples_per_gpu=2, workers_per_gpu=2" and ran it both with and without "--autoscale-lr". My results come out roughly like this:

mAP: 0.2103
mATE: 1.0048
mASE: 0.3099
mAOE: 0.8165
mAVE: 1.1984
mAAE: 0.4087
NDS: 0.2516

I also checked the training log you provided (20220606_223059.log): at the end of 24 epochs your loss is 5.6355, while mine is around 7.xx. When I test the checkpoint you provided, the result matches the one in "petr_r50dcn_gridmask_p4.py".

Could the gap come from the lr, the batch size, or other parameters? Any advice? Thanks!

yingfei1016 commented 2 years ago

Hi, you can set "samples_per_gpu=4, workers_per_gpu=4", so the total batch size matches the default setting; then you don't need "--autoscale-lr". When you use a different batch size, we suggest modifying the learning rate manually instead of relying on "--autoscale-lr". For example, you can set "samples_per_gpu=8, workers_per_gpu=4" and "lr=4e-4".
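For concreteness, here is a minimal sketch of the fields to edit in "petr_r50dcn_gridmask_p4.py" for a 2-GPU run, assuming (as the reply above implies) that the default schedule uses an effective batch size of 8 at lr=2e-4. Field names follow the usual mmdetection3d config style; edit the existing values in place rather than appending new dicts, and treat the exact optimizer settings as assumptions:

```python
# Option A: keep the default effective batch size (2 GPUs x 4 samples = 8),
# so the default learning rate stays valid and --autoscale-lr is unnecessary.
data = dict(
    samples_per_gpu=4,   # was 2 in the 2-GPU run described above
    workers_per_gpu=4,
)
# optimizer lr is left at its default (assumed 2e-4 here).

# Option B: double the effective batch size (2 GPUs x 8 samples = 16) and
# scale the learning rate linearly, as suggested in the reply.
# data = dict(samples_per_gpu=8, workers_per_gpu=4)
# optimizer = dict(type='AdamW', lr=4e-4, weight_decay=0.01)  # weight_decay assumed
```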

huazhenliu commented 2 years ago

Thanks for your quick reply! So the main reason for the performance drop is the lr and batch size?

yingfei1016 commented 2 years ago

Yes, I think it is the lr and batch size.
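For anyone hitting the same issue, this is the linear scaling rule implied above, sketched as a small helper under the assumed defaults (total batch size 8 at lr=2e-4; these numbers are inferred from the reply, not stated in the repo docs):

```python
def scale_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Scale the learning rate linearly with the total (effective) batch size."""
    return base_lr * new_batch / base_batch

# 2 GPUs x 8 samples = batch 16 -> 4e-4, matching the suggested setting.
print(scale_lr(2e-4, 8, 16))
# 2 GPUs x 2 samples = batch 4 -> 1e-4; running this setup at the default
# lr (or with --autoscale-lr, which only accounts for GPU count) mismatches
# the schedule and can explain the performance drop reported above.
print(scale_lr(2e-4, 8, 4))
```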

huazhenliu commented 2 years ago

OK, I will try more experiments. Thanks again.