open-mmlab / mmdetection3d

OpenMMLab's next-generation platform for general 3D object detection.
https://mmdetection3d.readthedocs.io/en/latest/
Apache License 2.0

Zero mAP when using tools/test.py with checkpoints from tools/train.py #1532

Open SaketChaturvedi opened 2 years ago

SaketChaturvedi commented 2 years ago

I am trying to evaluate an MVX-Net model trained on the KITTI 3D dataset with tools/test.py, using the following command:

python tools/test.py configs/mvxnet/dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class.py training_results/epoch_18.pth --eval mAP --eval-options 'out_dir=./results_eval_n'

The problem is that tools/test.py reports zero mAP when I evaluate a checkpoint produced by my own training run (tools/train.py). However, the same script works fine with the pretrained MVX-Net checkpoint, using the following command:

python tools/test.py configs/mvxnet/dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class.py ./checkpoints/dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class_20210831_060805-83442923.pth --eval mAP --eval-options 'out_dir=./results_eval'

I can use the training_results/epoch_18.pth checkpoint to resume training. I also tried converting it with publish_model.py, but the newly generated checkpoint file still gives zero mAP during testing. I suspect something is wrong with the training checkpoint training_results/epoch_18.pth. Is there anything I am missing? How can I fix this?

Thank you so much for the help!

I am using the MMDetection3D v1.0.0rc2 release with the following environment details:

'mmdet_version': '2.24.1', 'mmseg_version': '0.24.1', 'mmdet3d_version': '1.0.0rc2', 'mmcv-full_version': '1.5.0'
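For anyone comparing environments while trying to reproduce this, a minimal sketch of collecting the same version information programmatically is given here. It uses only the standard library. The distribution names in `PACKAGES` are assumptions about how the packages were installed (for example, `mmsegmentation` as the pip name for mmseg) and may need adjusting for a source install.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Assumed pip distribution names; adjust if your installation differs.
PACKAGES = ("mmdet", "mmsegmentation", "mmdet3d", "mmcv-full", "torch")


def installed_versions(packages: Iterable[str] = PACKAGES) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed under this distribution name.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in both the training and the testing environment makes it easy to spot a version mismatch between the two.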

I got the following zero mAP results:

----------- AP40 Results ------------

Pedestrian AP40@0.50, 0.50, 0.50:
bbox AP40:0.0000, 0.0000, 0.0000
bev AP40:0.0000, 0.0000, 0.0000
3d AP40:0.0000, 0.0000, 0.0000

Pedestrian AP40@0.50, 0.25, 0.25:
bbox AP40:0.0000, 0.0000, 0.0000
bev AP40:0.0000, 0.0000, 0.0000
3d AP40:0.0000, 0.0000, 0.0000

Cyclist AP40@0.50, 0.50, 0.50:
bbox AP40:0.0000, 0.0000, 0.0000
bev AP40:0.0000, 0.0000, 0.0000
3d AP40:0.0000, 0.0000, 0.0000

Cyclist AP40@0.50, 0.25, 0.25:
bbox AP40:0.0000, 0.0000, 0.0000
bev AP40:0.0000, 0.0000, 0.0000
3d AP40:0.0000, 0.0000, 0.0000

Car AP40@0.70, 0.70, 0.70:
bbox AP40:0.0000, 0.0000, 0.0000
bev AP40:0.0000, 0.0000, 0.0000
3d AP40:0.0000, 0.0000, 0.0000

Car AP40@0.70, 0.50, 0.50:
bbox AP40:0.0000, 0.0000, 0.0000
bev AP40:0.0000, 0.0000, 0.0000
3d AP40:0.0000, 0.0000, 0.0000

Overall AP40@easy, moderate, hard:
bbox AP40:0.0000, 0.0000, 0.0000
bev AP40:0.0000, 0.0000, 0.0000
3d AP40:0.0000, 0.0000, 0.0000

Tai-Wang commented 2 years ago

Have you compared your training log with the log we provide? Is there anything abnormal during training, for example in the loss?

SaketChaturvedi commented 2 years ago

Thank you @Tai-Wang for your reply. Yes, I have verified the training log. It looks fine to me as I can get evaluation results on the validation dataset during training. I have attached the training log and training loss curves for your reference. Is there anything abnormal you found in my training logs?

training_loss_curves 20220513_123317.log

Tai-Wang commented 2 years ago

So it might be a bug in the inference process? We will try to reproduce your problem soon. BTW, which commit ID of mmdet3d are you using?

SaketChaturvedi commented 2 years ago

Okay, thanks. I am using the mmdet3d commit from May 1, 2022 (76e351a). Please let me know if you need any additional information.

jshilong commented 2 years ago

I ran the same config at the same commit ID for 3 epochs and could not reproduce your problem. I recorded the evaluation results from the training phase, then tested the checkpoint again with test.py, and got the same results. Maybe you should make a fresh clone of the project, make sure there are no local modifications, and try again.

SaketChaturvedi commented 2 years ago

I made a fresh clone of the mmdetection3d repository and still have the same problem. I get evaluation results during the training phase, but when I load the checkpoint again with test.py, it doesn't detect anything and gives zero mAP. Do you see anything that could cause a problem with the checkpoint? Is there any way to verify the checkpoint?
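One way to approach the "verify the checkpoint" question is to compare the parameter names stored in the self-trained checkpoint against those in the pretrained checkpoint that evaluates correctly. The sketch below is not a confirmed diagnosis of this issue, only a check worth ruling out: if many keys differ, the weights may not be loaded into the model as expected. It assumes the checkpoint is a dict with a `state_dict` entry (falling back to the dict itself otherwise) and that a wrapper prefix such as `module.` may be present. The helper names are hypothetical and are not part of mmdet3d or mmcv.

```python
from typing import Dict, Iterable, List


def strip_prefix(keys: Iterable[str], prefix: str = "module.") -> List[str]:
    """Drop a leading wrapper prefix (assumed: 'module.') from each key."""
    return [k[len(prefix):] if k.startswith(prefix) else k for k in keys]


def compare_keys(trained: Iterable[str], reference: Iterable[str]) -> Dict[str, List[str]]:
    """Report parameter names that appear in only one of the two checkpoints."""
    trained_set = set(strip_prefix(trained))
    reference_set = set(strip_prefix(reference))
    return {
        "only_in_trained": sorted(trained_set - reference_set),
        "only_in_reference": sorted(reference_set - trained_set),
    }


def load_state_dict_keys(path: str) -> List[str]:
    """Read parameter names from a checkpoint file (requires PyTorch)."""
    import torch  # imported lazily so the comparison helpers work without it

    checkpoint = torch.load(path, map_location="cpu")
    # Assumption: weights live under 'state_dict'; otherwise treat the
    # loaded object itself as the state dict.
    state = checkpoint.get("state_dict", checkpoint)
    return list(state.keys())


if __name__ == "__main__":
    # Toy key lists for illustration; with real files, use
    # load_state_dict_keys("<your checkpoint path>") for each side.
    diff = compare_keys(
        ["module.backbone.conv.weight", "head.extra.bias"],
        ["backbone.conv.weight", "head.cls.bias"],
    )
    print(diff)
```

If both lists come back empty, the parameter names match and the cause is more likely elsewhere, for example in the test-time config or data pipeline.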

Joaovsky commented 1 year ago

Hey there @SaketChaturvedi, I have the exact same problem with the same model, MVXNet. Did you find out what the issue was? If so, could you please tell me?

SaketChaturvedi commented 1 year ago

Hey @Joaovsky, not really. I still have the same problem. Did you find anything?