Can not reproduce the result by loading the provided checkpoint on my machines

klightz commented 2 years ago

Hello, sorry for another question. I recently tried to train the model myself and got a result 4-5 points lower than the one reported in the README, so I first tried loading the checkpoint you provide as a sanity check.

However, the provided checkpoint also gives lower performance. I also set up the environment independently on another machine and ran the checkpoint test directly, and I still get a worse result, with the same numbers as before. I suspect it may be caused by a package version mismatch or a change in some API's behavior. Any idea about this?

I believe I followed the guidelines for the dataset and codebase setup exactly. For the evaluation command, I followed the README and ran:

CONFIG="nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal"
python -m torch.distributed.launch --nproc_per_node=${NUM_GPUS} ./tools/dist_test.py configs/nusc/voxelnet/$CONFIG.py --work_dir ./work_dirs/$CONFIG --checkpoint centerpoint_focal_multimodal.pth

Here are the versions of my key packages:

torch==1.8.2
opencv-python==4.4.0.46
kornia==0.6.6
spconv==2.1.22

The output of the prediction:

Loading NuScenes tables for version v1.0-trainval...
23 category,
8 attribute,
4 visibility,
64386 instance,
12 sensor,
10200 calibrated_sensor,
2631083 ego_pose,
68 log,
850 scene,
34149 sample,
2631083 sample_data,
1166187 sample_annotation,
4 map,
Done loading in 39.821 seconds.
======
Reverse indexing ...
Done reverse indexing in 9.6 seconds.
======
Finish generate predictions for testset, save to work_dirs/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal/infos_val_10sweeps_withvelo_filter_True.json
Initializing nuScenes detection evaluation
Loaded results from work_dirs/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal/infos_val_10sweeps_withvelo_filter_True.json. Found detections for 6019 samples.
Loading annotations for val split from nuScenes version: v1.0-trainval
100%|█████████████████████████████████████████████████████████████████████████████████████████████| 6019/6019 [00:17<00:00, 350.05it/s]
Loaded ground truth annotations for 6019 samples.
Filtering predictions
=> Original number of boxes: 497297
=> After distance based filtering: 357680
=> After LIDAR and RADAR points based filtering: 357680
=> After bike rack filtering: 357308
Filtering ground truth annotations
=> Original number of boxes: 187528
=> After distance based filtering: 134565
=> After LIDAR and RADAR points based filtering: 121871
=> After bike rack filtering: 121861
Rendering sample token 5376e3a2874542d8b440faa899e52b97
Rendering sample token 14f665de1fa34d0a9d12838a5b77d687
Rendering sample token c428be7e072c4c2489b90a6dcefcae4c
Rendering sample token d6d3eac48860468aa0eba1ae2896b5ea
Rendering sample token 67aad7ad948f44f8af668ea8389bdd52
Rendering sample token 9c9f22a58fdc45f2b8a119cda3554f1f
Rendering sample token e30f071748cc49eb85babe49265a4eda
Rendering sample token f4550267cd0240e1a1ceb844e33e97d4
Rendering sample token 93fdce35d7db4764ad5f822f57ab49e2
Rendering sample token 22186f4894ab46b481a9e1ee31d7734e
Accumulating metric data...
Calculating metrics...
Rendering PR and TP curves
Saving metrics to: ./work_dirs/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal
mAP: 0.6030
mATE: 0.2830
mASE: 0.2549
mAOE: 0.2796
mAVE: 0.2552
mAAE: 0.1885
NDS: 0.6754
Eval time: 110.2s
Per-class results:
Object Class    AP      ATE     ASE     AOE     AVE     AAE
car     0.854   0.178   0.155   0.108   0.267   0.193
truck   0.563   0.311   0.179   0.067   0.243   0.235
bus     0.699   0.314   0.179   0.049   0.420   0.274
trailer 0.396   0.513   0.209   0.442   0.191   0.177
construction_vehicle    0.221   0.666   0.425   0.865   0.120   0.284
pedestrian      0.853   0.140   0.275   0.377   0.212   0.093
motorcycle      0.618   0.200   0.244   0.229   0.387   0.240
bicycle 0.458   0.166   0.266   0.300   0.202   0.011
traffic_cone    0.686   0.140   0.333   nan     nan     nan
barrier 0.683   0.201   0.284   0.080   nan     nan
Evaluation nusc: Nusc v1.0-trainval Evaluation
car Nusc dist AP@0.5, 1.0, 2.0, 4.0
76.30, 85.90, 89.02, 90.20 mean AP: 0.8535605962619657
truck Nusc dist AP@0.5, 1.0, 2.0, 4.0
38.66, 55.67, 63.59, 67.33 mean AP: 0.563122813202771
construction_vehicle Nusc dist AP@0.5, 1.0, 2.0, 4.0
3.60, 14.35, 29.32, 40.98 mean AP: 0.22061467261075263
bus Nusc dist AP@0.5, 1.0, 2.0, 4.0
45.98, 70.61, 80.23, 82.79 mean AP: 0.6990066750667856
trailer Nusc dist AP@0.5, 1.0, 2.0, 4.0
10.95, 34.49, 51.42, 61.45 mean AP: 0.3957810641400781
barrier Nusc dist AP@0.5, 1.0, 2.0, 4.0
58.83, 68.34, 72.09, 73.76 mean AP: 0.682558135475819
motorcycle Nusc dist AP@0.5, 1.0, 2.0, 4.0
53.71, 63.50, 64.63, 65.32 mean AP: 0.6179004111724922
bicycle Nusc dist AP@0.5, 1.0, 2.0, 4.0
43.90, 45.99, 46.47, 46.84 mean AP: 0.45797520318286145
pedestrian Nusc dist AP@0.5, 1.0, 2.0, 4.0
83.08, 84.88, 85.97, 87.29 mean AP: 0.8530652142070791
traffic_cone Nusc dist AP@0.5, 1.0, 2.0, 4.0
65.87, 67.16, 69.11, 72.33 mean AP: 0.6861473045618407

So you can see I get 60.3 mAP on the full dataset, and I get 56 mAP on the 1/4 dataset.
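As a quick consistency check on the log above, the headline numbers can be recomputed from the per-class table. This is a minimal sketch: it assumes the standard nuScenes detection score definition, NDS = (5 * mAP + sum over the five TP metrics of (1 - min(1, error))) / 10, and uses only the rounded values printed in the log, so small rounding differences are expected. The function names are illustrative and not part of the codebase.

```python
# Values copied from the evaluation log above (rounded as printed).
per_class_ap = {
    "car": 0.854, "truck": 0.563, "bus": 0.699, "trailer": 0.396,
    "construction_vehicle": 0.221, "pedestrian": 0.853, "motorcycle": 0.618,
    "bicycle": 0.458, "traffic_cone": 0.686, "barrier": 0.683,
}
tp_errors = {
    "mATE": 0.2830, "mASE": 0.2549, "mAOE": 0.2796,
    "mAVE": 0.2552, "mAAE": 0.1885,
}


def mean_ap(ap_by_class):
    """Unweighted mean of the per-class AP values."""
    return sum(ap_by_class.values()) / len(ap_by_class)


def nds(map_score, errors):
    """nuScenes detection score from mAP and the five mean TP errors."""
    tp_scores = [1.0 - min(1.0, err) for err in errors.values()]
    return (5.0 * map_score + sum(tp_scores)) / 10.0


recomputed_map = mean_ap(per_class_ap)
recomputed_nds = nds(0.6030, tp_errors)  # 0.6030 is the mAP printed in the log

print(f"mAP from per-class table: {recomputed_map:.4f}")
print(f"NDS from logged mAP and TP errors: {recomputed_nds:.4f}")
```

Both values agree with the logged mAP and NDS up to rounding, so the summary numbers are internally consistent and the gap comes from the predictions themselves rather than from how the metrics were aggregated.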

BTW, I also checked that the dataset is correct by looking at CenterPoint's performance. I loaded the checkpoint for the centerpoint_voxel_1440 setup and got exactly the 59.6 mAP they report. So I think something goes wrong in the image fusion part. Any idea about this issue? Thanks a lot!

yukang2017 commented 2 years ago

Hi,

I think there are several potential reasons that might cause this issue.

The first is whether all weights were loaded properly from the checkpoint. The weight shapes might differ between spconv versions, because of the Algo of the convolution layers. If this is the reason, there should be warnings in the testing log. But I think the probability of this is small, because the performance would then be much lower than this, not only 4 points.
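One way to make this first check explicit, instead of scanning the log for warnings, is to compare parameter names and shapes between the model and the checkpoint. The sketch below is an illustration only: the helper name and the toy parameter names are hypothetical, and it works on plain name-to-shape dictionaries so that it runs without the model.

```python
def diff_state_shapes(model_shapes, ckpt_shapes):
    """Compare two {parameter_name: shape} mappings.

    Returns (missing, unexpected, mismatched):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names in the checkpoint the model does not use
      mismatched - names present in both whose shapes differ
    """
    missing = sorted(k for k in model_shapes if k not in ckpt_shapes)
    unexpected = sorted(k for k in ckpt_shapes if k not in model_shapes)
    mismatched = {
        k: (tuple(model_shapes[k]), tuple(ckpt_shapes[k]))
        for k in model_shapes
        if k in ckpt_shapes and tuple(model_shapes[k]) != tuple(ckpt_shapes[k])
    }
    return missing, unexpected, mismatched


# Toy example with made-up names and shapes.
model_shapes = {"backbone.conv1.weight": (16, 3, 3, 3, 5), "head.bias": (10,)}
ckpt_shapes = {"backbone.conv1.weight": (3, 3, 3, 5, 16), "extra.weight": (4,)}

missing, unexpected, mismatched = diff_state_shapes(model_shapes, ckpt_shapes)
print("missing:", missing)
print("unexpected:", unexpected)
print("mismatched:", mismatched)

# With PyTorch, the two mappings could be built roughly like this
# (assumption: the checkpoint may nest its weights under a "state_dict" key):
#   ckpt = torch.load("centerpoint_focal_multimodal.pth", map_location="cpu")
#   ckpt = ckpt.get("state_dict", ckpt)
#   ckpt_shapes = {k: tuple(v.shape) for k, v in ckpt.items()}
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}
```

If all three results are empty, the weights line up by name and shape. A shape comparison cannot detect a layout permutation that happens to leave the shape unchanged, so an empty result makes a loading problem less likely but does not rule it out.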

The second potential reason is the dataset info files. They should be regenerated to include the image parts. Would you please provide your "infos_val_10sweeps_withvelo_filter_True.pkl" file? I will check it on our machine. In addition, this is the file I use. You can check it on yours.
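To check locally whether an info file contains image-related entries, something like the following sketch may help. It assumes the pickle holds a list of per-sample dicts, and that image-related keys contain the substring "cam"; both are assumptions about the file layout, not something confirmed in this thread, and the function name is hypothetical. Only unpickle files from a source you trust.

```python
import pickle
from collections import Counter


def summarize_infos(path, needle="cam"):
    """Summarize the keys found in an info pickle.

    Assumes the file contains a list of dicts, one per sample.
    Returns (number_of_samples, key_counts, samples_with_needle), where
    samples_with_needle counts samples having at least one key that
    contains `needle` (case-insensitive).
    """
    with open(path, "rb") as f:
        infos = pickle.load(f)
    if not isinstance(infos, list):
        raise TypeError(f"expected a list of dicts, got {type(infos).__name__}")

    key_counts = Counter()
    samples_with_needle = 0
    for info in infos:
        key_counts.update(info.keys())
        if any(needle in str(key).lower() for key in info):
            samples_with_needle += 1
    return len(infos), dict(key_counts), samples_with_needle


# Example usage (the path is the file name mentioned above):
#   total, keys, with_cam = summarize_infos(
#       "infos_val_10sweeps_withvelo_filter_True.pkl")
#   print(total, with_cam)
#   print(sorted(keys))
```

Running this on both the locally generated file and the maintainer's file and comparing the key sets should show whether the local file is missing the image parts.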

The third potential reason is that I changed this file a week ago (commit). I am not sure whether this commit causes the issue. It does not seem relevant, but you can roll it back and check on testing only.

In addition, we have tested on spconv 2.1.21 and earlier. I am not sure whether the update to spconv 2.1.22 causes this issue.
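To compare environments between machines, the installed versions can be listed programmatically. The sketch below uses only the standard library. It matches packages by name prefix because, as far as I know, spconv 2.x wheels are published under CUDA-suffixed names (for example spconv-cu113), so looking up the exact name "spconv" may find nothing. The helper name is illustrative.

```python
from importlib import metadata


def installed_versions(prefixes):
    """Return {distribution_name: version} for every installed distribution
    whose lowercased name starts with one of the given prefixes."""
    wanted = tuple(p.lower() for p in prefixes)
    found = {}
    for dist in metadata.distributions():
        name = (dist.metadata["Name"] or "").lower()
        if name and name.startswith(wanted):
            found[name] = dist.version
    return found


# Prefix matching is deliberately loose: "torch" also matches torchvision.
for name, version in sorted(
    installed_versions(["torch", "spconv", "kornia", "opencv"]).items()
):
    print(f"{name}=={version}")
```

Comparing this output across the two machines, and against the versions the maintainers tested with, narrows down whether a package difference is a plausible cause.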

You can contact me on WeChat at 13261057196 for more frequent discussion, in case I am not always checking the GitHub issues in time.

klightz commented 2 years ago

Thanks a lot. I agree that the second one is the likely reason, since I did not make any specific changes to that file after running the script from CenterPoint. Here is my val pkl file: Google Drive.

BTW, I cannot see your URL for infos_val_10sweeps_withvelo_filter_True.pkl in the above message. Could you double-check?

I may also try downgrading spconv. Thanks a lot for the quick reply.

yukang2017 commented 2 years ago

Sorry, I am trying to upload it to Google Drive, but the speed is a bit slow. Thank you for your understanding.

klightz commented 2 years ago

Of course, take your time. I just want to make sure I am not missing something.

yukang2017 commented 2 years ago

Thanks for pointing out this issue. Note that this is fixed by this commit.

ZecCheng commented 1 year ago

Hi klightz, I have faced the same issue as you. Did you solve the problem? Was the gap caused by infos_val_10sweeps_withvelo_filter_True.pkl lacking image info, or by a different version of spconv? Looking forward to your reply, thanks a lot! :D

dvlab-research / FocalsConv

Can not reproduce the result by loading the provided checkpoint on my machines #16