tianweiy / MVP

MIT License
MVP reproduction on NuScenes with OpenPCDet (mAP:64.22 NDS:68.96) #24

Open HatakeKiki opened 2 years ago

HatakeKiki commented 2 years ago

Hi! Thank you for your great work! I'm trying to reproduce the MVP (CenterPoint-VoxelNet) results with OpenPCDet, but my results (mAP: 64.22, NDS: 68.96) fall short of your official results (mAP: 66, NDS: 69.9). I modified the point-loading functions and the code that depends on the point dimension (5 to 22), following your implementation. I checked the voxel features and nothing looks wrong to me. Could the gap come from test-time augmentation (TTA)? The default CenterPoint config file in OpenPCDet does not appear to use TTA. Did you get the mAP 66 result with TTA?

tianweiy commented 2 years ago

I didn't use any TTA for the 69.9 result.

tianweiy commented 2 years ago

Additionally, the CenterPoint baseline in OpenPCDet is 0.5 NDS lower than ours, so the remaining difference is about 0.4 NDS (69.9 - 68.96 = 0.94, minus the 0.5 baseline gap).

Do you use a voxelization similar to ours? https://github.com/tianweiy/CenterPoint/blob/db36c497a71014961c1ec17042a7524a79d4e792/det3d/models/readers/dynamic_voxel_encoder.py#L19
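For readers unfamiliar with the term, dynamic voxelization assigns every point to a voxel without capping the number of points per voxel, and a mean encoder then averages the point features inside each voxel. The following is a minimal NumPy sketch of that idea only. It is not the code at the link above, and the function name `dynamic_mean_voxelize`, its arguments, and the toy values are all hypothetical.

```python
import numpy as np

def dynamic_mean_voxelize(points, voxel_size, pc_range):
    """Average point features per voxel, with no cap on points per voxel.

    points:     (N, C) array whose first three columns are x, y, z.
    voxel_size: three voxel edge lengths.
    pc_range:   [x_min, y_min, z_min, x_max, y_max, z_max].
    Returns (M, 3) integer voxel coordinates and (M, C) mean features.
    """
    points = np.asarray(points, dtype=np.float64)
    voxel_size = np.asarray(voxel_size, dtype=np.float64)
    lo = np.asarray(pc_range[:3], dtype=np.float64)
    hi = np.asarray(pc_range[3:], dtype=np.float64)

    # Drop points that fall outside the detection range.
    inside = np.all((points[:, :3] >= lo) & (points[:, :3] < hi), axis=1)
    points = points[inside]

    # Integer voxel index of every remaining point.
    coords = np.floor((points[:, :3] - lo) / voxel_size).astype(np.int64)

    # Group points by voxel; `inverse` maps each point to its voxel row.
    voxels, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)

    # Scatter-add the features, then divide by the per-voxel point count.
    sums = np.zeros((len(voxels), points.shape[1]))
    np.add.at(sums, inverse, points)
    counts = np.bincount(inverse, minlength=len(voxels))
    return voxels, sums / counts[:, None]

# Toy input: two points share a voxel, one sits alone, one is out of range.
toy_points = [
    [0.1, 0.1, 0.1, 1.0],
    [0.2, 0.2, 0.2, 3.0],
    [1.5, 0.1, 0.1, 5.0],
    [10.0, 0.0, 0.0, 7.0],
]
toy_voxels, toy_feats = dynamic_mean_voxelize(
    toy_points, voxel_size=[1.0, 1.0, 1.0], pc_range=[0, 0, 0, 2, 2, 2]
)
```

Note that the voxel index is computed from the first three columns only, so a point whose x, y, z columns are wrong lands in the wrong voxel even if the rest of its features are correct.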

HatakeKiki commented 2 years ago

My reproduction of CenterPoint with OpenPCDet (mAP: 58.81, NDS: 66.32) is indeed slightly lower than yours. I use the same voxelization method as yours (dynamic voxelization, with the points padded in the same way). However, I pad the points immediately after loading, using the layout below, and they are voxelized properly. Then I set the first 3 channels of the real and virtual points to zero and apply MeanVFE and scaling.

Lidar points | x | y | z | i | t | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1
Real points | x | y | z | 0 | 0 | x | y | z | c | c | c | c | c | c | c | c | c | c | s | t | 1 | 0
Virtual points | x | y | z | 0 | 0 | x | y | z | c | c | c | c | c | c | c | c | c | c | s | t | 0 | 0
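The 22-channel layout in the table above can be sketched in NumPy as follows. This is an illustration under assumptions, not the actual OpenPCDet or MVP loading code: the helper names `pad_lidar` and `pad_painted` are hypothetical, and the real and virtual points are assumed to arrive as 15-column arrays ordered x, y, z, ten `c` channels, `s`, `t`, as the table suggests.

```python
import numpy as np

NUM_CHANNELS = 22  # total width of the padded layout in the table

def pad_lidar(lidar):
    """Pad (N, 5) lidar points [x, y, z, i, t] to the 22-channel layout."""
    out = np.zeros((lidar.shape[0], NUM_CHANNELS), dtype=np.float32)
    out[:, 0:5] = lidar   # x, y, z, i, t
    out[:, 21] = 1.0      # last channel flags lidar points
    return out

def pad_painted(painted, is_real):
    """Pad (N, 15) real or virtual points [x, y, z, c*10, s, t].

    Assumption: the input column order matches channels 5..19 of the table.
    """
    out = np.zeros((painted.shape[0], NUM_CHANNELS), dtype=np.float32)
    out[:, 0:3] = painted[:, 0:3]        # xyz copy used to assign voxels
    out[:, 5:20] = painted               # x, y, z, c*10, s, t
    out[:, 20] = 1.0 if is_real else 0.0 # channel 20 flags real points
    return out

# Toy inputs with made-up values.
lidar = np.array([[1.0, 2.0, 3.0, 0.5, 0.0]], dtype=np.float32)
painted = np.arange(15, dtype=np.float32).reshape(1, 15) + 1.0

points = np.concatenate(
    [pad_lidar(lidar), pad_painted(painted, True), pad_painted(painted, False)]
)
```

Keeping a copy of x, y, z in the first three channels is what lets the voxelizer place real and virtual points correctly before those channels are zeroed.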

I just found that the bin files in my gt_database are stored as:

Lidar points | x | y | z | i | t | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1
Real points | 0 | 0 | 0 | 0 | 0 | x | y | z | c | c | c | c | c | c | c | c | c | c | s | t | 1 | 0
Virtual points | 0 | 0 | 0 | 0 | 0 | x | y | z | c | c | c | c | c | c | c | c | c | c | s | t | 0 | 0

When these samples were loaded in GT_Aug, the first 3 channels of the real and virtual points were still zero, so those points were not assigned to the correct voxels. I'll fix this and see how the results change.
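One possible way to repair such samples at load time is to copy x, y, z back from channels 5 to 7 for every non-lidar point. The sketch below assumes the channel layout from the tables in this thread (lidar flag in channel 21); `restore_xyz` is a hypothetical helper, not a function from OpenPCDet or MVP.

```python
import numpy as np

def restore_xyz(points, lidar_flag_channel=21):
    """Return a copy in which non-lidar points get xyz back in channels 0..2.

    Assumes the 22-channel layout from the thread: real and virtual points
    keep their coordinates in channels 5..7, and lidar points carry a 1 in
    the last channel.
    """
    fixed = points.copy()
    non_lidar = fixed[:, lidar_flag_channel] == 0
    fixed[non_lidar, 0:3] = fixed[non_lidar, 5:8]
    return fixed

# Toy database sample: one lidar point and one real point with zeroed xyz.
sample = np.zeros((2, 22), dtype=np.float32)
sample[0, 0:5] = [1.0, 2.0, 3.0, 0.5, 0.0]  # lidar: x, y, z, i, t
sample[0, 21] = 1.0                          # lidar flag
sample[1, 5:8] = [4.0, 5.0, 6.0]             # real point xyz, channels 5..7
sample[1, 20] = 1.0                          # real-point flag

repaired = restore_xyz(sample)
```

Regenerating the gt_database with the correct layout would avoid the extra step; the helper is only meant to show which channels are involved.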

HatakeKiki commented 2 years ago

@tianweiy Could you tell me how long MVP took you to train? I used 4 RTX 2080 GPUs. CenterPoint takes about 31 hours, but MVP takes about 10 days, which seems too long.

HatakeKiki commented 2 years ago

Something was wrong with my I/O, and I fixed it. Now it takes 3 days to train MVP with 4 RTX 3090 GPUs.

tianweiy commented 2 years ago

Thanks for the update. It also takes about 2 to 3 days on 4 V100s. I think the training time depends heavily on I/O and CPU speed, since the inference time is actually similar to vanilla CenterPoint (maybe 20% slower).
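A quick way to check whether a training run is limited by data loading rather than by the model is to time the two phases separately. The sketch below uses only the Python standard library; `profile_loop`, `slow_loader`, and `fast_step` are hypothetical stand-ins for a real data loader and training step. With a GPU model the step function would also need to synchronize the device (for example with `torch.cuda.synchronize()`) before returning, otherwise the step time is underestimated.

```python
import time

def profile_loop(batches, step_fn, max_steps=50):
    """Return (seconds spent waiting for data, seconds spent in step_fn, steps)."""
    data_time = 0.0
    step_time = 0.0
    steps = 0
    iterator = iter(batches)
    for _ in range(max_steps):
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # blocks while the loader prepares a batch
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)              # forward/backward pass in a real run
        t2 = time.perf_counter()
        data_time += t1 - t0
        step_time += t2 - t1
        steps += 1
    return data_time, step_time, steps

def slow_loader(num_batches, delay):
    """Stand-in for an I/O-bound loader: each batch takes `delay` seconds."""
    for index in range(num_batches):
        time.sleep(delay)
        yield index

def fast_step(batch):
    """Stand-in for a cheap training step."""
    return batch * 2

data_s, step_s, n_steps = profile_loop(slow_loader(5, 0.02), fast_step)
print(f"data: {data_s:.3f}s  step: {step_s:.3f}s  over {n_steps} steps")
```

If the data time dominates, faster storage or more loader workers are likely to help more than a faster GPU.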