BEVFusion: Wrong pytorch model/keys missing or misaligned while running 'demo'

pravn commented 1 year ago

Prerequisite

[X] I have searched Issues and Discussions but cannot get the expected help.
[X] I have read the FAQ documentation but cannot get the expected help.
[X] The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

I am unable to run the demo example in the README. It appears that the model has incompatible/missing keys.

The following error is obtained:

`Loads checkpoint by local backend from path: bevfusion_converted.pth`
`The model and loaded state dict do not match exactly`

Reproduces the problem - code sample

python demo/multi_modality_demo.py demo/data/nuscenes/n015-2018-07-24-11-22-45+0800__LIDAR_TOP__1532402927647951.pcd.bin demo/data/nuscenes/ demo/data/nuscenes/n015-2018-07-24-11-22-45+0800.pkl projects/BEVFusion/configs/bevfusion_lidar-cam_voxel0075_second_secfpn_8xb4-cyclic-20e_nus-3d.py ${CHECKPOINT_FILE} --cam-type all --score-thr 0.2 --show

Reproduces the problem - command or script

python demo/multi_modality_demo.py demo/data/nuscenes/n015-2018-07-24-11-22-45+0800__LIDAR_TOP__1532402927647951.pcd.bin demo/data/nuscenes/ demo/data/nuscenes/n015-2018-07-24-11-22-45+0800.pkl projects/BEVFusion/configs/bevfusion_lidar-cam_voxel0075_second_secfpn_8xb4-cyclic-20e_nus-3d.py ${CHECKPOINT_FILE} --cam-type all --score-thr 0.2 --show

Reproduces the problem - error message

The following error is obtained:

The model and loaded state dict do not match exactly

unexpected key in source state_dict: vtransform.dx, vtransform.bx, vtransform.nx, vtransform.frustum,
vtransform.dtransform.0.weight, vtransform.dtransform.0.bias, vtransform.dtransform.1.weight, vtransform.dtransform.1.bias, vtransform.dtransform.1.running_mean, vtransform.dtransform.1.running_var, vtransform.dtransform.1.num_batches_tracked, vtransform.dtransform.3.weight, vtransform.dtransform.3.bias, vtransform.dtransform.4.weight, vtransform.dtransform.4.bias, vtransform.dtransform.4.running_mean, vtransform.dtransform.4.running_var, vtransform.dtransform.4.num_batches_tracked, vtransform.dtransform.6.weight, vtransform.dtransform.6.bias, vtransform.dtransform.7.weight, vtransform.dtransform.7.bias, vtransform.dtransform.7.running_mean, vtransform.dtransform.7.running_var, vtransform.dtransform.7.num_batches_tracked,
vtransform.depthnet.0.weight, vtransform.depthnet.0.bias, vtransform.depthnet.1.weight, vtransform.depthnet.1.bias, vtransform.depthnet.1.running_mean, vtransform.depthnet.1.running_var, vtransform.depthnet.1.num_batches_tracked, vtransform.depthnet.3.weight, vtransform.depthnet.3.bias, vtransform.depthnet.4.weight, vtransform.depthnet.4.bias, vtransform.depthnet.4.running_mean, vtransform.depthnet.4.running_var, vtransform.depthnet.4.num_batches_tracked, vtransform.depthnet.6.weight, vtransform.depthnet.6.bias,
vtransform.downsample.0.weight, vtransform.downsample.1.weight, vtransform.downsample.1.bias, vtransform.downsample.1.running_mean, vtransform.downsample.1.running_var, vtransform.downsample.1.num_batches_tracked, vtransform.downsample.3.weight, vtransform.downsample.4.weight, vtransform.downsample.4.bias, vtransform.downsample.4.running_mean, vtransform.downsample.4.running_var, vtransform.downsample.4.num_batches_tracked, vtransform.downsample.6.weight, vtransform.downsample.7.weight, vtransform.downsample.7.bias, vtransform.downsample.7.running_mean, vtransform.downsample.7.running_var, vtransform.downsample.7.num_batches_tracked
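
A mismatch report like the one above can be easier to read when the keys are grouped by their top-level module. The following is only a minimal sketch with made-up key sets: `compare_keys` and `count_by_prefix` are hypothetical helper names, not part of mmdet3d or PyTorch, and the `view_transform` name in the toy data is taken from a later comment in this thread.

```python
from collections import Counter


def compare_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, both sorted.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


def count_by_prefix(keys):
    """Count keys per top-level module name (the text before the first dot)."""
    return Counter(key.split(".", 1)[0] for key in keys)


# Toy data only; a real run would use model.state_dict().keys()
# and the keys of the loaded checkpoint.
model_keys = ["view_transform.dx", "view_transform.depthnet.0.weight"]
checkpoint_keys = ["vtransform.dx", "vtransform.depthnet.0.weight"]

missing, unexpected = compare_keys(model_keys, checkpoint_keys)
print("missing by prefix:   ", dict(count_by_prefix(missing)))
print("unexpected by prefix:", dict(count_by_prefix(unexpected)))
```

If every unexpected key shares one prefix, as in the log above, that points to a renamed submodule rather than a different architecture.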

Additional information

No response

sunjiahao1999 commented 1 year ago

You should use the new checkpoints in the BEVFusion README.md.

hepingpeace commented 1 year ago

You should use the new checkpoints in the BEVFusion README.md.

Thank you for the new checkpoints. However, when I run the demo with this checkpoint, the predicted bboxes do not match the objects. Is it because the annotation file used by the demo differs from the one generated by tools/create_data.py for nuScenes? bevfusion_converted.pth has the same problem.

pravn commented 1 year ago

I got the keys to align with a hack: load the checkpoint and replace vtransform with view_transform in the key names. However, as @hepingpeace notes, the boxes produced are incorrect. [image: demo]

Using the new checkpoint given above by @sunjiahao1999 seems to produce the correct result. [image: demo]
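
For reference, the key renaming described in this comment could look roughly like the sketch below. It is an illustration under stated assumptions, not the exact code that was used: `rename_prefix` and `convert_checkpoint` are hypothetical names, and the assumption that the weights sit under a `"state_dict"` entry may not hold for every checkpoint. As noted in this comment, renaming only makes the keys line up; the boxes from the old checkpoint were still wrong, and the new checkpoint was the one that gave the correct result.

```python
from collections import OrderedDict


def rename_prefix(state_dict, old="vtransform.", new="view_transform."):
    """Return a copy of state_dict where keys starting with `old` start with `new`."""
    renamed = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(old):
            key = new + key[len(old):]
        renamed[key] = value
    return renamed


def convert_checkpoint(src_path, dst_path):
    """Hypothetical helper: load a checkpoint, rename the keys, save a copy."""
    import torch  # imported lazily so rename_prefix works without PyTorch

    ckpt = torch.load(src_path, map_location="cpu")
    # Assumption: weights live under "state_dict"; otherwise treat the
    # loaded object itself as the state dict.
    if "state_dict" in ckpt:
        ckpt["state_dict"] = rename_prefix(ckpt["state_dict"])
    else:
        ckpt = rename_prefix(ckpt)
    torch.save(ckpt, dst_path)
```

A call such as `convert_checkpoint("bevfusion_converted.pth", "bevfusion_renamed.pth")` would write a renamed copy and leave the original file untouched; the output filename here is made up.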

jeromefan commented 1 year ago

Hi, I encountered the same issue. When I use the pretrained model (bevfusion_converted.pth) from the mmdet3d docs, the result is the same as @pravn's first picture:

[image: bevfusion_converted]

After reading this issue, I tried the pretrained model (bevfusion_lidar-cam_voxel0075_second_secfpn_8xb4-cyclic-20e_nus-3d-5239b1af.pth) from @sunjiahao1999, but the performance is worse:

[image: bevfusion_lidar_cam_voxel0075_second_secfpn_8xb4_cyclic_20e_nus_3d]

The pred_score_thr is set to 0.0.

I tried two environments:

Environment 1: python=3.9, pytorch=2.0.1, mmcv=2.1.0, mmengine=0.9.0, mmdet=3.2.0, mmdet3d=1.3.0 (mmdet3d downloaded from the GitHub repository, tag v1.3.0).

Environment 2: python=3.9, pytorch=2.1.1, mmcv=2.1.0, mmengine=0.9.0, mmdet=3.2.0, mmdet3d=1.2.0 (mmdet3d downloaded from the GitHub repository, branch dev-1.x).

System: Ubuntu 20.04, GPU: 4090, CUDA version: 11.8.

HydrogenWasser commented 1 year ago

Hi, I've tried both bevfusion_lidar-cam_voxel0075_second_secfpn_8xb4-cyclic-20e_nus-3d-5239b1af.pth and bevfusion_converted.pth, but both returned "The model and loaded state dict do not match exactly", as posted above.

I am using the latest version of mm3d. Which version of mm3d were you using when you successfully ran the demo?

zxzheng826 commented 11 months ago

I found out what went wrong. It seems crucial to install spconv 2.0 to run the BEVFusion project correctly. This is mentioned in Note 2 of the mmdet3d installation guide. After I installed cumm-cuxx and spconv-cuxx with pip, the problem disappeared immediately. So checking the spconv 2.0 flag in your environment may help you diagnose the issue.
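
One way to check whether the pip packages mentioned above are present is to list installed distributions whose names start with `spconv` or `cumm`. This is a minimal standard-library sketch: `find_installed` is a hypothetical helper name, and it only reports what pip has installed; it does not confirm that mmdet3d actually picks up spconv 2.0 at runtime.

```python
from importlib import metadata


def find_installed(prefixes=("spconv", "cumm")):
    """Return {distribution name: version} for installed packages whose
    lowercased name starts with one of the given prefixes."""
    found = {}
    for dist in metadata.distributions():
        name = (dist.metadata.get("Name") or "").lower()
        if name.startswith(tuple(prefixes)):
            found[name] = dist.version
    return found


installed = find_installed()
if installed:
    for name, version in sorted(installed.items()):
        print(f"{name}=={version}")
else:
    print("No spconv/cumm distribution found in this environment.")
```

The CUDA suffix of the package names (the "cuxx" part) depends on the local CUDA version, so the sketch matches on the prefix only.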

HydrogenWasser commented 11 months ago

many thx. i'll go check and give it a try.

jeromefan commented 11 months ago

I found out what went wrong. It seems crucial to install spconv 2.0 to run the BEVFusion project correctly. This is mentioned in Note 2 of the mmdet3d installation guide. After I installed cumm-cuxx and spconv-cuxx with pip, the problem disappeared immediately. So checking the spconv 2.0 flag in your environment may help you diagnose the issue.

@zxzheng826 That was very useful, together with the checkpoint provided by @sunjiahao1999. Can't thank you enough!!!! Awesome!!!

[image: nusc]
