As shown in the attached figures, I have included the loss curve from my training process and a visualization of my final test result. Here is the output I got when I tested the trained model:
(openmmlab) liyf@l526-System-Product-Name:~/mmdetection3d$ python demo/pcd_demo.py demo/data/kitti/000008.bin pointpillars_hv_secfpn_8xb6-160e_kitti-3d-car.py "/home/liyf/epoch_80.pth" --show
/home/liyf/mmdetection3d/mmdet3d/models/dense_heads/anchor3d_head.py:94: UserWarning: dir_offset and dir_limit_offset will be depressed and be incorporated into box coder in the future
warnings.warn(
Loads checkpoint by local backend from path: /home/liyf/epoch_80.pth
The model and loaded state dict do not match exactly
size mismatch for bbox_head.conv_cls.weight: copying a param with shape torch.Size([18, 384, 1, 1]) from checkpoint, the shape in current model is torch.Size([2, 384, 1, 1]).
size mismatch for bbox_head.conv_cls.bias: copying a param with shape torch.Size([18]) from checkpoint, the shape in current model is torch.Size([2]).
size mismatch for bbox_head.conv_reg.weight: copying a param with shape torch.Size([42, 384, 1, 1]) from checkpoint, the shape in current model is torch.Size([14, 384, 1, 1]).
size mismatch for bbox_head.conv_reg.bias: copying a param with shape torch.Size([42]) from checkpoint, the shape in current model is torch.Size([14]).
size mismatch for bbox_head.conv_dir_cls.weight: copying a param with shape torch.Size([12, 384, 1, 1]) from checkpoint, the shape in current model is torch.Size([4, 384, 1, 1]).
size mismatch for bbox_head.conv_dir_cls.bias: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([4]).
/home/liyf/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/visualization/visualizer.py:196: UserWarning: Failed to add <class 'mmengine.visualization.vis_backend.LocalVisBackend'>, please provide the save_dir argument.
warnings.warn(f'Failed to add {vis_backend.__class__}, '
/home/liyf/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2894.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Judging from the loss curve, the training process itself seems fine, but the test result is clearly wrong. How can I solve this problem?
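For reference, this is a minimal sketch I used to double-check which head shapes are actually stored in the checkpoint, so they can be compared against the config passed to the demo script. It assumes the usual MMEngine checkpoint layout where the weights live under a 'state_dict' key; the parameter names are taken from the size-mismatch warnings above.

# Minimal sketch: print the detection-head shapes saved in the checkpoint.
# Assumption: the checkpoint stores its weights under the 'state_dict' key
# (the common MMEngine layout); otherwise it is treated as a raw state dict.
import torch

ckpt = torch.load("/home/liyf/epoch_80.pth", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)

for key in ("bbox_head.conv_cls.weight",
            "bbox_head.conv_reg.weight",
            "bbox_head.conv_dir_cls.weight"):
    if key in state_dict:
        print(key, tuple(state_dict[key].shape))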