JHLee0513 / semantic_bevnet

Code repository for Semantic Terrain Classification for Off-Road Autonomous Driving (https://openreview.net/forum?id=AL4FPs84YdQ) (CoRL 2021)

Size mismatch when running pre-trained model #2

Open TankyFranky opened 2 years ago

TankyFranky commented 2 years ago

Hi guys,

Very interesting paper. I was looking to reproduce your results and test your pre-trained model with some of my own point cloud data, but I am having a pretty difficult time getting the example running. The main issues were with spconv versions (I assumed you guys used V1, since V2 has replaced VoxelGenerator with PointToVoxel), but I solved those.

When loading the pre-trained weights using the information under 'Running the pretrained models', I get the following error:

Traceback (most recent call last):
  File "test_recurrent.py", line 28, in <module>
    model = BEVNetRecurrent(MODEL_FILE)
  File "/home/franc/Semantic_BEVNet/semantic_bevnet/bevnet/inference.py", line 97, in __init__
    super(BEVNetRecurrent, self).__init__(*args, **kwargs)
  File "/home/franc/Semantic_BEVNet/semantic_bevnet/bevnet/inference.py", line 31, in __init__
    self._load(weights_file, device)
  File "/home/franc/Semantic_BEVNet/semantic_bevnet/bevnet/inference.py", line 53, in _load
    net.load_state_dict(state_dict['nets'][name])
  File "/home/franc/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for SpMiddleNoDownsampleXYMultiStep:
    size mismatch for middle_conv.0.weight: copying a param with shape torch.Size([3, 3, 3, 4, 32]) from checkpoint, the shape in current model is torch.Size([32, 3, 3, 3, 4]).
    size mismatch for middle_conv.3.weight: copying a param with shape torch.Size([3, 3, 3, 32, 32]) from checkpoint, the shape in current model is torch.Size([32, 3, 3, 3, 32]).
    size mismatch for middle_conv.6.weight: copying a param with shape torch.Size([3, 3, 3, 32, 64]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 3, 32]).
    size mismatch for middle_conv.9.weight: copying a param with shape torch.Size([3, 3, 3, 64, 64]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 3, 64]).
    size mismatch for middle_conv.12.weight: copying a param with shape torch.Size([3, 3, 3, 64, 64]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 3, 64]).
    size mismatch for middle_conv.15.weight: copying a param with shape torch.Size([3, 3, 3, 64, 64]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 3, 64]).
    size mismatch for middle_conv.18.weight: copying a param with shape torch.Size([3, 3, 3, 64, 64]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 3, 64]).
    size mismatch for middle_conv.21.weight: copying a param with shape torch.Size([3, 3, 3, 64, 64]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 3, 64]).
    size mismatch for middle_conv.24.weight: copying a param with shape torch.Size([3, 3, 3, 64, 64]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 3, 64]).
    size mismatch for middle_conv.27.weight: copying a param with shape torch.Size([3, 3, 3, 64, 64]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 3, 64]).
    size mismatch for middle_conv.30.weight: copying a param with shape torch.Size([3, 1, 1, 64, 64]) from checkpoint, the shape in current model is torch.Size([64, 3, 1, 1, 64]).

It looks like the weights are all there but stored in the wrong shape. Is there a quick fix for this? I haven't modified the cloned repo at all, so I am trying to find the reason the example won't run as outlined.

Thanks in advance.
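
The observation that the weights are "all there but in the wrong shape" can be checked mechanically. The sketch below is not part of the repository; it parses the `size mismatch` entries from the error text and tests whether every model shape is the checkpoint shape with its last axis moved to the front. The axis order `(4, 0, 1, 2, 3)` and all helper names are assumptions inferred only from the shapes in the message above.

```python
import re

# Matches one "size mismatch" entry in the RuntimeError text from load_state_dict.
MISMATCH_RE = re.compile(
    r"size mismatch for (?P<name>[\w.]+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]+)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<model>[\d, ]+)\]\)"
)

# Assumed axis order (inferred from the error message, not from spconv docs):
# the last checkpoint axis moves to the front, the rest keep their order.
AXES = (4, 0, 1, 2, 3)


def parse_shape(text):
    """Turn '3, 3, 3, 4, 32' into (3, 3, 3, 4, 32)."""
    return tuple(int(part) for part in text.split(","))


def parse_mismatches(error_text):
    """Return a list of (param_name, checkpoint_shape, model_shape)."""
    return [
        (m["name"], parse_shape(m["ckpt"]), parse_shape(m["model"]))
        for m in MISMATCH_RE.finditer(error_text)
    ]


def permuted(shape, axes=AXES):
    """Shape that results from reordering the axes of `shape` by `axes`."""
    return tuple(shape[a] for a in axes)


def explained_by_permutation(error_text, axes=AXES):
    """True if every reported mismatch is exactly the assumed axis reordering."""
    mismatches = parse_mismatches(error_text)
    return bool(mismatches) and all(
        len(ckpt) == len(axes) and permuted(ckpt, axes) == model
        for _, ckpt, model in mismatches
    )


if __name__ == "__main__":
    sample = (
        "size mismatch for middle_conv.0.weight: copying a param with shape "
        "torch.Size([3, 3, 3, 4, 32]) from checkpoint, the shape in current "
        "model is torch.Size([32, 3, 3, 3, 4])."
    )
    print(parse_mismatches(sample))
    print(explained_by_permutation(sample))
```

If the check passes for every entry, the checkpoint and the model differ only in weight layout, which points to a library version difference rather than a corrupted file. In PyTorch the matching tensor operation would be `weight.permute(4, 0, 1, 2, 3).contiguous()`, but whether permuted weights reproduce the original numerical results is untested here.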

guanglei96 commented 1 year ago

Hello, I have encountered the same problem. Have you solved it? If so, could you tell me how? I would be extremely grateful.

TankyFranky commented 1 year ago

@MasterLei9527 I ended up solving the issue by building spconv V1.2.1 from source. I don't remember the exact step-by-step instructions, since this project was months ago and I have since moved on to other projects. spconv V1 is a requirement; V2 is the cause of this error.
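
Since the reply above attributes the error to spconv V2 being installed where V1 is required, it may help to confirm which version is present before loading the weights. The following is a minimal sketch using only the standard library (Python 3.8+); the helper names are hypothetical, and matching distributions by the `spconv` name prefix is an assumption made because the package may be installed under a suffixed name.

```python
from importlib import metadata


def installed_spconv():
    """Return (name, version) for each installed distribution named 'spconv*'."""
    found = []
    for dist in metadata.distributions():
        # Some broken installs have no Name field; treat those as non-matching.
        name = dist.metadata.get("Name") or ""
        if name.lower().startswith("spconv"):
            found.append((name, dist.version))
    return found


def is_spconv_v1(version):
    """True if the major component of a version string such as '1.2.1' is 1."""
    return version.split(".")[0].lstrip("vV") == "1"


if __name__ == "__main__":
    packages = installed_spconv()
    if not packages:
        print("No spconv distribution found in this environment.")
    for name, version in packages:
        status = "V1 (expected by this thread's fix)" if is_spconv_v1(version) else "not V1"
        print(f"{name} {version}: {status}")
```

A source build that was not installed through pip may not show up in this listing, so an empty result does not prove that spconv is missing.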

MA-xiaowen commented 1 year ago

Hello, I'm very sorry to bother you. From your reply, I understand that you used your own data for experiments. Did you use rosbag data, or data organized like the KITTI data structure? If you used rosbag, could you share your experience? Thanks in advance.