Open themousepotato opened 4 years ago
I have encountered the same problem. Have you solved it?
@USTC-Keyanjie Nope, haven't solved it.
Are you on master branch? The pretrained model is for second V1.5, and there have been breaking changes since then.
@jhultman I'm on the master branch. What did you mean by the pretrained model being for V1.5? I've checked the V1.5 branch and haven't seen that. The pretrained weights are in the Google Drive link, right? Also, did you get that to work?
@jhultman I'm sure that I'm on the master branch, and the pretrained model weight file is for version 1.5. But the problem still happens.
I don't think the pretrained model is compatible with the master branch since that model was trained using the V1.5 codebase. Use the V1.5 branch if you need the pretrained model.
Hi @USTC-Keyanjie, I think he is suggesting switching to branch v1.5 and not the master branch.
However, I tried switching to branch v1.5 and it still doesn't solve the problem. @jhultman Could you let us know how you solved the problem?
@witignite Can you post your error message? I still think the problem discussed in this GitHub issue can be solved by re-cloning second.pytorch and executing git checkout v1.5. I just tried cloning this repo on a Linux machine, configuring a new conda environment with torch 1.1, checking out the v1.5 branch of second.pytorch, and using spconv at git commit 7342772. I could then successfully run the following code snippet:
import torch
from second.protos import pipeline_pb2
from google.protobuf import text_format
from second.builder import target_assigner_builder, voxel_builder
from second.pytorch.builder import second_builder, box_coder_builder
config_path = './configs/car.fhd.config'
config = pipeline_pb2.TrainEvalPipelineConfig()
with open(config_path, "r") as f:
    proto_str = f.read()
text_format.Merge(proto_str, config)
model_cfg = config.model.second
target_assigner_cfg = model_cfg.target_assigner
voxel_generator = voxel_builder.build(model_cfg.voxel_generator)
bv_range = voxel_generator.point_cloud_range[[0, 1, 3, 4]]
box_coder = box_coder_builder.build(model_cfg.box_coder)
target_assigner = target_assigner_builder.build(target_assigner_cfg, bv_range, box_coder)
net = second_builder.build(model_cfg, voxel_generator, target_assigner)
fpath = './pretrained_models_v1.5/car_fhd/voxelnet-74280.tckpt'
ckpt = torch.load(fpath)
keys_ckpt = set(ckpt.keys())
keys_net = set(net.state_dict().keys())
assert keys_ckpt == keys_net, 'Keys do not match'
net.load_state_dict(ckpt)
print('Restored successfully...')
I was also able to replicate the original error message in this thread by purposely using the master branch config file with the v1.5 pretrained weights:
RuntimeError: Error(s) in loading state_dict for VoxelNet:
Unexpected key(s) in state_dict: "rpn.blocks.1.1.weight", "rpn.blocks.1.2.weight" ...
size mismatch for rpn.deblocks.0.0.weight: copying a param with shape torch.Size([128, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 128, 1, 1]) ...
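If you want to see the whole mismatch at once instead of reading it out of the RuntimeError, something like the sketch below might help. It is not part of second.pytorch; diff_state_shapes is a hypothetical helper that works on plain name-to-shape dicts. With torch you could presumably build those with {k: tuple(v.shape) for k, v in state_dict.items()} from both the checkpoint and net.state_dict(). The shape given for rpn.blocks.1.1.weight is made up for illustration; only the deblocks shapes come from the error above.

```python
from typing import Dict, Mapping, Tuple

Shape = Tuple[int, ...]


def diff_state_shapes(
    ckpt_shapes: Mapping[str, Shape],
    model_shapes: Mapping[str, Shape],
) -> Dict[str, list]:
    """Compare parameter names and shapes of a checkpoint against a model."""
    ckpt_keys, model_keys = set(ckpt_shapes), set(model_shapes)
    return {
        # In the checkpoint but not in the model ("Unexpected key(s)").
        "unexpected": sorted(ckpt_keys - model_keys),
        # In the model but not in the checkpoint ("Missing key(s)").
        "missing": sorted(model_keys - ckpt_keys),
        # Present in both, but with different shapes ("size mismatch").
        "mismatched": sorted(
            (key, tuple(ckpt_shapes[key]), tuple(model_shapes[key]))
            for key in ckpt_keys & model_keys
            if tuple(ckpt_shapes[key]) != tuple(model_shapes[key])
        ),
    }


# Toy input: the deblocks shapes are from the error message in this thread,
# the blocks.1.1 shape is a placeholder (hypothetical).
ckpt = {
    "rpn.deblocks.0.0.weight": (128, 256, 1, 1),
    "rpn.blocks.1.1.weight": (256, 128, 3, 3),
}
model = {
    "rpn.deblocks.0.0.weight": (128, 128, 1, 1),
}

report = diff_state_shapes(ckpt, model)
for kind, entries in report.items():
    print(kind, entries)
```

Any entries under "unexpected" or "mismatched" suggest the config used to build the net does not belong to the same branch as the weights, which is what happened here.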
@jhultman I'm getting the following error after switching to v1.5:
(base) navaneeth@mousebox:~/workspace/3d-object-detection/second.pytorch.v1.5/second$ python pytorch/train.py evaluate --config_path=configs/car.fhd.config --model_dir=second --measure_time=True --batch_size=1
/home/navaneeth/anaconda3/lib/python3.7/site-packages/dask/config.py:168: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
data = yaml.load(f.read()) or {}
Traceback (most recent call last):
File "pytorch/train.py", line 689, in <module>
fire.Fire()
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/fire/core.py", line 138, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/fire/core.py", line 471, in _Fire
target=component.__name__)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/fire/core.py", line 675, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "pytorch/train.py", line 573, in evaluate
text_format.Merge(proto_str, config)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/google/protobuf/text_format.py", line 693, in Merge
allow_unknown_field=allow_unknown_field)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/google/protobuf/text_format.py", line 760, in MergeLines
return parser.MergeLines(lines, message)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/google/protobuf/text_format.py", line 785, in MergeLines
self._ParseOrMerge(lines, message)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/google/protobuf/text_format.py", line 807, in _ParseOrMerge
self._MergeField(tokenizer, message)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/google/protobuf/text_format.py", line 932, in _MergeField
merger(tokenizer, message, field)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/google/protobuf/text_format.py", line 1006, in _MergeMessageField
self._MergeField(tokenizer, sub_message)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/google/protobuf/text_format.py", line 932, in _MergeField
merger(tokenizer, message, field)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/google/protobuf/text_format.py", line 1006, in _MergeMessageField
self._MergeField(tokenizer, sub_message)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/google/protobuf/text_format.py", line 899, in _MergeField
(message_descriptor.full_name, name))
google.protobuf.text_format.ParseError: 58:5 : Message type "second.protos.VoxelNet" has no field named "use_aux_classifier".
I also changed to V1.5 but still get the same error:
RuntimeError: Error(s) in loading state_dict for VoxelNet: Unexpected key(s) in state_dict: "rpn.blocks.1.1.weight", "rpn.blocks.1.2.weight", "rpn.blocks.1.2.bias", "rpn.blocks.1.2.running_mean", "rpn.blocks.1.2.running_var", "rpn.blocks.1.2.num_batches_tracked", "rpn.blocks.1.4.weight", "rpn.blocks.1.5.weight", "rpn.blocks.1.5.bias", "rpn.blocks.1.5.running_mean", "rpn.blocks.1.5.running_var", "rpn.blocks.1.5.num_batches_tracked", "rpn.blocks.1.7.weight", "rpn.blocks.1.8.weight", "rpn.blocks.1.8.bias", "rpn.blocks.1.8.running_mean", "rpn.blocks.1.8.running_var", "rpn.blocks.1.8.num_batches_tracked", "rpn.blocks.1.10.weight", "rpn.blocks.1.11.weight", "rpn.blocks.1.11.bias", "rpn.blocks.1.11.running_mean", "rpn.blocks.1.11.running_var", "rpn.blocks.1.11.num_batches_tracked", "rpn.blocks.1.13.weight", "rpn.blocks.1.14.weight", "rpn.blocks.1.14.bias", "rpn.blocks.1.14.running_mean", "rpn.blocks.1.14.running_var", "rpn.blocks.1.14.num_batches_tracked", "rpn.blocks.1.16.weight", "rpn.blocks.1.17.weight", "rpn.blocks.1.17.bias", "rpn.blocks.1.17.running_mean", "rpn.blocks.1.17.running_var", "rpn.blocks.1.17.num_batches_tracked", "rpn.deblocks.1.0.weight", "rpn.deblocks.1.1.weight", "rpn.deblocks.1.1.bias", "rpn.deblocks.1.1.running_mean", "rpn.deblocks.1.1.running_var", "rpn.deblocks.1.1.num_batches_tracked". size mismatch for rpn.deblocks.0.0.weight: copying a param with shape torch.Size([128, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 128, 1, 1]). size mismatch for rpn.deblocks.0.1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for rpn.deblocks.0.1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for rpn.deblocks.0.1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). 
size mismatch for rpn.deblocks.0.1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for rpn.conv_cls.weight: copying a param with shape torch.Size([2, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([2, 128, 1, 1]). size mismatch for rpn.conv_box.weight: copying a param with shape torch.Size([14, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([14, 128, 1, 1]). size mismatch for rpn.conv_dir_cls.weight: copying a param with shape torch.Size([4, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([4, 128, 1, 1]).
@jhultman sorry for my last error message. i'd forgotten to run create_data.py after switching to v1.5 branch. now, i ran your code snippet and it's working successfully and i got the following output:
[ 41 1600 1408]
Restored successfully...
however, $ python pytorch/train.py evaluate --config_path=configs/car.fhd.config --model_dir=second --measure_time=True --batch_size=1
is giving the following error:
/home/navaneeth/anaconda3/lib/python3.7/site-packages/dask/config.py:168: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
data = yaml.load(f.read()) or {}
[ 41 1600 1408]
Restoring parameters from second/voxelnet-74280.tckpt
remain number of infos: 3769
Generate output labels...
Traceback (most recent call last):
File "pytorch/train.py", line 689, in <module>
fire.Fire()
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/fire/core.py", line 138, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/fire/core.py", line 471, in _Fire
target=component.__name__)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/fire/core.py", line 675, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "pytorch/train.py", line 635, in evaluate
for example in iter(eval_dataloader):
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 615, in __next__
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 615, in <listcomp>
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/home/navaneeth/workspace/3d-object-detection/second.pytorch.v1.5/second/pytorch/builder/input_reader_builder.py", line 18, in __getitem__
return self._dataset[idx]
File "/home/navaneeth/workspace/3d-object-detection/second.pytorch.v1.5/second/data/dataset.py", line 70, in __getitem__
prep_func=self._prep_func)
File "/home/navaneeth/workspace/3d-object-detection/second.pytorch.v1.5/second/data/preprocess.py", line 344, in _read_and_prep_v9
example = prep_func(input_dict=input_dict)
File "/home/navaneeth/workspace/3d-object-detection/second.pytorch.v1.5/second/data/preprocess.py", line 233, in prep_pointcloud
points, max_voxels)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/spconv/utils/__init__.py", line 164, in generate
or self._max_voxels, self._full_mean)
File "/home/navaneeth/anaconda3/lib/python3.7/site-packages/spconv/utils/__init__.py", line 65, in points_to_voxel
assert block_filtering is False
AssertionError
@themousepotato I think your issue now comes from using an incompatible version of the spconv library. Try installing spconv using the git hash I mentioned in my previous reply. The latest version of spconv will not work.
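Since several errors in this thread come down to mismatched versions, it may be worth pasting your environment along with the traceback. A minimal sketch, assuming Python 3.8+ for importlib.metadata (the logs in this thread are from Python 3.7, where the importlib_metadata backport would be needed). installed_versions is a hypothetical helper, not part of second.pytorch or spconv.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Return the installed version of each package, or None if it is absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


report = installed_versions(["torch", "spconv", "protobuf"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Note that a source build of spconv reports whatever version string its setup script declares, not the git commit, so this narrows things down but does not confirm you are on commit 7342772. Running git rev-parse HEAD in the spconv checkout you built from is the more direct check.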
@jhultman awesome! i reinstalled the spconv module after checking out the commit that you mentioned. it works fine now. kindly update the README file. thanks!
I think there is no pre-trained model for Nuscenes in the branch v1.5, right?
Sorry, I mean no config files for nuscenes in the branch v1.5.
So apparently checking out commit 7342772 will not solve all issues.
I recently encountered another issue with spconv whilst trying to edit the voxel_size in .config files. If I change the voxel size from
# voxel_size : [0.05, 0.05, 0.1] # fine values
to
voxel_size : [0.2, 0.2, 0.4] # [width, height, depth] official values
I end up with
N > 0 assert faild. CUDA kernel launch blocks must be positive, but got N= 0
see https://github.com/traveller59/second.pytorch/issues/233.
This might be -- according to @traveller59 -- a problem of using an outdated version of spconv.
Apparently only the voxel_size: [0.05, 0.05, 0.1] works, regardless of the point_cloud_range? Strange. :-(
edit: If I switch to some other version, like spconv v1.1, I get the AssertionError of @themousepotato
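One thing that might be worth checking when editing voxel_size is how much the voxel grid changes. A rough sketch of the arithmetic, assuming the grid is simply (range max - range min) / voxel_size per axis. The point_cloud_range below is an assumption on my part and is not stated anywhere in this thread, so substitute the value from your own .config file. grid_size is a hypothetical helper, not a second.pytorch or spconv function.

```python
from typing import List, Sequence


def grid_size(point_cloud_range: Sequence[float], voxel_size: Sequence[float]) -> List[int]:
    """Voxel counts along x, y, z for a [x0, y0, z0, x1, y1, z1] range."""
    mins, maxs = point_cloud_range[:3], point_cloud_range[3:]
    # round() absorbs floating-point noise such as 70.4 / 0.05 = 1407.999...
    return [round((hi - lo) / size) for lo, hi, size in zip(mins, maxs, voxel_size)]


# ASSUMED range, not taken from this thread; check your own config.
pc_range = [0.0, -40.0, -3.0, 70.4, 40.0, 1.0]

fine = grid_size(pc_range, [0.05, 0.05, 0.1])
coarse = grid_size(pc_range, [0.2, 0.2, 0.4])
print("fine:", fine)      # [1408, 1600, 40]
print("coarse:", coarse)  # [352, 400, 10]
```

With that assumed range, the fine grid in z, y, x order is 40 x 1600 x 1408, which is close to the [ 41 1600 1408] printed earlier in the thread (the extra cell in z is not explained here). The coarse setting shrinks every axis by a factor of 4. I don't know whether that is what triggers the N > 0 assertion; this only shows how large the change is.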
Hi, I have compiled the newest spconv and I am still facing the same problem. Any idea what could be the issue?
I'm still facing the same issue here, any solutions for that?
Hi,
I'm getting the following error on trying to evaluate the model using the pretrained weights in car_fhd. If anyone has a workaround, kindly let me know. Thanks!