Open poodarchu opened 6 years ago
The SparseConvNet file modification is only needed when you evaluate pretrained models. You need to use tools like pdb or gdb to locate where the segfault occurs; I can't solve this with only the segfault message.
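Alongside pdb or gdb, the standard-library faulthandler module can show which Python line was executing when the native crash happened. The following is a minimal sketch; the helper name enable_crash_traces and the log file name crash.log are hypothetical and not part of this repository.

```python
import faulthandler


def enable_crash_traces(log_path):
    """Dump the Python traceback of all threads to log_path on a fatal signal.

    faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS and
    SIGILL. The returned file object must stay open for as long as the
    handler is enabled, so the caller should keep a reference to it.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    # Hypothetical usage: call this before importing the compiled extensions,
    # then run the evaluation as usual and inspect crash.log after a segfault.
    _crash_log = enable_crash_traces("crash.log")
    print("faulthandler enabled:", faulthandler.is_enabled())
```

The same effect is available without code changes by running the interpreter with python -X faulthandler, which writes the traceback to stderr. This only reports the Python-level frame; the gdb backtrace is still needed to see the C++ frames.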
I still got "(gdb) backtrace
from /home/users/benjin.zhu/data2/second/second.pth/second/core/non_max_suppression/nms.so
I've fixed the above problem.
But now I have trouble with sparseconvnet. After I compiled it with gcc 5.4 and pytorch 1.0, it said:
Installed /data-sdb/benjin.zhu/libs/anaconda3.6/lib/python3.6/site-packages/sparseconvnet-0.2-py3.6-linux-x86_64.egg
Processing dependencies for sparseconvnet==0.2
Finished processing dependencies for sparseconvnet==0.2
But then when I ran python examples/hello-world.py, I got:
$ python examples/hello-world.py
Traceback (most recent call last):
  File "examples/hello-world.py", line 30, in <module>
    input = scn.InputBatch(2, inputSpatialSize)
  File "/data-sdb/benjin.zhu/second/SparseConvNet/sparseconvnet/inputBatch.py", line 19, in __init__
    self.metadata = Metadata(dimension)
  File "/data-sdb/benjin.zhu/second/SparseConvNet/sparseconvnet/metadata.py", line 17, in Metadata
    return getattr(sparseconvnet.SCN, 'Metadata_%d' % dim)()
AttributeError: module 'sparseconvnet.SCN' has no attribute 'Metadata_2'
I have been stuck on this for a long time.
I have no idea, but someone asked me previously:
The Metadata_3 problem is caused by importing the wrong sparseconvnet.
Actually, I shouldn't have added the path of SparseConvNet to PYTHONPATH; doing so makes Python import the wrong sparseconvnet.
I am using an older sparseconvnet (commit edf89af339ee929d9416f3509ff405450949f606) with pytorch 0.4.1.
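To check whether a PYTHONPATH entry is shadowing the installed build, it can help to ask Python where it would load the package from without actually importing it. This is a minimal sketch using only the standard library; the helper names locate_module and is_under are hypothetical.

```python
import importlib.util
import os


def locate_module(name):
    """Return the file Python would import top-level module `name` from.

    Returns None if the module cannot be found on the current sys.path.
    The module itself is not imported.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def is_under(path, directory):
    """Return True if `path` lies inside `directory` (symlinks resolved)."""
    path = os.path.realpath(path)
    directory = os.path.realpath(directory)
    return os.path.commonpath([path, directory]) == directory


if __name__ == "__main__":
    origin = locate_module("sparseconvnet")
    print("sparseconvnet would be imported from:", origin)
```

In the traceback above, the files are loaded from /data-sdb/benjin.zhu/second/SparseConvNet/sparseconvnet/ rather than from the egg installed in site-packages. If the printed path points into the source checkout like that, removing the checkout from PYTHONPATH (or running from a different working directory) should make Python pick up the compiled install instead.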
That's very useful.
You forgot to change the corresponding files in SparseConvNet as instructed in the README.
The README says we need to use pytorch 1.0. So do we need to downgrade to pytorch 0.4.1 to install sparseconvnet?
@songanz SparseConvNet is deprecated in the newest code. You need to use spconv instead.
I still got "(gdb) backtrace
0 0x00007fff341d10df in pybind11::cpp_function::dispatcher(_object, _object, _object*) ()
from /home/users/benjin.zhu/data2/second/second.pth/second/core/non_max_suppression/nms.so
1 0x0000555555662b94 in _PyCFunction_FastCallDict ()
2 0x00005555556f267c in call_function ()
3 0x0000555555714cba in _PyEval_EvalFrameDefault ()
4 0x00005555556ec70b in fast_function ()"
I debugged my SIGSEGV (segmentation fault) error and got the same output as you. Could you tell us how to solve this problem? I use torch 1.0 + CUDA 9.0 + gcc 4.9.2.
When I tried to evaluate the trained model, I got:
python ./pytorch/train.py evaluate --config_path=./configs/car.config --model_dir=./data/models
/home/users/benjin.zhu/data2/libs/anaconda3.6/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
  return f(*args, **kwds)
/home/users/benjin.zhu/data2/libs/anaconda3.6/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
[ 11 400 352]
Restoring parameters from data/models/voxelnet-0.tckpt
remain number of infos: 3769
Generate output labels...
[1] 20341 segmentation fault python ./pytorch/train.py evaluate --config_path=./configs/car.config
Then I modified convolution.py and submanifoldConvolution.py as described in the README, and got:
RuntimeError: Error(s) in loading state_dict for VoxelNet:
	size mismatch for middle_feature_extractor.middle_conv.0.weight: copying a param of torch.Size([27, 128, 64]) from checkpoint, where the shape is torch.Size([3456, 64]) in current model.
	size mismatch for middle_feature_extractor.middle_conv.2.weight: copying a param of torch.Size([3, 64, 64]) from checkpoint, where the shape is torch.Size([192, 64]) in current model.
	size mismatch for middle_feature_extractor.middle_conv.4.weight: copying a param of torch.Size([27, 64, 64]) from checkpoint, where the shape is torch.Size([1728, 64]) in current model.
	size mismatch for middle_feature_extractor.middle_conv.6.weight: copying a param of torch.Size([27, 64, 64]) from checkpoint, where the shape is torch.Size([1728, 64]) in current model.
	size mismatch for middle_feature_extractor.middle_conv.8.weight: copying a param of torch.Size([3, 64, 64]) from checkpoint, where the shape is torch.Size([192, 64]) in current model.
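One observation about the shapes in this error: each checkpoint tensor has the same number of elements as the corresponding model tensor, with the first two axes merged (for example 27 * 128 = 3456). That suggests the checkpoint and the current model use different weight layouts for the same layers, rather than different layer sizes. The sketch below only verifies that arithmetic from the shapes in the message; the helper name is_flattened_layout is hypothetical, and this is not a confirmed fix.

```python
# Shapes copied from the error message: (checkpoint shape, current model shape).
MISMATCHES = {
    "middle_conv.0.weight": ((27, 128, 64), (3456, 64)),
    "middle_conv.2.weight": ((3, 64, 64), (192, 64)),
    "middle_conv.4.weight": ((27, 64, 64), (1728, 64)),
    "middle_conv.6.weight": ((27, 64, 64), (1728, 64)),
    "middle_conv.8.weight": ((3, 64, 64), (192, 64)),
}


def is_flattened_layout(ckpt_shape, model_shape):
    """Return True if model_shape equals ckpt_shape with its first two axes merged."""
    if len(ckpt_shape) != 3 or len(model_shape) != 2:
        return False
    merged = (ckpt_shape[0] * ckpt_shape[1], ckpt_shape[2])
    return merged == tuple(model_shape)


if __name__ == "__main__":
    for name, (ckpt, model) in MISMATCHES.items():
        print(name, ckpt, "->", model, is_flattened_layout(ckpt, model))
```

If every entry reports True, the mismatch is consistent with the SparseConvNet build in use storing weights in a different layout from the one the pretrained checkpoint was saved with, which is what the README's file modifications are meant to address.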