dvlab-research / spconv-plus

Apache License 2.0

Please help me solve this problem #6

Open ACFIRSTONE opened 1 year ago

ACFIRSTONE commented 1 year ago

CUDA 11.3, Python 3.8, PyTorch 1.11.0, spconv-plus 2.1.21

import spconv.pytorch
Traceback (most recent call last):
  File "", line 1, in <module>
  File "/home/suwei/anaconda3/envs/max_voxelnext/lib/python3.8/site-packages/spconv/pytorch/__init__.py", line 6, in <module>
    from spconv.pytorch.core import SparseConvTensor
  File "/home/suwei/anaconda3/envs/max_voxelnext/lib/python3.8/site-packages/spconv/pytorch/core.py", line 21, in <module>
    from spconv.tools import CUDAKernelTimer
  File "/home/suwei/anaconda3/envs/max_voxelnext/lib/python3.8/site-packages/spconv/tools.py", line 16, in <module>
    from spconv.cppconstants import CPU_ONLY_BUILD
  File "/home/suwei/anaconda3/envs/max_voxelnext/lib/python3.8/site-packages/spconv/cppconstants.py", line 15, in <module>
    import spconv.core_cc as _ext
ImportError: arg(): could not convert default argument 'timer: tv::CUDAKernelTimer' in method '<class 'spconv.core_cc.cumm.gemm.main.GemmParams'>.__init__' into a Python object (type not registered yet?)

ACFIRSTONE commented 1 year ago

I solved this bug by upgrading gcc from 7 to 9, but now there is a new one:

Traceback (most recent call last):
  File "train.py", line 229, in <module>
    main()
  File "train.py", line 175, in main
    train_model(
  File "/home/suwei/suwei_ws/mm/VoxelNeXt/tools/train_utils/train_utils.py", line 173, in train_model
    accumulated_iter = train_one_epoch(
  File "/home/suwei/suwei_ws/mm/VoxelNeXt/tools/train_utils/train_utils.py", line 56, in train_one_epoch
    loss, tb_dict, disp_dict = model_func(model, batch)
  File "../pcdet/models/__init__.py", line 42, in model_func
    ret_dict, tb_dict, disp_dict = model(batch_dict)
  File "/home/suwei/anaconda3/envs/mmvoxel/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "../pcdet/models/detectors/voxelnext.py", line 13, in forward
    batch_dict = cur_module(batch_dict)
  File "/home/suwei/anaconda3/envs/mmvoxel/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "../pcdet/models/backbones_3d/spconv_backbone_voxelnext.py", line 185, in forward
    x = self.conv_input(input_sp_tensor)
  File "/home/suwei/anaconda3/envs/mmvoxel/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/suwei/anaconda3/envs/mmvoxel/lib/python3.8/site-packages/spconv/pytorch/modules.py", line 137, in forward
    input = module(input)
  File "/home/suwei/anaconda3/envs/mmvoxel/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/suwei/anaconda3/envs/mmvoxel/lib/python3.8/site-packages/spconv/pytorch/conv.py", line 487, in forward
    out_features = Fsp.implicit_gemm(
  File "/home/suwei/anaconda3/envs/mmvoxel/lib/python3.8/site-packages/torch/cuda/amp/autocast_mode.py", line 118, in decorate_fwd
    return fwd(*args, **kwargs)
  File "/home/suwei/anaconda3/envs/mmvoxel/lib/python3.8/site-packages/spconv/pytorch/functional.py", line 205, in forward
    raise e
  File "/home/suwei/anaconda3/envs/mmvoxel/lib/python3.8/site-packages/spconv/pytorch/functional.py", line 190, in forward
    out, mask_out, mask_width = ops.implicit_gemm(features, filters,
  File "/home/suwei/anaconda3/envs/mmvoxel/lib/python3.8/site-packages/spconv/pytorch/ops.py", line 1861, in implicit_gemm
    tuneres, = CONV.tune_and_cache(
  File "/home/suwei/anaconda3/envs/mmvoxel/lib/python3.8/site-packages/spconv/algo.py", line 618, in tune_and_cache
    inp = inp.clone()
ValueError: /io/include/tensorview/tensor.h(171) don't compiled with cuda

hontrn9122 commented 6 months ago

You have to specify your CUDA version before compiling cumm and spconv-plus:

export CUMM_CUDA_VERSION="your cuda version"
export CUMM_DISABLE_JIT="1"
export SPCONV_DISABLE_JIT="1"

Now you can rebuild and install cumm and spconv-plus from source. It should work fine.
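
If it helps, here is a small Python sketch for double-checking both halves of that advice. It only looks at the three variable names listed above and at the `CPU_ONLY_BUILD` flag that appears in the first traceback. The helper names `missing_build_vars` and `cpu_only_build` are made up for illustration, and I am assuming that an empty value should count as "not set" and that `CPU_ONLY_BUILD` being true means the installed binary was built without CUDA.

```python
import importlib
import os

# Variable names copied from the export lines above.
REQUIRED_VARS = ("CUMM_CUDA_VERSION", "CUMM_DISABLE_JIT", "SPCONV_DISABLE_JIT")


def missing_build_vars(env):
    """Return the names in REQUIRED_VARS that are unset or empty in env.

    Treating an empty string as missing is an assumption made for this check.
    """
    return [name for name in REQUIRED_VARS if not env.get(name)]


def cpu_only_build():
    """Return spconv's CPU_ONLY_BUILD flag, or None if it cannot be read."""
    try:
        module = importlib.import_module("spconv.cppconstants")
    except Exception:
        # Covers both "spconv is not installed" and a broken extension import.
        return None
    return getattr(module, "CPU_ONLY_BUILD", None)


if __name__ == "__main__":
    missing = missing_build_vars(os.environ)
    if missing:
        print("Not set in this shell:", ", ".join(missing))
    else:
        print("All three build variables are set.")
    print("CPU_ONLY_BUILD:", cpu_only_build())
```

Run it in the same shell you are going to build from, since exported variables are only visible to that shell and its children. After reinstalling, run it again: if `CPU_ONLY_BUILD` still prints `True`, the rebuild probably did not pick up CUDA.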