traveller59 / spconv

Spatial Sparse Convolution Library
Apache License 2.0

nvrtc: error: invalid value for --gpu-architecture (-arch) #566

Open linhaojia13 opened 1 year ago

linhaojia13 commented 1 year ago

Hi, I installed spconv-cu102 via pip, and it raises nvrtc errors when I use it. My environment is configured as follows:

gpu: Tesla V100-SXM2-32GB-LS
nvidia driver: 450.80.02
python: 3.8
pytorch: 1.10.1
cudatoolkit: 10.2
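
For anyone comparing setups, the fields above can be read from the running interpreter instead of being typed by hand. This is a minimal sketch, assuming PyTorch is importable in the same environment; `arch_flag` and `collect_env` are hypothetical helper names, not part of spconv or cumm. The `compute_XY` string is only the usual way a compute capability is written as an NVRTC architecture name (a V100 reports capability 7.0), and it is not confirmed to be the exact value spconv passes to the compiler.

```python
import platform


def arch_flag(major, minor):
    """Format a CUDA compute capability as an NVRTC-style architecture name."""
    return f"compute_{major}{minor}"


def collect_env():
    """Gather the fields listed in the report.

    Values stay None when PyTorch is not installed or no GPU is visible.
    """
    info = {
        "python": platform.python_version(),
        "pytorch": None,
        "cuda_build": None,  # CUDA version PyTorch was built against
        "gpu": None,
        "arch": None,
    }
    try:
        import torch
    except ImportError:
        return info

    info["pytorch"] = torch.__version__
    info["cuda_build"] = torch.version.cuda
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
        info["arch"] = arch_flag(*torch.cuda.get_device_capability(0))
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Pasting this output alongside the traceback makes it easier to see whether the reported architecture and the installed CUDA build line up.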

The detailed errors are as follows:

[2023-02-27 18:32:36,065 INFO defaults.py line 200 43472] >>>>>>>>>>>>>>>> Start Training >>>>>>>>>>>>>>>>
nvrtc: error: invalid value for --gpu-architecture (-arch)

[Exception|implicit_gemm]feat=torch.Size([117713, 6]),w=torch.Size([32, 5, 5, 5, 6]),pair=torch.Size([125, 117713]),act=117713,issubm=True,istrain=True
SPCONV_DEBUG_SAVE_PATH not found, you can specify SPCONV_DEBUG_SAVE_PATH as debug data save path to save debug data which can be attached in a issue.
Traceback (most recent call last):
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/.vscode-server/extensions/ms-python.python-2023.2.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/root/.vscode-server/extensions/ms-python.python-2023.2.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/root/.vscode-server/extensions/ms-python.python-2023.2.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/root/.vscode-server/extensions/ms-python.python-2023.2.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/root/.vscode-server/extensions/ms-python.python-2023.2.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/root/.vscode-server/extensions/ms-python.python-2023.2.0/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "/userhome/lhj/pointcloud/PointTransformerV2-main/tools/train.py", line 34, in <module>
    main()
  File "/userhome/lhj/pointcloud/PointTransformerV2-main/tools/train.py", line 23, in main
    launch(
  File "/userhome/lhj/pointcloud/PointTransformerV2-main/tools/../pcr/engines/launch.py", line 87, in launch
    main_func(*cfg)
  File "/userhome/lhj/pointcloud/PointTransformerV2-main/tools/train.py", line 16, in main_worker
    trainer.train()
  File "/userhome/lhj/pointcloud/PointTransformerV2-main/tools/../pcr/engines/defaults.py", line 213, in train
    self.run_step(i, input_dict)
  File "/userhome/lhj/pointcloud/PointTransformerV2-main/tools/../pcr/engines/defaults.py", line 228, in run_step
    output = self.model(input_dict)
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/userhome/lhj/pointcloud/PointTransformerV2-main/tools/../pcr/models/sparse_unet/spconv_unet.py", line 179, in forward
    x = self.conv_input(x)
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/spconv/pytorch/modules.py", line 138, in forward
    input = module(input)
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/spconv/pytorch/conv.py", line 741, in forward
    return self._conv_forward(self.training, input, self.weight, self.bias, add_input,
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/spconv/pytorch/conv.py", line 463, in _conv_forward
    out_features = Fsp.implicit_gemm(
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/torch/cuda/amp/autocast_mode.py", line 92, in decorate_fwd
    return fwd(*_cast(args, cast_inputs), **_cast(kwargs, cast_inputs))
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/spconv/pytorch/functional.py", line 224, in forward
    raise e
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/spconv/pytorch/functional.py", line 210, in forward
    out, mask_out, mask_width = ops.implicit_gemm(
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/spconv/pytorch/ops.py", line 1513, in implicit_gemm
    mask_width, tune_res_cpp = ConvGemmOps.implicit_gemm(
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/spconv/algo.py", line 208, in cached_get_nvrtc_params
    mod, ker = self._compile_nvrtc_module(desp)
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/spconv/algo.py", line 196, in _compile_nvrtc_module
    mod = CummNVRTCModule([kernel],
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/cumm/nvrtc/__init__.py", line 308, in __init__
    super().__init__(mod_params.code,
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/cumm/nvrtc/__init__.py", line 208, in __init__
    super().__init__(code,
  File "/userhome/conda/envs/pcr_torch1101cuda102/lib/python3.8/site-packages/cumm/tensorview/__init__.py", line 174, in __init__
    self._mod = _NVRTCModule(code, headers, opts, program_name,
RuntimeError: /io/include/tensorview/cuda/nvrtc.h(96)
compileResult == NVRTC_SUCCESS assert faild. nvrtc compile failed.

fjh1228 commented 7 months ago

Same problem.

zyz654118820 commented 5 months ago

Same problem here. Have you solved it yet?