mumianyuxin / M3DSSD

M3DSSD: Monocular 3D Single Stage Object Detector
MIT License

CUDA version #5

Open kdheejb7 opened 2 years ago

kdheejb7 commented 2 years ago

Hello,

Could you please let me know which version of CUDA you used?

In the first case, I used torch==0.4.1 and CUDA 10.0. When I ran the command python3 scripts/train_rpn_3d.py --config=kitti_3d_base --exp_name base, I got the following error:

  File "scripts/train_rpn_3d.py", line 324, in <module>
    main(args)
  File "scripts/train_rpn_3d.py", line 140, in main
    rpn_net, optimizer = init_training_model(conf, paths.output)
  File "/workspace/lib/core.py", line 69, in init_training_model
    network = absolute_import(dst_path)
  File "/workspace/lib/util.py", line 98, in absolute_import
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/workspace/output/base/20210923_152833/M3d_inference_align.py", line 5, in <module>
    from model.pose_dla_dcn import DLASeg, DeformConv
  File "/workspace/model/pose_dla_dcn.py", line 17, in <module>
    from .DCNv2.dcn_v2 import DCN
  File "/workspace/model/DCNv2/dcn_v2.py", line 11, in <module>
    from .dcn_v2_func import DCNv2Function
  File "/workspace/model/DCNv2/dcn_v2_func.py", line 9, in <module>
    from ._ext import dcn_v2 as _backend
  File "/workspace/model/DCNv2/_ext/dcn_v2/__init__.py", line 3, in <module>
    from ._dcn_v2 import lib as _lib, ffi as _ffi
ImportError: /workspace/model/DCNv2/_ext/dcn_v2/_dcn_v2.so: undefined symbol: __cudaPopCallConfiguration

I understood that this error comes from a mismatch between the CUDA and torch versions, so I thought that changing the CUDA version from 10.0 to 9.2 would solve it.
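An undefined CUDA symbol such as __cudaPopCallConfiguration in a compiled extension usually suggests that the extension was built with a different CUDA toolkit than the one PyTorch was built against. A minimal sketch for comparing the two is shown here; the helper names parse_nvcc_release, versions_match, and report are hypothetical and not part of M3DSSD, and the check only covers the nvcc found on PATH.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return the 'major.minor' release from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def versions_match(torch_cuda, nvcc_cuda):
    """True only when both versions are known and share the same major.minor."""
    if not torch_cuda or not nvcc_cuda:
        return False
    return torch_cuda.split(".")[:2] == nvcc_cuda.split(".")[:2]


def report():
    """Print the CUDA version torch was built with next to the nvcc release."""
    try:
        import torch
        # CUDA version the installed PyTorch binary was built with.
        torch_cuda = torch.version.cuda
    except ImportError:
        torch_cuda = None

    # nvcc is the compiler that builds extensions such as DCNv2 and gpu_nms.
    nvcc_cuda = None
    if shutil.which("nvcc"):
        result = subprocess.run(
            ["nvcc", "--version"],
            stdout=subprocess.PIPE,
            universal_newlines=True,
        )
        nvcc_cuda = parse_nvcc_release(result.stdout)

    print("torch built with CUDA:", torch_cuda)
    print("nvcc on PATH:", nvcc_cuda)
    print("match:", versions_match(torch_cuda, nvcc_cuda))
```

Calling report() inside the training environment prints both versions. If they differ, rebuilding the extensions with a toolkit that matches torch.version.cuda is the likely next step, although this thread does not confirm which combination the authors used.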

But I ran into another problem with CUDA 9.2.

In the second case, I used torch==0.4.1 and CUDA 9.2. When I ran the command python3 scripts/train_rpn_3d.py --config=kitti_3d_base --exp_name base, I got the following error:

Traceback (most recent call last):
  File "scripts/train_rpn_3d.py", line 23, in <module>
    from lib.imdb_util import *
  File "/workspace/lib/imdb_util.py", line 24, in <module>
    from lib.rpn_util import *
  File "/workspace/lib/rpn_util.py", line 16, in <module>
    from lib.nms.gpu_nms import gpu_nms
ImportError: libcudart.so.10.0: cannot open shared object file: No such file or directory

This error occurred before execution reached the line that caused the error in the first case, so I still cannot run the training command. I tried both CUDA 9.2 and 10.0, but each one causes a different problem.
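The second message names libcudart.so.10.0, which suggests that the gpu_nms extension was compiled while CUDA 10.0 was installed and still links against that runtime, so it would presumably need to be rebuilt under CUDA 9.2. A small sketch for confirming which runtime an ImportError asks for, and whether the loader can find it, is given here; missing_cudart_version and can_load are hypothetical helper names, not functions from the repository.

```python
import ctypes
import re


def missing_cudart_version(error_message):
    """Extract the CUDA runtime version named in a 'libcudart.so.X.Y' error."""
    match = re.search(r"libcudart\.so\.(\d+\.\d+)", error_message)
    return match.group(1) if match else None


def can_load(library_name):
    """Return True if the dynamic loader can find and open the library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    message = (
        "ImportError: libcudart.so.10.0: cannot open shared object file: "
        "No such file or directory"
    )
    version = missing_cudart_version(message)
    print("extension was linked against CUDA runtime:", version)
    # Check whether that runtime is visible to the loader in this environment.
    print("loadable here:", can_load("libcudart.so." + version))
```

If the runtime named in the error is not loadable, the extension and the installed toolkit disagree, which matches the situation described for the CUDA 9.2 setup.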

Could you please let me know which version of CUDA you used?

Thank you!

Wasiiiii commented 2 years ago

Hi kdheejb7, did you solve the version problem?

123456789live commented 2 years ago

Hello, have you solved the version problem? I have a similar problem. @kdheejb7