HorizonRobotics / Sparse4D

MIT License

run local_test.sh "slow_conv2d_cpu" not implemented for 'Half' #73

Closed xifen523 closed 3 months ago

xifen523 commented 4 months ago

I ran: bash local_test.sh sparse4dv3_temporal_r50_1x8_bs6_256x704 path/*pt

BUG log:

No CUDA runtime is found, using CUDA_HOME='/root/miniconda3/envs/sprase4D'
/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  warnings.warn(
projects.mmdet3d_plugin
{'version': 'JRDB_v1.0'}
distributed: False
/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/mmdet/models/backbones/resnet.py:401: UserWarning: DeprecationWarning: pretrained is deprecated, please use "init_cfg" instead
  warnings.warn('DeprecationWarning: pretrained is deprecated, '
load checkpoint from local path: /root/sparse4D_track/work_dirs/JRDB_sparse4dv3_temporal_r50_1x8_bs6_480x768_local/iter_46480.pth
[ ] 0/27661, elapsed: 0s, ETA:/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/torch/amp/autocast_mode.py:202: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
Traceback (most recent call last):
  File "./tools/test.py", line 313, in <module>
    main()
  File "./tools/test.py", line 258, in main
    outputs = single_gpu_test(model, data_loader, args.show, args.show_dir)
  File "/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/mmdet/apis/test.py", line 29, in single_gpu_test
    result = model(return_loss=False, rescale=True, **data)
  File "/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 49, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 236, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/root/sparse4D_track/projects/mmdet3d_plugin/models/sparse4d.py", line 97, in forward
    return self.forward_test(img, **data)
  File "/root/sparse4D_track/projects/mmdet3d_plugin/models/sparse4d.py", line 113, in forward_test
    return self.simple_test(img, **data)
  File "/root/sparse4D_track/projects/mmdet3d_plugin/models/sparse4d.py", line 116, in simple_test
    feature_maps = self.extract_feat(img)
  File "/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 149, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/root/sparse4D_track/projects/mmdet3d_plugin/models/sparse4d.py", line 75, in extract_feat
    feature_maps = self.img_backbone(img)
  File "/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/mmdet/models/backbones/resnet.py", line 636, in forward
    x = self.conv1(x)
  File "/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/root/miniconda3/envs/sprase4D/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: "slow_conv2d_cpu" not implemented for 'Half'

My torch version is 1.13.0:

Name: torch
Version: 1.13.0
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /root/miniconda3/envs/sprase4D/lib/python3.8/site-packages
Requires: typing_extensions
Required-by: torchaudio, torchvision

CUDA version:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0

nvidia-smi:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67                 Driver Version: 550.67         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        On  |   00000000:20:00.0 Off |                  N/A |
| 32%   39C    P8             25W /  350W |      11MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090        On  |   00000000:21:00.0 Off |                  N/A |
| 32%   40C    P8             20W /  350W |      11MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

Test CUDA:

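``import torch

print(torch.__version__)  # installed PyTorch version
print(torch.cuda.is_available())  # True if PyTorch can use a CUDA device
print(torch.cuda.amp.common.amp_definitely_not_available())  # True only if neither CUDA nor XLA is usable``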

Result:

1.13.0
True
False

linxuewu commented 3 months ago

This seems to be a common issue. It is recommended to look for answers in other forums. Maybe the CUDA_HOME setting is incorrect.

shauloron commented 3 months ago

Make sure CUDA is available in your mmdet and mmcv packages. Run python /usr/local/lib/python3.8/dist-packages/mmdet/utils/collect_env.py to check that mmcv and mmdet support CUDA. You should see at the bottom of the printout:

TorchVision: 0.14.0+cu117
OpenCV: 4.10.0
MMCV: 1.7.1
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: 11.7

If not, try installing from this whl or building from source with the required flags (this solved the issue for me):

pip install mmcv-full==1.7.0 -f https://download.openmmlab.com/mmcv/dist/cu117/torch1.13/index.html

Good luck.
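
The check described in this comment could be automated roughly as follows. This is only a sketch: `parse_env_report` and `mmcv_has_cuda` are hypothetical helper names, not part of mmcv or mmdet, and it assumes the printout consists of `key: value` lines and that a CUDA-enabled build reports a numeric version (such as 11.7) for `MMCV CUDA Compiler`.

```python
import re
from typing import Dict


def parse_env_report(text: str) -> Dict[str, str]:
    """Turn 'key: value' lines from a collect_env printout into a dict."""
    report: Dict[str, str] = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            report[key.strip()] = value.strip()
    return report


def mmcv_has_cuda(report: Dict[str, str]) -> bool:
    """Assumption: a CUDA-enabled mmcv build reports a numeric version here."""
    value = report.get("MMCV CUDA Compiler", "")
    return re.fullmatch(r"\d+(\.\d+)+", value) is not None


# Sample text copied from the comment above.
sample = """TorchVision: 0.14.0+cu117
OpenCV: 4.10.0
MMCV: 1.7.1
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: 11.7"""

print(mmcv_has_cuda(parse_env_report(sample)))
```

In practice the text would come from capturing the output of collect_env.py; any non-numeric value for that key is treated as "no CUDA support" in this sketch.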

xifen523 commented 3 months ago

Thank you for your help. I have fixed this problem: the local_test.sh script exports CUDA_VISIBLE_DEVICES=3 by default, but I only have one GPU device. When I set it to export CUDA_VISIBLE_DEVICES=0, everything works.
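
Why an out-of-range index hides every GPU can be illustrated with a small sketch. `visible_devices` is a hypothetical helper, not a CUDA or PyTorch API, and it only models integer indices (GPU UUIDs are not handled). It follows the documented CUDA behaviour that devices listed before the first invalid entry stay visible and everything from that entry onward is ignored. With no visible device, PyTorch falls back to the CPU, where the half-precision convolution in the traceback is not implemented.

```python
from typing import List, Optional


def visible_devices(env_value: Optional[str], physical_count: int) -> List[int]:
    """Physical GPU indices exposed for a given CUDA_VISIBLE_DEVICES value."""
    if env_value is None:
        # Variable unset: every physical device is visible.
        return list(range(physical_count))
    visible: List[int] = []
    for token in env_value.split(","):
        token = token.strip()
        if not token.isdigit() or int(token) >= physical_count:
            # First invalid entry: stop here, later entries are ignored.
            break
        visible.append(int(token))
    return visible


# Machine with GPU indices 0 and 1, as in the nvidia-smi output in this issue.
print(visible_devices("3", 2))  # index 3 does not exist, so no GPU is visible
print(visible_devices("0", 2))  # GPU 0 is visible
```

A check like this before launching the test script would flag the mismatch between the script's default and the devices actually installed.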