When I run train_net.py to train, I get the following:

2020-04-29 10:03:45,343 maskrcnn_benchmark INFO: Using 1 GPUs
2020-04-29 10:03:45,343 maskrcnn_benchmark INFO: Namespace(config_file='../configs/rrpn/e2e_rrpn_X_101_32x8d_FPN_1x_DOTA.yaml', distributed=False, local_rank=0, opts=[], skip_test=False)
2020-04-29 10:03:45,343 maskrcnn_benchmark INFO: Collecting env info (might take some time)
2020-04-29 10:03:46,259 maskrcnn_benchmark INFO:
PyTorch version: 1.0.0.dev20190328
Is debug build: No
CUDA used to build PyTorch: 10.0.130
OS: Ubuntu 18.04.4 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: Could not collect
Python version: 3.6
Is CUDA available: No
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce RTX 2080 Ti
Nvidia driver version: 440.44
cuDNN version: Could not collect
Versions of relevant libraries:
[pip] numpy==1.18.3
[pip] torch==1.4.0
[pip] torchvision==0.2.1
[conda] mkl 2020.0 166 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
[conda] numpy 1.13.1 py36_nomkl_0 [nomkl] https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
[conda] pytorch 1.4.0 py3.6_cuda10.0.130_cudnn7.6.3_0 pytorch
[conda] pytorch-nightly 1.0.0.dev20190328 py3.6_cuda10.0.130_cudnn7.4.2_0 pytorch
[conda] scipy 0.19.1 np113py36_nomkl_0 [nomkl] https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
[conda] torchvision 0.2.1 py_2 pytorch
Pillow (4.2.1)
2020-04-29 10:03:46,260 maskrcnn_benchmark INFO: Loaded configuration file ../configs/rrpn/e2e_rrpn_X_101_32x8d_FPN_1x_DOTA.yaml
2020-04-29 10:03:46,260 maskrcnn_benchmark INFO:
INPUT:
  MIN_SIZE_TRAIN: (800,) # TODO: look into how input image resizing is handled; my feeling is that this setting does not resize, and results are better without resizing.
  MAX_SIZE_TRAIN: 800
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 800
  PIXEL_STD: [0.225, 0.224, 0.229] # TODO: defaults.py has [1., 1., 1.]
  TO_BGR255: False
DATASETS:
  TRAIN: ("DOTA_train", )
  TRAIN: ("RRPN_train", )
..............
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch-nightly_1553749776822/work/aten/src/THC/THCGeneral.cpp line=51 error=30 : unknown error
Traceback (most recent call last):
  File "train_net.py", line 175, in <module>
    main()
  File "train_net.py", line 168, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "train_net.py", line 32, in train
    model.to(device)
  File "/home/imut-radar/anaconda2/envs/rrpn_pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 384, in to
    return self._apply(convert)
  File "/home/imut-radar/anaconda2/envs/rrpn_pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 190, in _apply
    module._apply(fn)
  File "/home/imut-radar/anaconda2/envs/rrpn_pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 190, in _apply
    module._apply(fn)
  File "/home/imut-radar/anaconda2/envs/rrpn_pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 190, in _apply
    module._apply(fn)
  File "/home/imut-radar/anaconda2/envs/rrpn_pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 196, in _apply
    param.data = fn(param.data)
  File "/home/imut-radar/anaconda2/envs/rrpn_pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 382, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
  File "/home/imut-radar/anaconda2/envs/rrpn_pytorch/lib/python3.6/site-packages/torch/cuda/__init__.py", line 163, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch-nightly_1553749776822/work/aten/src/THC/THCGeneral.cpp:51
Is something wrong with CUDA? I installed version 10.0 strictly following the author's instructions.
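In case it helps narrow this down, here is a minimal sketch of a check that could be run inside the same conda environment, outside of train_net.py. The environment listing in the log shows both pytorch 1.4.0 and pytorch-nightly 1.0.0.dev20190328, so the script reports which build Python actually imports and whether that build can see CUDA. The function name `cuda_report` is hypothetical and not part of maskrcnn_benchmark; it relies only on `torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.device_count()`.

```python
import importlib


def cuda_report():
    """Collect basic facts about the imported PyTorch build and its CUDA visibility.

    Hypothetical helper for illustration; it does not allocate any GPU memory.
    """
    report = {}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not importable in this environment at all.
        report["torch_installed"] = False
        return report

    report["torch_installed"] = True
    # Which of the installed builds is actually picked up by `import torch`.
    report["torch_version"] = torch.__version__
    # CUDA version the build was compiled against (None for CPU-only builds).
    report["built_with_cuda"] = torch.version.cuda
    # False here matches the "Is CUDA available: No" line in the log.
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = (
        torch.cuda.device_count() if report["cuda_available"] else 0
    )
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` is still reported as false here, the failure would seem to lie in the driver or runtime setup rather than in train_net.py itself, although that is only a guess from the log.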