PaddlePaddle / PaddleSeg

Easy-to-use image segmentation library with awesome pre-trained model zoo, supporting wide-range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image Matting, 3D Segmentation, etc.
https://arxiv.org/abs/2101.06175
Apache License 2.0
8.69k stars 1.68k forks

You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default. #964

Closed monkeycc closed 1 year ago

monkeycc commented 3 years ago

export PATH="/usr/local/cuda-11.1/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-11.1/lib64:$LD_LIBRARY_PATH"
export CUDA_HOME=$CUDA_HOME:/usr/local/cuda-11.1
export CUDA_VISIBLE_DEVICES=0
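One thing worth checking in exports like these: the CUDA_HOME line appends to whatever value was already there, so with CUDA_HOME previously unset it expands to ":/usr/local/cuda-11.1" rather than a single directory. A minimal sketch of a sanity check follows, assuming CUDA_HOME is meant to name exactly one directory; the helper name check_cuda_env is hypothetical and not part of Paddle or CUDA.

```python
import os


def check_cuda_env(env):
    """Return a list of human-readable problems found in CUDA-related variables.

    `env` is any mapping of variable names to values, e.g. os.environ.
    Hypothetical helper for illustration only.
    """
    problems = []

    cuda_home = env.get("CUDA_HOME", "")
    if not cuda_home:
        problems.append("CUDA_HOME is not set")
    elif ":" in cuda_home:
        # CUDA_HOME is conventionally one directory, not a ':'-separated list.
        problems.append("CUDA_HOME should be a single directory, got: " + cuda_home)

    visible = env.get("CUDA_VISIBLE_DEVICES")
    if visible is not None and visible.strip() in ("", "-1"):
        # An empty value or -1 hides every GPU from CUDA applications.
        problems.append("CUDA_VISIBLE_DEVICES hides all GPUs")

    return problems


if __name__ == "__main__":
    for problem in check_cuda_env(os.environ):
        print("warning:", problem)
```

A malformed CUDA_HOME is not necessarily the cause of the warning reported here, but it is cheap to rule out.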

NVIDIA-SMI 460.39 Driver Version: 460.39 CUDA Version: 11.2

-----------Environment Information-------------
platform: Linux-5.10.18-amd64-desktop-x86_64-with-debian-10.8
Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
Paddle compiled with cuda: True
NVCC: Build cuda_11.1.TC455_06.29069683_0
cudnn: 8.0
GPUs used: 0
CUDA_VISIBLE_DEVICES: 0
GPU: ['GPU 0: GeForce RTX']
GCC: gcc (Uos 8.3.0.3-3+rebuild) 8.3.0
PaddlePaddle: 2.0.1
OpenCV: 4.5.1


python train.py --config configs/quick_start/bisenet_optic_disc_512x512_1k.yml

/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: np.int is a deprecated alias for the builtin int. To silence this warning, use int by itself. Doing this will not modify any behavior and is safe. When replacing np.int, you may wish to use e.g. np.int64 or np.int32 to specify the precision. If you wish to review your current use, check the release note link for additional information.

Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  def convert_to_list(value, n, name, dtype=np.int):

W0414 10:11:45.200657 25661 init.cc:136] Compiled with WITH_GPU, but no GPU found in runtime.

/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/fluid/framework.py:299: UserWarning: You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default.

"You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default."


The framework's Environment Information output all looks fine to me, but I still get the warning: You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default.

michaelowenliu commented 3 years ago

@monkeycc Hi, please use paddle.utils.run_check() to check the installation.
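For anyone who wants the result of that check in a script rather than an interactive session, here is a minimal sketch. It relies only on paddle.utils.run_check() and paddle.is_compiled_with_cuda(); the wrapper name paddle_gpu_report and the keys of the returned dict are assumptions made for illustration.

```python
def paddle_gpu_report():
    """Summarize whether Paddle is installed and whether its self-check passes.

    Hypothetical wrapper: only run_check() and is_compiled_with_cuda()
    come from Paddle itself.
    """
    try:
        import paddle
    except ImportError:
        return {"installed": False}

    report = {
        "installed": True,
        "version": paddle.__version__,
        "compiled_with_cuda": paddle.is_compiled_with_cuda(),
    }
    try:
        # Builds and runs a tiny network; raises if the device is unusable.
        paddle.utils.run_check()
        report["run_check_ok"] = True
    except Exception as exc:
        report["run_check_ok"] = False
        report["error_type"] = type(exc).__name__
    return report


if __name__ == "__main__":
    print(paddle_gpu_report())
```

If compiled_with_cuda is True but run_check_ok is False, the GPU build is installed and the failure happens at runtime.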

monkeycc commented 3 years ago

import paddle
paddle.utils.run_check()

Running verify PaddlePaddle program ...
2021-04-14 11:02:54,203 - WARNING - You are using GPU version PaddlePaddle, but there is no GPU detected on your machine. Maybe CUDA devices is not set properly. Original Error is (External) Cuda error(999), unknown error. [Advise: Please search for the error code(999) on website( https://docs.nvidia.com/cuda/archive/9.0/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038 ) to get Nvidia's official solution about CUDA Error.] (at /paddle/paddle/fluid/platform/gpu_info.cc:78)

Traceback (most recent call last):
  File "", line 1, in
  File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/utils/install_check.py", line 164, in run_check
    _run_static_single(use_cuda)
  File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/utils/install_check.py", line 96, in _run_static_single
    exe.run(startup_prog)
  File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1110, in run
    six.reraise(*sys.exc_info())
  File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1108, in run
    return_merged=return_merged)
  File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1238, in _run_impl
    use_program_cache=use_program_cache)
  File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1328, in _run_program
    [fetch_var_name])
OSError: In user code:

File "", line 1, in

File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/utils/install_check.py", line 164, in run_check
  _run_static_single(use_cuda)
File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/utils/install_check.py", line 90, in _run_static_single
  input, out, weight = _simple_network()
File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/utils/install_check.py", line 36, in _simple_network
  bias = paddle.create_parameter(shape=[3], dtype="float32")
File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/fluid/layers/tensor.py", line 137, in create_parameter
  default_initializer)
File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/fluid/layer_helper_base.py", line 378, in create_parameter
  **attr._to_kwargs(with_initializer=True))
File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2984, in create_parameter
  initializer(param, self)
File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/fluid/initializer.py", line 568, in __call__
  stop_gradient=True)
File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/fluid/framework.py", line 3109, in _prepend_op
  attrs=kwargs.get("attrs", None))
File "/home/hello/anaconda3/envs/PaddleSeg/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2107, in __init__
  for frame in traceback.extract_stack():

ExternalError: Cuda error(999), unknown error.

[Advise: Please search for the error code(999) on website( https://docs.nvidia.com/cuda/archive/9.0/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038 ) to get Nvidia's official solution about CUDA Error.] (at /paddle/paddle/fluid/platform/gpu_info.cc:78)

[operator < uniform_random > error]
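Error 999 is CUDA's generic "unknown error" code, so the log alone does not say whether Paddle or the driver is at fault. One way to separate the two is to initialize the CUDA driver directly, with no Paddle involved. The following is a rough sketch using ctypes and the driver API functions cuInit and cuDeviceGetCount; the wrapper name probe_cuda_driver and the default library name libcuda.so.1 (the usual name on Linux) are assumptions.

```python
import ctypes


def probe_cuda_driver(lib_name="libcuda.so.1"):
    """Call the CUDA driver API directly and return (status, device_count).

    status is None when the driver library cannot be loaded, 0 on success,
    and otherwise the raw CUresult code (999 is CUDA_ERROR_UNKNOWN).
    Hypothetical helper for illustration only.
    """
    try:
        libcuda = ctypes.CDLL(lib_name)
    except OSError:
        return None, 0

    status = libcuda.cuInit(0)  # the flags argument must be 0
    if status != 0:
        return status, 0

    count = ctypes.c_int(0)
    status = libcuda.cuDeviceGetCount(ctypes.byref(count))
    return status, count.value


if __name__ == "__main__":
    status, count = probe_cuda_driver()
    print("status:", status, "devices:", count)
```

If this also reports 999, the failure sits below Paddle, in the driver or its kernel modules.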

fanxiaochen-123 commented 3 years ago

@monkeycc Hi, have you managed to solve this problem?

lfxx commented 3 years ago

I don't think version 2.0.1 supports CUDA 11.1; it seems to support only up to 11.0.

fanxiaochen-123 commented 3 years ago

Right. The version the official website recommended for download was 2.1.2, and my CUDA is 11.2.

fanxiaochen-123 commented 3 years ago

@lfxx Have you run into that error?

lfxx commented 3 years ago

@lfxx Have you run into that error?

Try running paddle.utils.run_check() and look at the output log. It is most likely still a CUDA problem.

fanxiaochen-123 commented 3 years ago

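@lfxx My kernel shows that driver 460 can be installed, but I cannot install a lower driver version. I tried downgrading CUDA 11.2 to CUDA 11.0, and nvcc -V reports it correctly, but I still get the same error.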

import paddle
W0903 14:54:33.872156 29663 init.cc:141] Compiled with WITH_GPU, but no GPU found in runtime.
/home/dfst/venv3/lib/python3.6/site-packages/paddle/fluid/framework.py:301: UserWarning: You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default.
  "You are using GPU version Paddle, but your CUDA device is not set properly. CPU device will be used by default."
paddle.utils.run_check()
Running verify PaddlePaddle program ...
WARNING:root:You are using GPU version PaddlePaddle, but there is no GPU detected on your machine. Maybe CUDA devices is not set properly. Original Error is (External) Cuda error(999), unknown error. [Advise: Please search for the error code(999) on website( https://docs.nvidia.com/cuda/archive/9.0/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038 ) to get Nvidia's official solution about CUDA Error.] (at /paddle/paddle/fluid/platform/gpu_info.cc:99)

Traceback (most recent call last):
  File "", line 1, in
  File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/utils/install_check.py", line 196, in run_check
    _run_static_single(use_cuda)
  File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/utils/install_check.py", line 124, in _run_static_single
    exe.run(startup_prog)
  File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/fluid/executor.py", line 1110, in run
    six.reraise(*sys.exc_info())
  File "/home/dfst/venv3/lib/python3.6/site-packages/six.py", line 719, in reraise
    raise value
  File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/fluid/executor.py", line 1108, in run
    return_merged=return_merged)
  File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/fluid/executor.py", line 1239, in _run_impl
    use_program_cache=use_program_cache)
  File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/fluid/executor.py", line 1329, in _run_program
    [fetch_var_name])
OSError: In user code:

File "<stdin>", line 1, in <module>

File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/utils/install_check.py", line 196, in run_check
  _run_static_single(use_cuda)
File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/utils/install_check.py", line 118, in _run_static_single
  input, out, weight = _simple_network()
File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/utils/install_check.py", line 35, in _simple_network
  attr=paddle.ParamAttr(initializer=paddle.nn.initializer.Constant(0.1)))
File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/fluid/layers/tensor.py", line 138, in create_parameter
  default_initializer)
File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/fluid/layer_helper_base.py", line 380, in create_parameter
  **attr._to_kwargs(with_initializer=True))
File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2895, in create_parameter
  initializer(param, self)
File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/fluid/initializer.py", line 166, in __call__
  stop_gradient=True)
File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2942, in append_op
  attrs=kwargs.get("attrs", None))
File "/home/dfst/venv3/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2014, in __init__
  for frame in traceback.extract_stack():

ExternalError:  Cuda error(999), unknown error.
  [Advise: Please search for the error code(999) on website( https://docs.nvidia.com/cuda/archive/9.0/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038 ) to get Nvidia's official solution about CUDA Error.] (at /paddle/paddle/fluid/platform/gpu_info.cc:99)
  [operator < fill_constant > error]

lfxx commented 3 years ago

ExternalError: Cuda error(999), unknown error.

This looks unrelated to PaddlePaddle; something has gone wrong in CUDA itself. First reboot and see whether it works normally. If it still doesn't, try:

sudo rmmod nvidia_uvm
sudo modprobe nvidia_uvm

fanxiaochen-123 commented 3 years ago

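@lfxx Other code runs fine with the current CUDA. I have now installed CUDA 10.1 while the driver is still 460, because the dependency packages for driver 418 (the one matching CUDA 10.1) are not supported. I still get the same error.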

fanxiaochen-123 commented 3 years ago

@lfxx It works now, after running sudo modprobe nvidia_uvm.
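Since reloading nvidia_uvm is what fixed it here, a quick way to see whether the module is in place before starting training is to look at /proc/modules and the /dev/nvidia-uvm device node. A minimal sketch for Linux follows; the function names are hypothetical, and a missing device node is only a hint, not proof, that CUDA initialization will fail.

```python
import os


def read_proc_modules(path="/proc/modules"):
    """Return the kernel's loaded-module list as text, or '' if unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


def nvidia_uvm_status(modules_text, dev_path="/dev/nvidia-uvm"):
    """Report whether nvidia_uvm is loaded and whether its device node exists."""
    # The first whitespace-separated field of each line is the module name.
    loaded = {line.split()[0] for line in modules_text.splitlines() if line.strip()}
    return {
        "module_loaded": "nvidia_uvm" in loaded,
        "device_node": os.path.exists(dev_path),
    }


if __name__ == "__main__":
    print(nvidia_uvm_status(read_proc_modules()))
```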

lfxx commented 3 years ago

@lfxx It works now, after running sudo modprobe nvidia_uvm.

ok

fanxiaochen-123 commented 3 years ago

@lfxx Thanks a lot!

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.