PaddlePaddle / Paddle

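PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of 『飞桨』 (PaddlePaddle): high-performance single-machine and distributed training, and cross-platform deployment, for deep learning & machine learning)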
http://www.paddlepaddle.org/
Apache License 2.0
21.66k stars 5.44k forks

fluid W0224 Compiled with WITH_GPU, but no GPU found in runtime. #22749

Closed sidneyz139 closed 4 years ago

sidneyz139 commented 4 years ago

C:\Users\SidneyZ>python
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle.fluid
W0224 21:53:26.449920 16584 init.cc:127] Compiled with WITH_GPU, but no GPU found in runtime.

zhwesky2010 commented 4 years ago

@sidneyz139 Hi, have you run into a new problem? I see that yesterday's "DLL load failed" error on import paddle has been resolved. Please run nvidia-smi and check its output.

sidneyz139 commented 4 years ago

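Yes, this is a new problem. Yesterday's problem is solved and I have already closed that issue. Thank you very much for your help.

image

I do not really understand what this output means.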

sidneyz139 commented 4 years ago

C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi
Tue Feb 25 12:00:43 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 392.58                 Driver Version: 392.58                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  NVS 5400M          WDDM  | 00000000:01:00.0 N/A |                  N/A |
| N/A   44C    P0    N/A /  N/A |     54MiB /  1024MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

sidneyz139 commented 4 years ago

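My machine originally had CUDA 10.2 installed. When I suspected a DLL import problem, I additionally installed CUDA 10.0, and I added the matching cuDNN to both versions. After reinstalling Python 3.7 yesterday, which fixed the DLL problem, I adjusted PATH so that it keeps only the CUDA 10.0 paths. I have reinstalled the graphics driver before and saw no errors. The only possible issue I can think of is that my machine is a T530 with two graphics cards: Intel HD Graphics 4000 and NVIDIA NVS 5400M. How does Paddle decide which graphics card to use?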

zhwesky2010 commented 4 years ago

@sidneyz139 It may be because there are multiple cards. Try setting an environment variable: set FLAGS_selected_gpus=0,1 or set CUDA_VISIBLE_DEVICES=0,1.

zhwesky2010 commented 4 years ago

@sidneyz139 When you run it, also set GLOG_v=5 (set GLOG_v=5) and check whether a more complete error message appears. The current "but no GPU found in runtime." message is rather terse.
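The two suggestions above (restricting the visible GPUs and raising the log level) can also be applied from inside Python, as long as it happens before Paddle is imported. The following is a minimal sketch; the helper name `configure_gpu_env` is hypothetical, and it assumes the variables are read when the framework and the CUDA runtime initialise.

```python
import os


def configure_gpu_env(device_ids="0", glog_level=None):
    """Set GPU-related environment variables for the current process.

    Hypothetical helper: call it before `import paddle.fluid`, on the
    assumption that the variables are read during initialisation.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = device_ids
    os.environ["FLAGS_selected_gpus"] = device_ids
    if glog_level is not None:
        os.environ["GLOG_v"] = str(glog_level)
    names = ("CUDA_VISIBLE_DEVICES", "FLAGS_selected_gpus", "GLOG_v")
    return {name: os.environ[name] for name in names if name in os.environ}


settings = configure_gpu_env("0", glog_level=5)
print(settings)
# import paddle.fluid  # import only after the variables are set
```

Setting the variables with `set` in the same cmd window before starting `python` has the same effect for that session.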

sidneyz139 commented 4 years ago

@zhouwei25 After I added the SET environment variable, it works. It was probably caused by having two graphics cards. Thank you very much.

sidneyz139 commented 4 years ago

I celebrated too early; there is still a problem.

>>> paddle.fluid.install_check.run_check()
Running Verify Paddle Program ...
2020-02-25 16:47:51,041-WARNING: You are using GPU version Paddle, But Your CUDA Device is not set properly
Original Error is


C++ Call Stacks (More useful to developers):

Windows not support stack backtrace yet.


Error Message Summary:

Error: cudaGetDeviceCount failed in paddle::platform::GetCUDADeviceCountImpl, error code : 35, Please see detail in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038: CUDA driver version is insufficient for CUDA runtime version at (D:\1.7.0\paddle\paddle\fluid\platform\gpu_info.cc:72)

0

image

zhwesky2010 commented 4 years ago

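@sidneyz139 1. The driver for this graphics card is not installed correctly. I suggest letting GeForce Experience download and install the driver automatically, because a manually installed driver may be the wrong version: http://cn.download.nvidia.com/GFE/GFEClient/3.20.2.34/GeForce_Experience_v3.20.2.34.exe 2. The graphics hardware is not installed properly. Most likely it is the first cause. Also, use set CUDA_VISIBLE_DEVICES=0; only a single card is supported on Windows.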

sidneyz139 commented 4 years ago

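A few questions:

  1. If my machine has several CUDA versions installed, can they coexist? Do I only need to change the order of the entries in PATH to switch to the version I want active?
  2. Can the Paddle builds for CUDA 10, CUDA 9, and CPU coexist? How does the system decide which Paddle build runs?
  3. If the system has several graphics cards, how do I specify which one

sidneyz139 commented 4 years ago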

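After I installed GeForce Experience, the driver version it installed automatically was 392.58. When I checked, that driver version did not seem to match CUDA 10.0; it appears to support only CUDA 9, so I tried rolling back to 9.0. Installing CUDA 9.0 (CUDA 9.0 plus patches 1/2/3/4) in turn lowered the driver version to 385.54.

How can I tell which graphics card is 0 and which is 1?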
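The driver/toolkit mismatch discussed here (error code 35, "CUDA driver version is insufficient for CUDA runtime version") can be checked with a small lookup. The sketch below is illustrative only: the minimum Windows driver versions are an assumption copied by hand from NVIDIA's CUDA release notes and should be verified there, and `newest_cuda_for_driver` is a hypothetical helper name.

```python
# Minimum Windows driver version per CUDA toolkit release.
# Assumption: values taken from NVIDIA's CUDA release notes; verify before use.
MIN_WINDOWS_DRIVER = [
    ("10.2", (441, 22)),
    ("10.1", (418, 96)),
    ("10.0", (411, 31)),
    ("9.2", (397, 44)),
    ("9.1", (391, 29)),
    ("9.0", (385, 54)),
    ("8.0", (369, 30)),
]


def newest_cuda_for_driver(driver_version):
    """Return the newest CUDA toolkit in the table that the driver can run."""
    major, minor = (int(part) for part in driver_version.split(".")[:2])
    for cuda, minimum in MIN_WINDOWS_DRIVER:  # ordered newest first
        if (major, minor) >= minimum:
            return cuda
    return None


print(newest_cuda_for_driver("392.58"))
print(newest_cuda_for_driver("385.54"))
```

Under these assumed values, driver 392.58 is too old for a CUDA 10.0 build of Paddle, which is consistent with the error 35 reported earlier in the thread.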

sidneyz139 commented 4 years ago

C:\Users\SidneyZ>python
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle
>>> import paddly.fluid
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'paddly'
>>> import paddle.fluid
>>> paddle.fluid.install_check.run_check()
Running Verify Paddle Program ...
W0225 20:29:53.582741 16840 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 21, Driver API Version: 9.0, Runtime API Version: 9.0
W0225 20:29:54.531765 16840 device_context.cc:245] device: 0, cuDNN Version: 7.6.
D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\executor.py:782: UserWarning: The following exception is not an EOF exception.
  "The following exception is not an EOF exception.")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\install_check.py", line 124, in run_check
    test_simple_exe()
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\install_check.py", line 120, in test_simple_exe
    exe0.run(startup_prog)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\executor.py", line 783, in run
    six.reraise(*sys.exc_info())
  File "D:\Program Files\Python\Python37\lib\site-packages\six.py", line 703, in reraise
    raise value
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\executor.py", line 778, in run
    use_program_cache=use_program_cache)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\executor.py", line 831, in _run_impl
    use_program_cache=use_program_cache)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\executor.py", line 905, in _run_program
    fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet:


C++ Call Stacks (More useful to developers):

Windows not support stack backtrace yet.


Python Call Stacks (More useful to users):

  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\framework.py", line 2594, in _prepend_op
    attrs=kwargs.get("attrs", None))
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\initializer.py", line 191, in __call__
    stop_gradient=True)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\framework.py", line 2476, in create_parameter
    initializer(param, self)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\layer_helper_base.py", line 353, in create_parameter
    **attr._to_kwargs(with_initializer=True))
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\dygraph\layers.py", line 113, in create_parameter
    default_initializer)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\dygraph\nn.py", line 921, in __init__
    shape=[output_dim], attr=bias_attr, dtype=dtype, is_bias=True)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\install_check.py", line 38, in __init__
    param_attr=ParamAttr(initializer=Constant(value=0.1)))
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\install_check.py", line 112, in test_simple_exe
    simple_layer0 = SimpleLayer(input_size=2)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\install_check.py", line 124, in run_check
    test_simple_exe()
  File "<stdin>", line 1, in <module>


Error Message Summary:

Error: Failed to create Cudnn handle in DeviceContext [Hint: CUDNN_STATUS_ARCH_MISMATCH] at (D:\1.7.0\paddle\paddle\fluid\platform\device_context.cc:283) [operator < fill_constant > error]

sidneyz139 commented 4 years ago

1. With SET CUDA_VISIBLE_DEVICES=1:

>>> import paddle.fluid
W0225 20:40:02.858937 15284 init.cc:127] Compiled with WITH_GPU, but no GPU found in runtime.
>>> paddle.fluid.install_check.run_check()
Running Verify Paddle Program ...
2020-02-25 20:40:32,204-WARNING: You are using GPU version Paddle, But Your CUDA Device is not set properly

Error Message Summary:

Error: cudaGetDeviceCount failed in paddle::platform::GetCUDADeviceCountImpl, error code : 38, Please see detail in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038: no CUDA-capable device is detected at (D:\1.7.0\paddle\paddle\fluid\platform\gpu_info.cc:72)

Error 38 is not explained in the documentation.

  2. With SET CUDA_VISIBLE_DEVICES=0:

    import paddle.fluid works normally, but paddle.fluid.install_check.run_check() fails:

    Error Message Summary:

    Error: Failed to create Cudnn handle in DeviceContext [Hint: CUDNN_STATUS_ARCH_MISMATCH] at (D:\1.7.0\paddle\paddle\fluid\platform\device_context.cc:283) [operator < fill_constant > error]
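One thing worth checking for CUDNN_STATUS_ARCH_MISMATCH is the compute capability that Paddle prints in the device_context log line. The sketch below parses that value from the log line quoted in this thread and compares it with a minimum; the threshold of 3.0 for cuDNN 7.x is an assumption based on NVIDIA's cuDNN documentation, and the function name is hypothetical.

```python
import re

# Assumption: cuDNN 7.x requires compute capability 3.0 or newer.
MIN_CUDNN_CAPABILITY = 30


def capability_from_log(line):
    """Extract the 'CUDA Capability' value from a device_context log line."""
    match = re.search(r"CUDA Capability: (\d+)", line)
    return int(match.group(1)) if match else None


log_line = ("W0225 20:29:53.582741 16840 device_context.cc:237] Please NOTE: "
            "device: 0, CUDA Capability: 21, Driver API Version: 9.0, "
            "Runtime API Version: 9.0")
capability = capability_from_log(log_line)
supported = capability >= MIN_CUDNN_CAPABILITY
print(capability, supported)
```

A value of 21 corresponds to compute capability 2.1, which is below the assumed minimum, so a hardware limitation is a plausible explanation alongside a broken cuDNN install.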

zhwesky2010 commented 4 years ago

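@sidneyz139 1. Multiple CUDA versions can be installed side by side; just adjust CUDA_PATH and the CUDA entries in PATH. 2. Different Paddle versions cannot be used together. Paddle is installed into Python, and only one can be installed at a time. You can run pip install -U paddlepaddle-gpu==1.6.3.post97 -i https://pypi.tuna.tsinghua.edu.cn/simple to reinstall Paddle for CUDA 9; this uninstalls the previous Paddle.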
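Switching between side-by-side CUDA installs by editing PATH, as described above, amounts to moving one toolkit's bin directory ahead of the others. A minimal sketch of that reordering follows; `prefer_cuda` is a hypothetical helper and the directory names are made up for illustration.

```python
def prefer_cuda(path_value, cuda_root, sep=";"):
    """Return a PATH string with the bin directory of cuda_root moved to the front."""
    wanted = cuda_root.rstrip("\\") + "\\bin"
    entries = [entry for entry in path_value.split(sep) if entry]
    # Drop any existing copy of the wanted entry so it appears exactly once.
    others = [entry for entry in entries
              if entry.rstrip("\\").lower() != wanted.lower()]
    return sep.join([wanted] + others)


# Hypothetical install locations, for illustration only.
old_path = r"D:\CUDA\V10.0\bin;C:\Windows\system32;D:\CUDA\V9.0\bin"
new_path = prefer_cuda(old_path, r"D:\CUDA\V9.0")
print(new_path)
```

The result would still have to be applied with `set PATH=...` (or in the system settings), and CUDA_PATH would need to point at the same root.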

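sidneyz139 commented 4 years ago

  1. I ran GeForce Experience again, which restored the driver to the highest version available for this machine, 392.58.
  2. Using the command you gave, I tried to move Paddle from 1.7.0 back to 1.6.3.post97 (CUDA 9.0), but the installation fails with an error. I switched to the Baidu mirror and got the same result. I ran pip uninstall paddlepaddle-gpu and installed again, with the same result. Is the file I downloaded corrupted, or is the problem in the unpacking on my side?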

image

C:\Program Files\NVIDIA Corporation\NVSMI>python -m pip install -U paddlepaddle-gpu==1.6.3.post97 -i https://pypi.tuna.tsinghua.edu.cn/simple
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting paddlepaddle-gpu==1.6.3.post97
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/8a/67/7e944c6e5f94894217b8a302b075b1ff8c06f54abc55cdb3de0f2c32325a/paddlepaddle_gpu-1.6.3.post97-cp37-cp37m-win_amd64.whl (281.4 MB)
     |████████████████████████████████| 281.4 MB 51 kB/s
ERROR: Exception:
Traceback (most recent call last):
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\cli\base_command.py", line 186, in _main
    status = self.run(options, args)
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\commands\install.py", line 331, in run
    resolver.resolve(requirement_set)
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\legacy_resolve.py", line 177, in resolve
    discovered_reqs.extend(self._resolve_one(requirement_set, req))
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\legacy_resolve.py", line 333, in _resolve_one
    abstract_dist = self._get_abstract_dist_for(req_to_install)
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\legacy_resolve.py", line 282, in _get_abstract_dist_for
    abstract_dist = self.preparer.prepare_linked_requirement(req)
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\operations\prepare.py", line 482, in prepare_linked_requirement
    hashes=hashes,
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\operations\prepare.py", line 287, in unpack_url
    hashes=hashes,
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\operations\prepare.py", line 164, in unpack_http_url
    unpack_file(from_path, location, content_type)
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\utils\unpacking.py", line 252, in unpack_file
    flatten=not filename.endswith('.whl')
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\utils\unpacking.py", line 139, in unzip_file
    shutil.copyfileobj(fp, destfp)
  File "D:\Program Files\Python\Python37\lib\shutil.py", line 79, in copyfileobj
    buf = fsrc.read(length)
  File "D:\Program Files\Python\Python37\lib\zipfile.py", line 930, in read
    data = self._read1(n)
  File "D:\Program Files\Python\Python37\lib\zipfile.py", line 1006, in _read1
    data = self._decompressor.decompress(data, n)
zlib.error: Error -3 while decompressing data: invalid block type

zhwesky2010 commented 4 years ago

@sidneyz139 Did the error already occur during the pip install itself? Uninstall the existing package first.

zhwesky2010 commented 4 years ago

@sidneyz139 Does pip uninstall complete successfully?

sidneyz139 commented 4 years ago

Last night I used pip uninstall to remove most third-party packages and then tried to install Paddle, without success. Just now I uninstalled Python entirely and reinstalled it. The error is still the same.

C:\Users\SidneyZ>python --version
Python 3.7.6

C:\Users\SidneyZ>pip list
Package    Version
---------- -------
pip        19.2.3
setuptools 41.2.0
WARNING: You are using pip version 19.2.3, however version 20.0.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

C:\Users\SidneyZ>python -m ensurepip --upgrade
Looking in links: c:\Users\SidneyZ\AppData\Local\Temp\tmp_y95z8c1
Requirement already up-to-date: setuptools in d:\program files\python\python37\lib\site-packages (41.2.0)
Requirement already up-to-date: pip in d:\program files\python\python37\lib\site-packages (19.2.3)

C:\Users\SidneyZ>python -c "import platform;print(platform.architecture()[0]);print(platform.machine())"
64bit
AMD64

C:\Users\SidneyZ>python -m pip install paddlepaddle-gpu==1.6.3.post97 -i https://mirror.baidu.com/pypi/simple
Looking in indexes: https://mirror.baidu.com/pypi/simple
Collecting paddlepaddle-gpu==1.6.3.post97
  Using cached https://mirror.baidu.com/pypi/packages/8a/67/7e944c6e5f94894217b8a302b075b1ff8c06f54abc55cdb3de0f2c32325a/paddlepaddle_gpu-1.6.3.post97-cp37-cp37m-win_amd64.whl
ERROR: Exception:
Traceback (most recent call last):
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\cli\base_command.py", line 188, in main
    status = self.run(options, args)
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\commands\install.py", line 345, in run
    resolver.resolve(requirement_set)
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\legacy_resolve.py", line 196, in resolve
    self._resolve_one(requirement_set, req)
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\legacy_resolve.py", line 359, in _resolve_one
    abstract_dist = self._get_abstract_dist_for(req_to_install)
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\legacy_resolve.py", line 307, in _get_abstract_dist_for
    self.require_hashes
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\operations\prepare.py", line 199, in prepare_linked_requirement
    progress_bar=self.progress_bar
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\download.py", line 1064, in unpack_url
    progress_bar=progress_bar
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\download.py", line 928, in unpack_http_url
    unpack_file(from_path, location, content_type, link)
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\utils\misc.py", line 735, in unpack_file
    flatten=not filename.endswith('.whl')
  File "D:\Program Files\Python\Python37\lib\site-packages\pip\_internal\utils\misc.py", line 631, in unzip_file
    shutil.copyfileobj(fp, destfp)
  File "D:\Program Files\Python\Python37\lib\shutil.py", line 79, in copyfileobj
    buf = fsrc.read(length)
  File "D:\Program Files\Python\Python37\lib\zipfile.py", line 930, in read
    data = self._read1(n)
  File "D:\Program Files\Python\Python37\lib\zipfile.py", line 1006, in _read1
    data = self._decompressor.decompress(data, n)
zlib.error: Error -3 while decompressing data: invalid block type
WARNING: You are using pip version 19.2.3, however version 20.0.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

sidneyz139 commented 4 years ago

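As the log above shows, Python has been reinstalled and pip list is almost empty; I also deleted the Python directory by hand. Is pip's cache directory under the python37 directory, or somewhere else in Windows? I had previously set CUDA_VISIBLE_DEVICES with SET on my machine; I have removed it now, and the behaviour is unchanged. One more possibility: my machine also had a Python that was installed together with VS2019, which I uninstalled through the Windows Store. However, where python still shows that directory, although there is nothing left in it. That should not have any effect any more, right?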

sidneyz139 commented 4 years ago

I uninstalled the Python component of VS2019 and rebooted, but where python still returns the directory containing the Microsoft WindowsApps program. How can I completely remove the Python installed by VS2019?

The C:\Users\SidneyZ\AppData\Local\Microsoft\WindowsApps\ directory and the system TEMP directory cannot be deleted even with administrator rights.

C:\Users\SidneyZ\AppData\Local\Microsoft\WindowsApps>where python
C:\Users\SidneyZ\AppData\Local\Microsoft\WindowsApps\python.exe
D:\Program Files\Python\Python37\python.exe

C:\Users\SidneyZ\AppData\Local\Microsoft\WindowsApps>dir
 Volume in drive C is Win 10 Pro x64
 Volume Serial Number is 3EB7-3F09

 Directory of C:\Users\SidneyZ\AppData\Local\Microsoft\WindowsApps

2020/02/24  21:22    <DIR>          Backup
2020/02/07  14:23                 0 GameBarElevatedFT_Alias.exe
2020/02/24  21:22    <DIR>          Microsoft.DesktopAppInstaller_8wekyb3d8bbwe
2020/01/28  00:10    <DIR>          Microsoft.MicrosoftEdge_8wekyb3d8bbwe
2020/02/07  14:23    <DIR>          Microsoft.XboxGamingOverlay_8wekyb3d8bbwe
2020/01/28  00:10                 0 MicrosoftEdge.exe
2020/02/24  21:22                 0 python.exe
2020/02/24  21:22                 0 python3.exe
               4 File(s)              0 bytes
               4 Dir(s)  23,194,181,632 bytes free

zhwesky2010 commented 4 years ago

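@sidneyz139 You do not have to force-uninstall that one. As long as the Python37 directory comes first, it takes effect automatically, because the lookup walks down PATH in order. My computer also has both the VS one and Python37, and that causes no problems.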
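The "first match on PATH wins" rule described above can be illustrated with a few lines of Python that list every match in PATH order, similar in spirit to `where`. This is a simplified sketch: `find_all` is a hypothetical helper, it ignores PATHEXT, and the demo uses temporary directories rather than the real install locations.

```python
import os
import tempfile


def find_all(executable, path_value, sep=os.pathsep):
    """List every match for `executable`, in the order PATH is searched."""
    matches = []
    for directory in path_value.split(sep):
        candidate = os.path.join(directory, executable)
        if directory and os.path.isfile(candidate):
            matches.append(candidate)
    return matches


with tempfile.TemporaryDirectory() as root:
    # Two stand-in directories that both contain a python.exe.
    first_dir = os.path.join(root, "Python37")
    second_dir = os.path.join(root, "WindowsApps")
    for directory in (first_dir, second_dir):
        os.makedirs(directory)
        open(os.path.join(directory, "python.exe"), "w").close()
    found = find_all("python.exe", os.pathsep.join([first_dir, second_dir]))

print(found[0])  # the entry that would be used when typing `python`
```

In the `set` output later in this thread, the Python37 directories appear before the WindowsApps entry, which matches the advice that the extra stub does no harm.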

sidneyz139 commented 4 years ago

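I suspect that last night's download hit a network problem and the package cached on my machine is damaged, so every install attempt fails. Normally, though, a damaged file should be caught by a checksum check during installation and downloaded again; in practice every attempt says "Using cached". python -m pip install -U paddlepaddle-gpu==1.6.3.post97 -i https://mirror.baidu.com/pypi/simple fails no matter what. I have now run python -m pip install -U paddlepaddle-gpu==1.7.0.post97 -i https://mirror.baidu.com/pypi/simple and it installed successfully. The behaviour is back to where it was the night before last.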
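A wheel is a zip archive, so the suspicion of a damaged cached download can be tested directly with the standard library. The sketch below defines a hypothetical `wheel_is_intact` helper and exercises it on two throwaway files; it does not touch pip's real cache. If the cached file does turn out to be damaged, pip's `--no-cache-dir` option is one way to force a fresh download.

```python
import os
import tempfile
import zipfile
import zlib


def wheel_is_intact(path):
    """Return True if every archive member decompresses and passes its CRC check."""
    try:
        with zipfile.ZipFile(path) as archive:
            return archive.testzip() is None
    except (zipfile.BadZipFile, zlib.error, OSError):
        return False


with tempfile.TemporaryDirectory() as tmp:
    # A small valid archive standing in for a good wheel.
    good = os.path.join(tmp, "good.whl")
    with zipfile.ZipFile(good, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("pkg/__init__.py", "x = 1\n")
    # A file with arbitrary bytes standing in for a damaged download.
    bad = os.path.join(tmp, "bad.whl")
    with open(bad, "wb") as handle:
        handle.write(b"not a zip archive")
    good_ok = wheel_is_intact(good)
    bad_ok = wheel_is_intact(bad)

print(good_ok, bad_ok)
```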

sidneyz139 commented 4 years ago

With set CUDA_VISIBLE_DEVICES=0: Error: Failed to create Cudnn handle in DeviceContext

C:\Users\SidneyZ>python
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle.fluid
>>> paddle.fluid.install_check.run_check()
Running Verify Paddle Program ...
W0226 21:51:47.010613 15012 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 21, Driver API Version: 9.1, Runtime API Version: 9.0
W0226 21:51:47.411487 15012 device_context.cc:245] device: 0, cuDNN Version: 7.6.
D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\executor.py:782: UserWarning: The following exception is not an EOF exception.
  "The following exception is not an EOF exception.")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\install_check.py", line 124, in run_check
    test_simple_exe()
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\install_check.py", line 120, in test_simple_exe
    exe0.run(startup_prog)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\executor.py", line 783, in run
    six.reraise(*sys.exc_info())
  File "D:\Program Files\Python\Python37\lib\site-packages\six.py", line 703, in reraise
    raise value
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\executor.py", line 778, in run
    use_program_cache=use_program_cache)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\executor.py", line 831, in _run_impl
    use_program_cache=use_program_cache)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\executor.py", line 905, in _run_program
    fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet:


C++ Call Stacks (More useful to developers):

Windows not support stack backtrace yet.


Python Call Stacks (More useful to users):

  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\framework.py", line 2594, in _prepend_op
    attrs=kwargs.get("attrs", None))
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\initializer.py", line 191, in __call__
    stop_gradient=True)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\framework.py", line 2476, in create_parameter
    initializer(param, self)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\layer_helper_base.py", line 353, in create_parameter
    **attr._to_kwargs(with_initializer=True))
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\dygraph\layers.py", line 113, in create_parameter
    default_initializer)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\dygraph\nn.py", line 921, in __init__
    shape=[output_dim], attr=bias_attr, dtype=dtype, is_bias=True)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\install_check.py", line 38, in __init__
    param_attr=ParamAttr(initializer=Constant(value=0.1)))
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\install_check.py", line 112, in test_simple_exe
    simple_layer0 = SimpleLayer(input_size=2)
  File "D:\Program Files\Python\Python37\lib\site-packages\paddle\fluid\install_check.py", line 124, in run_check
    test_simple_exe()
  File "<stdin>", line 1, in <module>


Error Message Summary:

Error: Failed to create Cudnn handle in DeviceContext [Hint: CUDNN_STATUS_ARCH_MISMATCH] at (D:\1.7.0\paddle\paddle\fluid\platform\device_context.cc:283) [operator < fill_constant > error]

sidneyz139 commented 4 years ago

With set CUDA_VISIBLE_DEVICES=1:
W0226 22:01:12.924293 6132 init.cc:127] Compiled with WITH_GPU, but no GPU found in runtime.
Error: cudaGetDeviceCount failed in paddle::platform::GetCUDADeviceCountImpl, error code : 38, Please see detail in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038: no CUDA-capable device is detected at (D:\1.7.0\paddle\paddle\fluid\platform\gpu_info.cc:72)

C:\Users\SidneyZ>set CUDA_VISIBLE_DEVICES=1

C:\Users\SidneyZ>set
ALLUSERSPROFILE=C:\ProgramData
APPDATA=C:\Users\SidneyZ\AppData\Roaming
CommonProgramFiles=C:\Program Files\Common Files
CommonProgramFiles(x86)=C:\Program Files (x86)\Common Files
CommonProgramW6432=C:\Program Files\Common Files
COMPUTERNAME=T530-SIDNEY
ComSpec=C:\Windows\system32\cmd.exe
CUDA_PATH=D:\Program Files\NvidiaGPUToolkit\CUDA\V9.0
CUDA_PATH_V10_0=D:\Program Files\NvidiaGPUToolkit\CUDA\V10.0
CUDA_PATH_V10_2=D:\Program Files\NvidiaGPUToolkit\CUDA\V10.2
CUDA_PATH_V9_0=D:\Program Files\NvidiaGPUToolkit\CUDA\V9.0
CUDA_VISIBLE_DEVICES=1
DriverData=C:\Windows\System32\Drivers\DriverData
HOMEDRIVE=C:
HOMEPATH=\Users\SidneyZ
LOCALAPPDATA=C:\Users\SidneyZ\AppData\Local
LOGONSERVER=\\T530-SIDNEY
MOZ_PLUGIN_PATH=D:\Program Files (x86)\Foxit Software\Foxit Reader\plugins\
NUMBER_OF_PROCESSORS=8
NVCUDASAMPLES10_0_ROOT=D:\Program Files\NvidiaGPUToolkit\CUDA\Sample\V10.0
NVCUDASAMPLES10_2_ROOT=D:\Program Files\NvidiaGPUToolkit\CUDA\Sample\V10.2
NVCUDASAMPLES9_0_ROOT=D:\Program Files\NvidiaGPUToolkit\CUDA\Sample\V9.0
NVCUDASAMPLES_ROOT=D:\Program Files\NvidiaGPUToolkit\CUDA\Sample\V9.0
NVTOOLSEXT_PATH=C:\Program Files\NVIDIA Corporation\NvToolsExt\
OneDrive=C:\Users\SidneyZ\OneDrive
OS=Windows_NT
Path=D:\Program Files\NvidiaGPUToolkit\CUDA\V9.0\bin;D:\Program Files\NvidiaGPUToolkit\CUDA\V9.0\libnvvp;D:\Program Files\Python\Python37\Scripts\;D:\Program Files\Python\Python37\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;"C:\Windows\System32\WindowsPowerShell\v1.0\;";"C:\Windows\System32\OpenSSH\;";"C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;";C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR;C:\Users\SidneyZ\AppData\Local\Microsoft\WindowsAppsDEL;
PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.PY;.PYW
PROCESSOR_ARCHITECTURE=AMD64
PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
PROCESSOR_LEVEL=6
PROCESSOR_REVISION=3a09
ProgramData=C:\ProgramData
ProgramFiles=C:\Program Files
ProgramFiles(x86)=C:\Program Files (x86)
ProgramW6432=C:\Program Files
PROMPT=$P$G
PSModulePath=C:\Program Files\WindowsPowerShell\Modules;C:\Windows\system32\WindowsPowerShell\v1.0\Modules
PUBLIC=C:\Users\Public
SESSIONNAME=Console
SystemDrive=C:
SystemRoot=C:\Windows
TEMP=C:\Users\SidneyZ\AppData\Local\Temp
TMP=C:\Users\SidneyZ\AppData\Local\Temp
USERDOMAIN=T530-SIDNEY
USERDOMAIN_ROAMINGPROFILE=T530-SIDNEY
USERNAME=SidneyZ
USERPROFILE=C:\Users\SidneyZ
windir=C:\Windows

C:\Users\SidneyZ>python
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle.fluid
W0226 22:01:12.924293 6132 init.cc:127] Compiled with WITH_GPU, but no GPU found in runtime.
>>> paddle.fluid.install_check.run_check()
Running Verify Paddle Program ...
2020-02-26 22:01:43,185-WARNING: You are using GPU version Paddle, But Your CUDA Device is not set properly
Original Error is


C++ Call Stacks (More useful to developers):

Windows not support stack backtrace yet.


Error Message Summary:

Error: cudaGetDeviceCount failed in paddle::platform::GetCUDADeviceCountImpl, error code : 38, Please see detail in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038: no CUDA-capable device is detected at (D:\1.7.0\paddle\paddle\fluid\platform\gpu_info.cc:72)

0
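Error 38 ("no CUDA-capable device is detected") with CUDA_VISIBLE_DEVICES=1 is what one would expect if index 1 does not name any CUDA device on this machine. The following simplified model of the documented CUDA_VISIBLE_DEVICES behaviour shows why; it handles integer indices only, and `visible_devices` is a hypothetical name, not a CUDA or Paddle API.

```python
def visible_devices(physical_count, env_value):
    """Model which physical GPU indices stay visible for a CUDA_VISIBLE_DEVICES value.

    Simplified: integer indices only; enumeration stops at the first entry
    that does not name an existing device.
    """
    visible = []
    for token in env_value.split(","):
        token = token.strip()
        if not token.isdigit() or int(token) >= physical_count:
            break
        visible.append(int(token))
    return visible


# One CUDA-capable GPU, as reported by nvidia-smi in this thread.
print(visible_devices(1, "0"))    # the single NVIDIA card stays visible
print(visible_devices(1, "1"))    # nothing is visible
print(visible_devices(1, "0,1"))  # only the valid leading index survives
```

Under this model, the Intel adapter never receives a CUDA index, so 0 is the only meaningful value on a machine with one NVIDIA GPU.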

sidneyz139 commented 4 years ago

C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi
Wed Feb 26 22:05:49 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 392.58                 Driver Version: 392.58                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  NVS 5400M          WDDM  | 00000000:01:00.0 N/A |                  N/A |
| N/A   44C    P0    N/A /  N/A |     54MiB /  1024MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

image

sidneyz139 commented 4 years ago

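Under normal conditions, what should the numbers in the red circle be? Why is my GPU index 0 at the front, while the graphics card listed after it, NVS 5400M, is correct? What should the place marked by the red arrow look like?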

image

sidneyz139 commented 4 years ago

image

zhwesky2010 commented 4 years ago

@sidneyz139 image This one is mine. Perhaps your graphics card does not support cuDNN, or cuDNN is not installed properly. Try a different CUDA version.

sidneyz139 commented 4 years ago

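I previously tried CUDA 10.2 and 10.0, and finally switched to CUDA 9.0. I disabled the Intel graphics card and uninstalled its driver, and the error is still the same. I also ran nvidia-smi -q and got a pile of N/A values; I do not know whether that is normal. The card's display function appears to work, and NVIDIA's own GPU programs also seem to work.

Is there any other way to check CUDA and cuDNN?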

C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi -L
GPU 0: NVS 5400M (UUID: GPU-8f8ab9b2-62a3-92ec-d8c9-e5275bb25861)
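The `nvidia-smi -L` listing also answers the earlier question of which card is index 0: it prints one line per NVIDIA GPU with its index. A small sketch that parses such output is shown below; `parse_gpu_list` is a hypothetical helper, and the sample text is the line from this thread.

```python
import re


def parse_gpu_list(text):
    """Parse `nvidia-smi -L` output into (index, name, uuid) tuples."""
    pattern = re.compile(r"GPU (\d+): (.+?) \(UUID: (GPU-[0-9a-f-]+)\)")
    return [(int(index), name, uuid)
            for index, name, uuid in pattern.findall(text)]


sample = "GPU 0: NVS 5400M (UUID: GPU-8f8ab9b2-62a3-92ec-d8c9-e5275bb25861)"
gpus = parse_gpu_list(sample)
for index, name, uuid in gpus:
    print(index, name, uuid)
```

Only one entry is listed here, so on this machine index 0 refers to the NVS 5400M and there is no index 1.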

C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi -q

==============NVSMI LOG==============

Timestamp : Wed Feb 26 22:35:01 2020 Driver Version : 392.58

Attached GPUs : 1 GPU 00000000:01:00.0 Product Name : NVS 5400M Product Brand : GeForce Display Mode : N/A Display Active : N/A Persistence Mode : N/A Accounting Mode : N/A Accounting Mode Buffer Size : N/A Driver Model Current : WDDM Pending : WDDM Serial Number : N/A GPU UUID : GPU-8f8ab9b2-62a3-92ec-d8c9-e5275bb25861 Minor Number : N/A VBIOS Version : 70.08.A8.03.02 MultiGPU Board : N/A Board ID : N/A GPU Part Number : N/A Inforom Version Image Version : N/A OEM Object : N/A ECC Object : N/A Power Management Object : N/A GPU Operation Mode Current : N/A Pending : N/A GPU Virtualization Mode Virtualization mode : N/A PCI Bus : 0x01 Device : 0x00 Domain : 0x0000 Device Id : 0x0DEF10DE Bus Id : 00000000:01:00.0 Sub System Id : 0x21F517AA GPU Link Info PCIe Generation Max : N/A Current : N/A Link Width Max : N/A Current : N/A Bridge Chip Type : N/A Firmware : N/A Replays since reset : N/A Tx Throughput : N/A Rx Throughput : N/A Fan Speed : N/A Performance State : P0 Clocks Throttle Reasons : N/A FB Memory Usage Total : 1024 MiB Used : 54 MiB Free : 970 MiB BAR1 Memory Usage Total : N/A Used : N/A Free : N/A Compute Mode : Default Utilization Gpu : N/A Memory : N/A Encoder : N/A Decoder : N/A Encoder Stats Active Sessions : N/A Average FPS : N/A Average Latency : N/A Ecc Mode Current : N/A Pending : N/A ECC Errors Volatile Single Bit Device Memory : N/A Register File : N/A L1 Cache : N/A L2 Cache : N/A Texture Memory : N/A Texture Shared : N/A CBU : N/A Total : N/A Double Bit Device Memory : N/A Register File : N/A L1 Cache : N/A L2 Cache : N/A Texture Memory : N/A Texture Shared : N/A CBU : N/A Total : N/A Aggregate Single Bit Device Memory : N/A Register File : N/A L1 Cache : N/A L2 Cache : N/A Texture Memory : N/A Texture Shared : N/A CBU : N/A Total : N/A Double Bit Device Memory : N/A Register File : N/A L1 Cache : N/A L2 Cache : N/A Texture Memory : N/A Texture Shared : N/A CBU : N/A Total : N/A Retired Pages Single Bit ECC : N/A Double Bit ECC : N/A Pending : 
N/A Temperature GPU Current Temp : 48 C GPU Shutdown Temp : N/A GPU Slowdown Temp : N/A GPU Max Operating Temp : N/A Memory Current Temp : N/A Memory Max Operating Temp : N/A Power Readings Power Management : N/A Power Draw : N/A Power Limit : N/A Default Power Limit : N/A Enforced Power Limit : N/A Min Power Limit : N/A Max Power Limit : N/A Clocks Graphics : N/A SM : N/A Memory : N/A Video : N/A Applications Clocks Graphics : N/A Memory : N/A Default Applications Clocks Graphics : N/A Memory : N/A Max Clocks Graphics : N/A SM : N/A Memory : N/A Video : N/A Max Customer Boost Clocks Graphics : N/A Clock Policy Auto Boost : N/A Auto Boost Default : N/A Processes : N/A

sidneyz139 commented 4 years ago

@zhouwei25 You guessed right. NVIDIA replied that the card supports only CUDA 8.0. Does that mean I can only use the CPU version of Paddle?

image

sidneyz139 commented 4 years ago

The CPU version has installed successfully, but what does this WARNING mean? Do I need to do anything?
W0227 11:27:05.060984 16664 fuse_all_reduce_op_pass.cc:74] Find all_reduce operators: 2. To make the speed faster, some all_reduce ops are fused during training, after fusion, the number of all_reduce ops is 1.

C:\Users\SidneyZ>python
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle.fluid
>>> paddle.fluid.install_check.run_check()
Running Verify Paddle Program ...
Your Paddle works well on SINGLE GPU or CPU.
I0227 11:27:05.059986 16664 parallel_executor.cc:440] The Program will be executed on CPU using ParallelExecutor, 2 cards are used, so 2 programs are executed in parallel.
W0227 11:27:05.060984 16664 fuse_all_reduce_op_pass.cc:74] Find all_reduce operators: 2. To make the speed faster, some all_reduce ops are fused during training, after fusion, the number of all_reduce ops is 1.
I0227 11:27:05.061985 16664 build_strategy.cc:365] SeqOnlyAllReduceOps:0, num_trainers:1
I0227 11:27:05.062989 16664 parallel_executor.cc:307] Inplace strategy is enabled, when build_strategy.enable_inplace = True
I0227 11:27:05.062989 16664 parallel_executor.cc:322] Cross op memory reuse strategy is enabled, when build_strategy.memory_optimize = True or garbage collection strategy is disabled, which is not recommended
Your Paddle works well on MUTIPLE GPU or CPU.
Your Paddle is installed successfully! Let's start deep Learning with Paddle now

zhwesky2010 commented 4 years ago

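@sidneyz139 The CPU version appears to run correctly. Those messages are just notices about op fusion and can be ignored. It seems that this graphics card is old enough that it can only use CUDA 8, and recent Paddle releases no longer ship CUDA 8 pip packages. You can try pip install paddlepaddle-gpu==1.5.2.post87, which is the last package with CUDA 8 support, released last September.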
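The package versions used in this thread (1.7.0.post97, 1.6.3.post97, 1.5.2.post87) carry a `post` suffix that the replies associate with the CUDA version. The sketch below decodes it on the assumption that the last digit is the cuDNN major version and the digits before it are the CUDA major version; that convention is inferred, not confirmed here, and `decode_post_suffix` is a hypothetical helper.

```python
import re


def decode_post_suffix(version):
    """Split a version such as '1.6.3.post97' into (release, cuda, cudnn).

    Assumption: the suffix is <CUDA major><cuDNN major>, e.g. post97 means
    CUDA 9 with cuDNN 7.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)*)\.post(\d+)(\d)", version)
    if match is None:
        return None
    release, cuda_major, cudnn_major = match.groups()
    return release, int(cuda_major), int(cudnn_major)


for name in ("1.7.0.post97", "1.6.3.post97", "1.5.2.post87"):
    print(name, decode_post_suffix(name))
```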

sidneyz139 commented 4 years ago

Got it, thank you very much!

zhwesky2010 commented 4 years ago

@sidneyz139 Hello, do you have any other questions? If not, could you please close the issue?

sidneyz139 commented 4 years ago

No, nothing else. Thank you very much for your help!