PaddlePaddle / Paddle

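PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle 『飞桨』: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)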
http://www.paddlepaddle.org/
Apache License 2.0
22.29k stars 5.62k forks

Importing paddle and torch together fails with a cuDNN-not-found error #66947

Open lyxxxyl opened 3 months ago

lyxxxyl commented 3 months ago

Describe the Bug

I hit the same problem as 66669#: importing torch and paddle in the same script raises an error.

Traceback (most recent call last):
  File "D:\Project\PyC\demo.py", line 117, in <module>
    import torch
  File "D:\Python3\lib\site-packages\torch\__init__.py", line 122, in <module>
    raise err
OSError: [WinError 127] The specified procedure could not be found. Error loading "D:\Python3\lib\site-packages\torch\lib\cudnn_adv_train64_8.dll" or one of its dependencies.

This is probably similar to this issue: the error occurs whichever of torch or paddle is imported first. My environment is CUDA 11.8, cuDNN 8.6.0, Python 3.8.0.

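paddle was installed with the command from the official website. I previously used paddlepaddle 2.6-cuda11.7, which raised a GPU-occupied error when imported together with pytorch, the same as 142#. Following the method described there, I reinstalled paddle with: python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu118/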

Additional Supplementary Information

No response

lyxxxyl commented 3 months ago

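Used on their own, both libraries work fine and can use GPU acceleration, but the cuDNN version each one reports differs from the one installed on my system. The system cuDNN is cudnn-windows-x86_64-8.6.0.163_cuda11-archive, installed as required by the Paddle official website.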
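
One way to see why the reported versions can differ from the system install is to check which cuDNN DLLs ship inside the Python packages themselves. The following is a minimal sketch, not taken from this thread: the two directories are assumptions based on the paths in the tracebacks here (`torch\lib` and `paddle\..\nvidia\cudnn\bin`), and the helper names `find_cudnn_dlls` and `package_dir` are hypothetical. It locates the packages without importing them, so it does not trigger the DLL loading error.

```python
import importlib.util
from pathlib import Path


def find_cudnn_dlls(directory):
    """Return sorted names of cuDNN DLLs found directly inside `directory`."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("cudnn*.dll"))


def package_dir(name):
    """Locate an installed top-level package's directory without importing it."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.origin:
        return None
    return Path(spec.origin).parent


if __name__ == "__main__":
    candidates = {}
    torch_dir = package_dir("torch")
    if torch_dir is not None:
        # Assumed location, taken from the torch traceback in this thread.
        candidates["torch"] = torch_dir / "lib"
    paddle_dir = package_dir("paddle")
    if paddle_dir is not None:
        # Assumed location, taken from the paddle traceback in this thread.
        candidates["paddle"] = paddle_dir.parent / "nvidia" / "cudnn" / "bin"
    for owner, folder in candidates.items():
        print(owner, folder, find_cudnn_dlls(folder))
```

If both folders contain DLLs with the same file names, such as `cudnn_adv_train64_8.dll`, the two packages each carry their own cuDNN 8.x build, separate from the system-wide install.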

Verifying that the paddle installation is fine:

import paddle
paddle.utils.run_check()

Output:

Running verify PaddlePaddle program ... 
I0802 10:53:56.846289  7704 program_interpreter.cc:243] New Executor is Running.
W0802 10:53:56.846289  7704 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.2, Runtime API Version: 11.8
W0802 10:53:56.846289  7704 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
I0802 10:53:58.990527  7704 interpreter_util.cc:647] Standalone Executor is Used.
PaddlePaddle works well on 1 GPU.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.

PaddleOCR also works normally.

Verifying torch:

import torch
print(torch.cuda.is_available())
print(torch.backends.cudnn.is_available())
print(torch.cuda_version)
print(torch.backends.cudnn.version())

Output:

True
True
11.8
8700

Neither cuDNN version matches the cuDNN 8.6.0 installed on my system. Is it really as the comment in 604# says? How can this be resolved?
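
For comparing the numbers above: `torch.backends.cudnn.version()` returns a single integer, while the paddle log prints `major.minor`. A minimal sketch for decoding the integer follows. It assumes the cuDNN 8.x encoding `major * 1000 + minor * 100 + patch`; cuDNN 9 and later use a different encoding that this does not handle. The function name `decode_cudnn_version` is hypothetical.

```python
def decode_cudnn_version(version):
    """Split an integer cuDNN 8.x version into (major, minor, patch).

    Assumes the encoding major * 1000 + minor * 100 + patch.
    """
    major, rest = divmod(version, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


# 8700 is the value torch reported above.
print(decode_cudnn_version(8700))
```

Under that assumption, 8700 decodes to 8.7.0. torch is therefore using cuDNN 8.7.0 and paddle reports 8.9, and neither is the system's 8.6.0.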

wangna11BD commented 3 months ago

See this issue: https://github.com/PaddlePaddle/Paddle/issues/56812. Could you try switching paddle to paddle-gpu 2.4.0?

lyxxxyl commented 3 months ago

The error posted above is from importing paddle first and then torch. Importing torch first and then paddle gives the following error:

Traceback (most recent call last):
  File "D:\Project\PyC\demo.py", line 115, in <module>
    import paddle
  File "D:\Python3\lib\site-packages\paddle\__init__.py", line 721, in <module>
    raise err
OSError: [WinError 127] The specified procedure could not be found. Error loading "D:\Python3\lib\site-packages\paddle\..\nvidia\cudnn\bin\cudnn_adv_infer64_8.dll" or one of its dependencies.
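
Since the failing library depends on import order, it can help to test each order in a fresh interpreter so that one attempt does not affect the next. The following is a minimal sketch, not from this thread; the helper name `try_import_order` is hypothetical, and it only reports the last line of stderr for each order.

```python
import subprocess
import sys


def try_import_order(modules):
    """Import `modules` in the given order in a fresh interpreter.

    Returns (ok, last_stderr_line) for that order.
    """
    code = "\n".join(f"import {name}" for name in modules)
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        errors="replace",  # tolerate localized error text in stderr
    )
    lines = result.stderr.strip().splitlines()
    return result.returncode == 0, (lines[-1] if lines else "")


if __name__ == "__main__":
    for order in (["paddle", "torch"], ["torch", "paddle"]):
        ok, detail = try_import_order(order)
        print(" -> ".join(order), "OK" if ok else f"FAILED: {detail}")
```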

lyxxxyl commented 3 months ago

"See issue #56812. Could you try switching paddle to paddle-gpu 2.4.0?"

That approach does work. I previously used paddle-gpu 2.4.2-post117 with torch 1.13.0-cu117 and it worked, although it emits a pile of warnings:

D:\software\Python38\python.exe D:\code\omniglue\demo16.py 
D:\software\Python38\lib\site-packages\pkg_resources\__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
  warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)

D:\software\Python38\lib\site-packages\pkg_resources\__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)

D:\software\Python38\lib\site-packages\pkg_resources\__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)

D:\software\Python38\lib\site-packages\pkg_resources\__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('zope')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)

I saw the method in 142# and assumed the new version had fixed this, so I tried it, and then hit this error.
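
The `DeprecationWarning` lines in the log above all come from `pkg_resources` and do not affect paddle or torch themselves. If they are only noise, they can be filtered with the standard `warnings` module. This is a minimal sketch, assuming the warnings are attributed to the `pkg_resources` module, as the file paths in the log suggest; the function name is hypothetical.

```python
import warnings


def silence_pkg_resources_warnings():
    """Ignore DeprecationWarnings attributed to the pkg_resources module."""
    warnings.filterwarnings(
        "ignore",
        category=DeprecationWarning,
        module=r"pkg_resources",
    )


# Call this before importing packages that pull in pkg_resources.
silence_pkg_resources_warnings()
```

The filter must be installed before the imports that trigger the warnings, since it only affects warnings raised afterwards.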

ChaoII commented 3 months ago

I ran into the same problem.

DoiiarX commented 3 months ago

Why can't paddle ever resolve this kind of environment problem?