PaddlePaddle / Paddle

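PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the core framework of 『飞桨』 (PaddlePaddle): high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)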
http://www.paddlepaddle.org/
Apache License 2.0

Cudnn error, CUDNN_STATUS_NOT_INITIALIZED #34938

Closed: jinxing94 closed this issue 1 year ago

jinxing94 commented 3 years ago

Version info: Python 3.6, Paddle 2.1, CUDA 10.2, cuDNN 8.2, driver 440.33.01.

>>> paddle.set_device("cpu")
CPUPlace
>>> emb = paddle.create_parameter([3, 3], dtype="float32")
>>> prnit(emb)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'prnit' is not defined
>>> print(emb)
Parameter containing:
Tensor(shape=[3, 3], dtype=float32, place=CPUPlace, stop_gradient=False,
       [[-0.80931914, -0.00134873,  0.93233860],
        [-0.05148190, -0.27370077, -0.58787775],
        [ 0.20014048, -0.99166054,  0.60008991]])
>>> paddle.set_device("gpu")
CUDAPlace(0)
>>> import numpy as np
>>> a = np.zeros([3])
>>> pd_a = paddle.to_tensor(a)
>>> print(pd_a)
Tensor(shape=[3], dtype=float64, place=CUDAPlace(0), stop_gradient=True,
       [0., 0., 0.])
>>> emb = paddle.create_parameter([3, 3], dtype="float32")
W0816 10:46:37.607583 16306 device_context.cc:404] Please NOTE: device: 0, GPU Compute Capability: 3.5, Driver API Version: 10.2, Runtime API Version: 10.2
W0816 10:46:37.613322 16306 device_context.cc:422] device: 0, cuDNN Version: 8.2.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/Projects/.envs/kge/lib/python3.6/site-packages/paddle/fluid/layers/tensor.py", line 138, in create_parameter
    default_initializer)
  File "/root/Projects/.envs/kge/lib/python3.6/site-packages/paddle/fluid/layer_helper_base.py", line 374, in create_parameter
    **attr._to_kwargs(with_initializer=True))
  File "/root/Projects/.envs/kge/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2895, in create_parameter
    initializer(param, self)
  File "/root/Projects/.envs/kge/lib/python3.6/site-packages/paddle/fluid/initializer.py", line 572, in __call__
    stop_gradient=True)
  File "/root/Projects/.envs/kge/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2925, in append_op
    kwargs.get("stop_gradient", False))
  File "/root/Projects/.envs/kge/lib/python3.6/site-packages/paddle/fluid/dygraph/tracer.py", line 45, in trace_op
    not stop_gradient)
OSError: (External) Cudnn error, CUDNN_STATUS_NOT_INITIALIZED (at /paddle/paddle/fluid/platform/device_context.h:371)
  [operator < uniform_random > error]

wangxicoding commented 3 years ago

Could cuDNN be installed incorrectly? You can check the cuDNN version with paddle.device.get_cudnn_version(): https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/device/get_cudnn_version_cn.html#get-cudnn-version If it is not installed correctly, you can follow the installation guide on the official website: https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html
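The check suggested above can be wrapped in a small script so the result is easy to paste into an issue. This is only a sketch: `paddle_gpu_report` is a hypothetical helper name, and it relies on `paddle.__version__`, `paddle.is_compiled_with_cuda()` and the `paddle.device.get_cudnn_version()` call mentioned in this thread. It returns `None` when Paddle is not importable.

```python
import importlib


def paddle_gpu_report():
    """Return a dict describing the local Paddle GPU setup, or None if Paddle is missing.

    Hypothetical helper for illustration; not part of Paddle itself.
    """
    try:
        paddle = importlib.import_module("paddle")
    except ImportError:
        return None
    return {
        "paddle_version": paddle.__version__,
        "compiled_with_cuda": paddle.is_compiled_with_cuda(),
        # An integer such as 8202, or None when no cuDNN is detected.
        "cudnn_version": paddle.device.get_cudnn_version(),
    }


if __name__ == "__main__":
    print(paddle_gpu_report())
```

Running it inside the same virtual environment that raises the error makes sure the report describes the Paddle build that is actually failing.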

jinxing94 commented 3 years ago

paddle.device.get_cudnn_version() returns 8202. Does "CUDNN v7.6+" include cuDNN 8.2?
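To make the 8202 value easier to read, here is a minimal sketch that decodes it and compares it with the 7.6 minimum. It assumes the cuDNN 7.x/8.x encoding of major * 1000 + minor * 100 + patch (later cuDNN major versions use a different scheme); `decode_cudnn_version` and `meets_minimum` are hypothetical helper names, not Paddle APIs.

```python
def decode_cudnn_version(code):
    """Split an integer such as 8202 into (major, minor, patch).

    Assumes the cuDNN 7.x/8.x encoding: major * 1000 + minor * 100 + patch.
    """
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def meets_minimum(code, minimum=(7, 6)):
    """True when the decoded (major, minor) is at least `minimum`."""
    return decode_cudnn_version(code)[:2] >= minimum


print(decode_cudnn_version(8202))  # (8, 2, 2)
print(meets_minimum(8202))         # True
```

Passing this numeric check only answers the literal "7.6+" question. It does not show that a given Paddle wheel was built against that cuDNN major version; in this thread the error went away only after switching cuDNN to 7.6.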

jinxing94 commented 3 years ago

PyTorch runs without any problem on the same machine.

jinxing94 commented 3 years ago

Someone else has reported a similar issue: https://github.com/PaddlePaddle/Paddle/issues/34557

wangxicoding commented 3 years ago

> paddle.device.get_cudnn_version() returns 8202. Does "CUDNN v7.6+" include cuDNN 8.2?

It should be included.

wangxicoding commented 3 years ago

It works fine in my local test.

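Earlier whl packages all carried the CUDA and cuDNN version numbers in their names, but the latest ones dropped them. I don't know whether that is related. Is it convenient for you to upgrade the driver? Try upgrading the CUDA driver to CUDA 11.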

jinxing94 commented 3 years ago

The version installed via pip is 2.1.2.

wangxicoding commented 3 years ago

Try: python -m pip install paddlepaddle-gpu==2.1.2.post102 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html

jinxing94 commented 3 years ago

I couldn't find 102. I tried 101, and that doesn't work either.

wangxicoding commented 3 years ago

Switch cuDNN to version 7.6 and see whether that works 🤣

wangxicoding commented 3 years ago

Alternatively, you can use Docker directly and see whether that works: https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/docker/linux-docker.html

jinxing94 commented 3 years ago

Switching cuDNN to version 7.6 works.