tensorflow / tensorflow

An Open Source Machine Learning Framework for Everyone
https://tensorflow.org
Apache License 2.0

cudaGetDevice() failed. Status: cudaGetErrorString symbol not found. #32381

Closed thulani196 closed 5 years ago

thulani196 commented 5 years ago

After installing CUDA 10.1 and cuDNN, I get the above error when testing whether TensorFlow 2.0 can recognize my GPU. I am using a GTX 1060 on Windows 10.

I am trying to run: tf.test.is_gpu_available(cuda_only=False, min_cuda_compute_capability=None)

ymodak commented 5 years ago

TF 2.0 supports CUDA 10.0. Please switch to CUDA 10.0 and update your CUDA paths. See software requirements.
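
One way to check the "update CUDA paths" step is to look for the CUDA 10.0 runtime DLL in the directories listed in PATH. This is a minimal sketch, assuming TensorFlow 2.0 on Windows looks for `cudart64_100.dll` (the name that appears in the log later in this thread) and that the loader finds it through PATH. The helper name `find_on_path` is hypothetical.

```python
import os


def find_on_path(filename, path=None):
    """Return every directory in PATH (or in `path`) that contains `filename`."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        # Skip empty entries produced by stray separators.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # Assumption: cudart64_100.dll is the runtime TF 2.0 tries to load.
    dirs = find_on_path("cudart64_100.dll")
    print(dirs if dirs else "cudart64_100.dll is not on PATH")
```

If the list is empty, the CUDA 10.0 `bin` folder is probably missing from PATH. If it contains more than one directory, the first entry is the one that will usually be picked up.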

assulthoni commented 4 years ago

CUDA 10.1 is still not working?

FlashAJ commented 4 years ago

> CUDA 10.1 is still not working?

No, bro.

coffeeshop13 commented 4 years ago

> > CUDA 10.1 is still not working?
>
> No, bro.

Is it the same for TF 1.14?

marboe123 commented 4 years ago

I get the same error when running a deep-learning Keras R script with TensorFlow 2.0 on a GTX 1060 under Windows 10.

I don't know my CUDA version.

I used these steps to install TensorFlow-GPU, which ran correctly with the Jupyter example:

https://www.thehardwareguy.co.uk/install-tensorflow-gpu

After this, I installed RStudio in the environment, installed Keras, and ran the deep-learning Keras R script.

The resulting error is:


2019-11-12 21:41:52.691280: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_100.dll'; dlerror: cudart64_100.dll not found

2019-11-12 21:41:55.281028: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2019-11-12 21:41:55.305135: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7715
pciBusID: 0000:01:00.0
2019-11-12 21:41:55.305460: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-11-12 21:41:55.306272: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-12 21:41:55.308379: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7715
pciBusID: 0000:01:00.0
2019-11-12 21:41:55.308674: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-11-12 21:41:55.309506: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
 Error in py_call_impl(callable, dots$args, dots$keywords) : 
  InternalError: cudaGetDevice() failed. Status: cudaGetErrorString symbol not found. 
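
The first warning in the log above is the informative one: TensorFlow could not load `cudart64_100.dll`, the CUDA 10.0 runtime. As a small sketch of how to pull such lines out of a longer log, the snippet below matches the `dso_loader` warning format shown above. The function name `missing_libraries` is hypothetical, and the pattern is only known to fit this particular message format.

```python
import re

# Matches the dso_loader warning format shown in the log above.
_MISSING = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the library names the log reports as not loadable, in order."""
    return _MISSING.findall(log_text)


log = (
    "2019-11-12 21:41:52.691280: W tensorflow/stream_executor/platform/default/"
    "dso_loader.cc:55] Could not load dynamic library 'cudart64_100.dll'; "
    "dlerror: cudart64_100.dll not found\n"
    "2019-11-12 21:41:55.281028: I tensorflow/stream_executor/platform/default/"
    "dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll\n"
)
print(missing_libraries(log))  # ['cudart64_100.dll']
```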

marboe123 commented 4 years ago

If I am correct, my CUDA version is 10.2 (the driver reports 10020). Please see below:

__Python Information__
Python Compiler                               : MSC v.1915 64 bit (AMD64)
Python Implementation                         : CPython
Python Version                                : 3.6.9
Python Locale                                 : en_NL cp1252

__LLVM information__
LLVM version                                  : 8.0.0

__CUDA Information__
Found 1 CUDA devices
id 0    b'GeForce GTX 1060 6GB'                              [SUPPORTED]
                      compute capability: 6.1
                           pci device id: 0
                              pci bus id: 1
Summary:
        1/1 devices are supported
CUDA driver version                           : 10020
CUDA libraries:
Finding cublas from <unavailable>
        ERROR: can't locate lib
Finding cusparse from <unavailable>
        ERROR: can't locate lib
Finding cufft from <unavailable>
        ERROR: can't locate lib
Finding curand from <unavailable>
        ERROR: can't locate lib
Finding nvvm from <unavailable>
        ERROR: can't locate lib
Finding libdevice from <unavailable>
        searching for compute_20...     ERROR: can't open libdevice for compute_20
        searching for compute_30...     ERROR: can't open libdevice for compute_30
        searching for compute_35...     ERROR: can't open libdevice for compute_35
        searching for compute_50...     ERROR: can't open libdevice for compute_50
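
The `CUDA driver version : 10020` line above uses CUDA's integer encoding, 1000 * major + 10 * minor, so it means the driver supports up to CUDA 10.2. It does not say which toolkit is installed; the report above could not locate any CUDA libraries at all. A minimal sketch of the decoding, with the hypothetical helper name `decode_cuda_version`:

```python
def decode_cuda_version(raw):
    """Split CUDA's integer version (1000 * major + 10 * minor) into (major, minor)."""
    major, rest = divmod(raw, 1000)
    return major, rest // 10


print(decode_cuda_version(10020))  # (10, 2)
print(decode_cuda_version(10010))  # (10, 1)
```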

florianbaer commented 4 years ago

Wtf seriously? When is support for cuda 10.1 planned?

JasperGeurtz commented 4 years ago

Copying only cudart64_100.dll from a 10.0 installation into the 10.1 bin folder seems to work (as a workaround until 10.1 support is added).
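
For anyone who prefers to script the copy, here is a minimal sketch. It assumes both toolkits sit in the default NVIDIA installer location on Windows; `CUDA_ROOT` and the helper name `copy_runtime_dll` are hypothetical, so adjust the paths to your own machine.

```python
import shutil
from pathlib import Path

# Assumption: default NVIDIA installer location on Windows.
CUDA_ROOT = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA")


def copy_runtime_dll(src_bin, dst_bin, name="cudart64_100.dll"):
    """Copy one DLL from src_bin to dst_bin without overwriting an existing file."""
    src = Path(src_bin) / name
    dst = Path(dst_bin) / name
    if not src.is_file():
        raise FileNotFoundError(f"{src} does not exist")
    if dst.exists():
        return dst  # already present, leave it untouched
    shutil.copy2(src, dst)
    return dst


# Example call (writing under Program Files typically needs an elevated prompt):
# copy_runtime_dll(CUDA_ROOT / "v10.0" / "bin", CUDA_ROOT / "v10.1" / "bin")
```

The function refuses to overwrite an existing file so that a DLL already in the target folder is never replaced silently.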

Meerer commented 4 years ago

I have CUDA version 10.0.130 and I get the same error with a GTX 1070 on Windows 10.

eemberda commented 4 years ago

> copying only cudart64_100.dll from a 10.0 installation into the 10.1 bin folder seems to work (as a workaround until 10.1 support is added)

This worked for me too. Then for all other functions that are deprecated, I used tf.compat.v1 or tf.compat.v2

abhinavg86 commented 4 years ago

> I have cuda with version 10.0.130 and I get the same error with a Gtx 1070 on Windows 10.

Were you able to fix it? I am facing the same problem and have the same configuration as you.

xufangda commented 4 years ago

Same for me, please fix it

Taif85 commented 4 years ago

I have the same error! Did anyone fix it?

Jwer-chen commented 4 years ago

I get the same error too! Who can help me?

yptheangel commented 4 years ago

I got it working by using CUDA 10.0 instead of CUDA 10.1. Note that you can have two CUDA versions installed at the same time; just make sure your CUDA_PATH points to CUDA 10.0.
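
To see which installation CUDA_PATH currently points to, something like the following may help. It is only a sketch: it assumes the default install layout, where the last folder of the path carries the version (for example `...\CUDA\v10.0`), and the helper names `cuda_path_version` and `points_to` are hypothetical.

```python
import os
from pathlib import PureWindowsPath


def cuda_path_version(cuda_path):
    """Return the trailing folder of a CUDA_PATH value, e.g. 'v10.0'."""
    return PureWindowsPath(cuda_path).name


def points_to(cuda_path, wanted="v10.0"):
    """True if the trailing folder of cuda_path matches the wanted version."""
    return cuda_path_version(cuda_path).lower() == wanted


if __name__ == "__main__":
    value = os.environ.get("CUDA_PATH")
    if value is None:
        print("CUDA_PATH is not set")
    else:
        print(value, "->", "OK" if points_to(value) else "not CUDA 10.0")
```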

You can download the archived CUDA 10.0 from here

You can verify whether your GPU is available for training with this code snippet:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

reference here

ghost commented 4 years ago

I got the same problem while running my Keras code in a Jupyter Notebook. Uninstalling all Keras- and TensorFlow-related packages, reinstalling them in a virtual environment, and creating a new Jupyter kernel that runs in that virtual environment solved the issue for me.

yanz20 commented 4 years ago

I have the same problem using tensorflow-gpu==2.0 and CUDA 10.0. The graphics card is an RTX 2070:

print(device_lib.list_local_devices())
2020-02-14 00:35:15.639841: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2070 major: 7 minor: 5 memoryClockRate(GHz): 1.62
pciBusID: 0000:01:00.0
2020-02-14 00:35:15.643490: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-02-14 00:35:15.646311: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
Traceback (most recent call last):
  File "", line 1, in
  File "C:\Users\fredd\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\tensorflow_core\python\client\device_lib.py", line 41, in list_local_devices
    for s in pywrap_tensorflow.list_devices(session_config=session_config)
  File "C:\Users\fredd\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\tensorflow_core\python\pywrap_tensorflow_internal.py", line 2249, in list_devices
    return ListDevices()
tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: cudaGetErrorString symbol not found.

Can anybody help?