NVIDIA / tensorflow

An Open Source Machine Learning Framework for Everyone
https://developer.nvidia.com/deep-learning-frameworks
Apache License 2.0
962 stars 144 forks

cuda and tensorflow 1.15 are not compatible #89

Open cencenxy opened 1 year ago

cencenxy commented 1 year ago

System information

Describe the problem

cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

I installed tensorflow 1.15, and the reported version is 1.15, but I get this error as soon as I run my code.

Provide the exact sequence of commands / steps that you executed before running into the problem

python xxx.py. The error occurs when I run this command.

Any other info / logs

2023-06-19 12:37:39.983469: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
Traceback (most recent call last):
  File "CsiNet_train.py", line 60, in <module>
    network_output = residual_network(image_tensor, residual_num, encoded_dim)
  File "CsiNet_train.py", line 43, in residual_network
    x = Conv2D(2, (3, 3), padding='same', data_format="channels_first")(x)
  File "/home/cencen/shared/.conda/envs/cc-tf1.15_py3.8/lib/python3.8/site-packages/keras/engine/topology.py", line 603, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/cencen/shared/.conda/envs/cc-tf1.15_py3.8/lib/python3.8/site-packages/keras/layers/convolutional.py", line 158, in call
    outputs = K.conv2d(
  File "/home/cencen/shared/.conda/envs/cc-tf1.15_py3.8/lib/python3.8/site-packages/keras/backend/tensorflow_backend.py", line 3180, in conv2d
    x, tf_data_format = _preprocess_conv2d_input(x, data_format)
  File "/home/cencen/shared/.conda/envs/cc-tf1.15_py3.8/lib/python3.8/site-packages/keras/backend/tensorflow_backend.py", line 3062, in _preprocess_conv2d_input
    if not _has_nchw_support():
  File "/home/cencen/shared/.conda/envs/cc-tf1.15_py3.8/lib/python3.8/site-packages/keras/backend/tensorflow_backend.py", line 270, in _has_nchw_support
    gpus_available = len(_get_available_gpus()) > 0
  File "/home/cencen/shared/.conda/envs/cc-tf1.15_py3.8/lib/python3.8/site-packages/keras/backend/tensorflow_backend.py", line 256, in _get_available_gpus
    _LOCAL_DEVICES = get_session().list_devices()
  File "/home/cencen/shared/.conda/envs/cc-tf1.15_py3.8/lib/python3.8/site-packages/keras/backend/tensorflow_backend.py", line 168, in get_session
    _SESSION = tf.Session(config=config)
  File "/home/cencen/.local/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1585, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/cencen/.local/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 699, in __init__
    self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

cencenxy commented 1 year ago

GPU model is 3070

benbarsdell commented 1 year ago

You probably need to upgrade your CUDA driver. It says it loaded libcudart.so.12 but you note that you have CUDA 11.4 installed.
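
To see the mismatch described above for yourself, something like the following sketch may help. It asks the driver library and the runtime library for their CUDA versions through ctypes. The library names `libcuda.so.1` and `libcudart.so.12` are assumptions taken from a typical Linux setup and from the log in this issue, and the helper names are made up for illustration. The major-version comparison is only a rough rule of thumb, not NVIDIA's full compatibility matrix.

```python
import ctypes


def decode_cuda_version(encoded):
    """Split CUDA's integer version (1000*major + 10*minor) into (major, minor)."""
    return encoded // 1000, (encoded % 1000) // 10


def query_version(library_name, function_name):
    """Call a CUDA version query through ctypes; return None if it is unavailable."""
    try:
        library = ctypes.CDLL(library_name)
        function = getattr(library, function_name)
    except (OSError, AttributeError):
        return None
    value = ctypes.c_int(0)
    status = function(ctypes.byref(value))
    # Both queries return 0 on success and write the encoded version into value.
    return value.value if status == 0 else None


def driver_too_old(driver_encoded, runtime_encoded):
    """Rough check (an assumption): driver CUDA major version below the runtime's."""
    driver_major = decode_cuda_version(driver_encoded)[0]
    runtime_major = decode_cuda_version(runtime_encoded)[0]
    return driver_major < runtime_major


if __name__ == "__main__":
    # Library names are assumptions; adjust them to what your system provides.
    driver = query_version("libcuda.so.1", "cuDriverGetVersion")
    runtime = query_version("libcudart.so.12", "cudaRuntimeGetVersion")
    if driver is None or runtime is None:
        print("Could not query driver and/or runtime version.")
    else:
        print("driver supports CUDA %d.%d" % decode_cuda_version(driver))
        print("runtime is CUDA %d.%d" % decode_cuda_version(runtime))
        print("driver too old for runtime:", driver_too_old(driver, runtime))
```

With a driver that supports CUDA 11.4 and a 12.x runtime, `driver_too_old` would return True, which matches the situation in this issue.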

cencenxy commented 1 year ago

If I don't have admin rights to the server, can I still change the cuda version?

nluehr commented 1 year ago

Yes, you will need an administrator to update the system's CUDA driver to the latest version (currently 535.54.03). However, as long as the admin maintains the latest driver version, users can install whatever CUDA toolkit/library versions they want in user-space (such as in your virtualenv). So there shouldn't be conflicts between users wanting different drivers.

cencenxy commented 1 year ago

Thank you.

ymabj commented 1 year ago

Which version of cudnn should I install? I am facing the same problem. I am using CUDA 11.6 but the code loaded libcudart.so.12.

Thank you.

nluehr commented 1 year ago

You may need to update LD_LIBRARY_PATH to include your cuda 11.6 installation ahead of the cuda 12 toolkit path.
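
A minimal sketch of that suggestion, assuming the CUDA 11.6 libraries live in `/usr/local/cuda-11.6/lib64` (a hypothetical path, check your own installation). The dynamic loader reads LD_LIBRARY_PATH when a process starts, so the sketch builds a modified environment and launches the training script in a new process instead of changing `os.environ` in the running one. The helper names are made up for illustration.

```python
import glob
import os
import subprocess
import sys


def prepend_library_path(env, directory):
    """Return a copy of env with directory placed first in LD_LIBRARY_PATH."""
    updated = dict(env)
    existing = [
        path
        for path in updated.get("LD_LIBRARY_PATH", "").split(os.pathsep)
        if path and path != directory
    ]
    updated["LD_LIBRARY_PATH"] = os.pathsep.join([directory] + existing)
    return updated


def cudart_in_search_order(ld_library_path):
    """List libcudart files in the order their directories appear in the path."""
    found = []
    for directory in ld_library_path.split(os.pathsep):
        if directory:
            pattern = os.path.join(directory, "libcudart.so*")
            found.extend(sorted(glob.glob(pattern)))
    return found


def run_with_env(script, env):
    """Start the script in a fresh interpreter so the loader sees the new path."""
    return subprocess.run([sys.executable, script], env=env, check=True)


if __name__ == "__main__":
    cuda_11_lib = "/usr/local/cuda-11.6/lib64"  # assumed location, not from this thread
    env = prepend_library_path(os.environ, cuda_11_lib)
    # Diagnostic only: shows which libcudart files sit on the search path, in order.
    print(cudart_in_search_order(env["LD_LIBRARY_PATH"]))
```

Calling `run_with_env("xxx.py", env)` would then start the script with the CUDA 11.6 directory searched first. This only lists what is on the path; which libcudart version TensorFlow asks for still depends on how the wheel was built.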

hellohawaii commented 1 year ago

Can I install tensorflow 1.15 with CUDA 11 from pip, or do I have to upgrade the driver?

nluehr commented 1 year ago

The 22.12 wheel was built against CUDA 11.8 per the release notes.

gzt4se commented 1 year ago

Maybe you can install nvidia_tensorflow-1.15.4. It works for me with Python 3.8 and CUDA 11.4 on Linux. More can be found here
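
After installing a different wheel, a quick check along these lines may confirm whether TensorFlow can actually reach the GPU. This is a sketch, not an official diagnostic: the function name is made up, and it assumes `tf.test.is_built_with_cuda()` and `tf.test.is_gpu_available()` are present, as they are in TensorFlow 1.15.

```python
def tensorflow_gpu_report():
    """Return a small dict describing TensorFlow's GPU status, or None if TF is missing."""
    try:
        import tensorflow as tf
    except Exception:
        # TensorFlow is not installed or fails to import in this environment.
        return None
    report = {
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
    }
    try:
        # A driver/runtime mismatch can surface here as an error or as False.
        report["gpu_available"] = bool(tf.test.is_gpu_available())
    except Exception as error:
        report["gpu_available"] = False
        report["error"] = str(error)
    return report


if __name__ == "__main__":
    print(tensorflow_gpu_report())
```

If `gpu_available` comes back False while `built_with_cuda` is True, the driver or library path is the more likely culprit than the wheel itself.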