Closed gian1312 closed 5 years ago
The official tensorflow-gpu binaries (the ones downloaded by pip or conda) have been built with CUDA 9.0 and cuDNN 7 since TF 1.5, and with CUDA 10.0 and cuDNN 7 since TF 1.13. This is written in the release notes. You have to use the matching version of CUDA if you use the official binaries.
@gian1312 I think it is looking for a CUDA 10 file. The error is due to a mismatch in CUDA version. The best approach is to install TF from a clean state. Please follow @ppwwyyxx's suggestion to select the best versions (TF 1.12 with CUDA 9.0, or TF 1.13 with CUDA 10.0) for your needs. Please uninstall Python and TensorFlow and then follow the instructions to install TF fresh. Please let me know how it progresses. Thanks!
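For anyone who wants to script the version check, here is a minimal Python sketch of the pairing described above. The version pairs are taken only from this comment (CUDA 9.0 since TF 1.5, CUDA 10.0 since TF 1.13); the helper name `expected_cuda_for_official_binary` is hypothetical, and anything outside the TF 1.x range discussed here is deliberately left unanswered.

```python
# Sketch only: the pairs below come from the comment above, not from an
# authoritative table. The function name is made up for illustration.

def expected_cuda_for_official_binary(tf_version):
    """Return the CUDA version the official TF 1.x binary was built against."""
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    if major == 1 and minor >= 13:
        return "10.0"
    if major == 1 and minor >= 5:
        return "9.0"
    return None  # not covered by the information in this thread


if __name__ == "__main__":
    for version in ("1.12.0", "1.13.1"):
        print(version, "->", expected_cuda_for_official_binary(version))
```

If the CUDA version returned here differs from the one installed on the machine, the import error discussed in this thread is the likely result.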
identical problem here.
clean installation of Nvidia drivers, CUDA 10.1 and TF
libcublas.so.10.0 error as soon as TF is called.
Ubuntu 18.04.2 LTS; Also Anaconda install of Python 3.7 (is the anaconda install relevant?); 2070
@rhinsall Which TF version are you trying to install? Could you install CUDA 10, or correctly reference the CUDA 10.1 path in cuDNN? Thanks
It does not seem possible to install Tensorflow with default packaging on Ubuntu 18.04. You have to either build TF from scratch, which requires sourcing an older version of bazel than is available through the default repositories, or manually install specific versions of nvidia drivers and libraries.
None of the linked wheels from upthread are yet built against CUDA 10.1.
Thanks a lot. I relied on the website and hadn't realised that a new version came out a few days ago. I am sorry. I downgraded to 1.12. Now my graphics card is found with the mentioned code.
Sadly, the code (an example from a lecture I attend), which runs perfectly fine on my Windows installation (30 s), takes 6 min on my Linux installation and puts the CPU under load. Is there a workaround to force TensorFlow to use the GPU?
I'll come home much later and report the exact numbers and paths - but it's a fresh install, downloaded yesterday, CUDA 10.1 per Nvidia's instructions and TF clean install using PIP & Python 3.7
@rhinsall I just found this out myself, not sure if it's common knowledge, but got around this by doing
conda install cudatoolkit
conda install cudnn
I have cuda-10.1 installed on my box, this installed a local conda-only cuda-10.0. Obviously this is to just keep tensorflow working while waiting for better support.
Excellent advice. Immediate rescue. Thank you very much fabricatedmath.
@gian1312 That is strange. There is a guide on using a GPU here. Using those instructions you can force TF to use a GPU. Sometimes it is better to uninstall and reinstall TF. Please let me know how it progresses. If the issue was resolved, please close the ticket. Thanks!
hi,
I am having a similar problem, so I created a new conda environment and installed tensorflow-gpu as follows:
`
conda install tensorflow-gpu
Collecting package metadata: done
Solving environment: done
environment location: /home/lasii/anaconda3/envs/drunk2
added / updated specs:
The following packages will be downloaded:
package | build
---------------------------|-----------------
_tflow_select-2.1.0 | gpu 2 KB defaults
absl-py-0.4.1 | py35_0 144 KB defaults
astor-0.7.1 | py35_0 43 KB defaults
cupti-9.2.148 | 0 1.7 MB defaults
gast-0.2.0 | py35_0 15 KB defaults
grpcio-1.12.1 | py35hdbcaa40_0 1.7 MB defaults
libprotobuf-3.6.0 | hdbcaa40_0 4.1 MB defaults
markdown-2.6.11 | py35_0 104 KB defaults
mkl_fft-1.0.6 | py35h7dd41cf_0 149 KB defaults
mkl_random-1.0.1 | py35h4414c95_1 362 KB defaults
numpy-1.15.2 | py35h1d66e8a_0 47 KB defaults
numpy-base-1.15.2 | py35h81de0dd_0 4.2 MB defaults
protobuf-3.6.0 | py35hf484d3e_0 615 KB defaults
six-1.11.0 | py35_1 21 KB defaults
tensorboard-1.10.0 | py35hf484d3e_0 3.3 MB defaults
tensorflow-1.10.0 |gpu_py35hd9c640d_0 3 KB defaults
tensorflow-base-1.10.0 |gpu_py35had579c0_0 190.6 MB defaults
tensorflow-gpu-1.10.0 | hf154084_0 2 KB defaults
termcolor-1.1.0 | py35_1 7 KB defaults
------------------------------------------------------------
Total: 207.1 MB
The following NEW packages will be INSTALLED:
_tflow_select pkgs/main/linux-64::_tflow_select-2.1.0-gpu
absl-py pkgs/main/linux-64::absl-py-0.4.1-py35_0
astor pkgs/main/linux-64::astor-0.7.1-py35_0
blas pkgs/main/linux-64::blas-1.0-mkl
cudatoolkit pkgs/main/linux-64::cudatoolkit-9.2-0
cudnn pkgs/main/linux-64::cudnn-7.3.1-cuda9.2_0
cupti pkgs/main/linux-64::cupti-9.2.148-0
gast pkgs/main/linux-64::gast-0.2.0-py35_0
grpcio pkgs/main/linux-64::grpcio-1.12.1-py35hdbcaa40_0
intel-openmp pkgs/main/linux-64::intel-openmp-2019.1-144
libgfortran-ng pkgs/main/linux-64::libgfortran-ng-7.3.0-hdf63c60_0
libprotobuf pkgs/main/linux-64::libprotobuf-3.6.0-hdbcaa40_0
markdown pkgs/main/linux-64::markdown-2.6.11-py35_0
mkl pkgs/main/linux-64::mkl-2018.0.3-1
mkl_fft pkgs/main/linux-64::mkl_fft-1.0.6-py35h7dd41cf_0
mkl_random pkgs/main/linux-64::mkl_random-1.0.1-py35h4414c95_1
numpy pkgs/main/linux-64::numpy-1.15.2-py35h1d66e8a_0
numpy-base pkgs/main/linux-64::numpy-base-1.15.2-py35h81de0dd_0
protobuf pkgs/main/linux-64::protobuf-3.6.0-py35hf484d3e_0
six pkgs/main/linux-64::six-1.11.0-py35_1
tensorboard pkgs/main/linux-64::tensorboard-1.10.0-py35hf484d3e_0
tensorflow pkgs/main/linux-64::tensorflow-1.10.0-gpu_py35hd9c640d_0
tensorflow-base pkgs/main/linux-64::tensorflow-base-1.10.0-gpu_py35had579c0_0
tensorflow-gpu pkgs/main/linux-64::tensorflow-gpu-1.10.0-hf154084_0
termcolor pkgs/main/linux-64::termcolor-1.1.0-py35_1
werkzeug pkgs/main/linux-64::werkzeug-0.14.1-py35_0
`
After installation, I just imported tensorflow and got the error.
`Traceback (most recent call last):
File "/home/lasii/.local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "
Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors `
I just started using github. Guide me if I am posting improperly.
@ivineetm007, can you check the CUDA version?
@codexponent It's 9.2. Conda installed it automatically while installing tensorflow-gpu.
I think you should update your CUDA version to 10 as well. This link will help: https://www.nvidia.com/Download/index.aspx?lang=en-us
@codexponent
I installed cuda 10.0 in conda by
conda install -c fragcolor cuda10.0
Now there are two CUDA packages in the conda environment's package list: cudatoolkit 9.2 and cuda 10.0.
But the same error occurs when importing tensorflow.
@ivineetm007 , Can you do nvidia-smi and check the head of the table! I am sure that you need to update cuda by downloading the nvidia driver from their website.
@codexponent header NVIDIA-SMI 396.54 Driver Version: 396.54
I am working on a PC in college which is allotted to two or three students. I am not sure whether installing CUDA from a download would affect the other conda environments.
A little history...
I am using code in the link
(https://github.com/DevendraPratapYadav/gsoc18_RedHenLab/tree/master/video_processing_pipeline)
In this link, the setup is done with conda. Two weeks ago, tensorflow was working perfectly when running the above code.
But someone updated conda on the PC. Now I am getting the libcublas.so.10.0 error.
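A side note on the nvidia-smi header quoted above: the driver version limits which CUDA runtime can be used at all. The sketch below parses the header line and compares it against minimum Linux driver versions. Those minimums are recalled from NVIDIA's CUDA release notes and should be treated as assumptions to verify, and the helper names are hypothetical.

```python
import re

# Minimum Linux driver version per CUDA release. ASSUMPTION: recalled from
# NVIDIA's CUDA release notes; verify against the official table before relying on it.
MIN_DRIVER = {"9.0": "384.81", "9.2": "396.26", "10.0": "410.48", "10.1": "418.39"}


def parse_driver_version(header):
    """Extract the driver version from an nvidia-smi header line."""
    match = re.search(r"Driver Version:\s*([\d.]+)", header)
    return match.group(1) if match else None


def as_tuple(version):
    """Turn '396.54' into (396, 54) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def driver_supports(driver_version, cuda_version):
    """True if the driver is at least the assumed minimum for that CUDA release."""
    return as_tuple(driver_version) >= as_tuple(MIN_DRIVER[cuda_version])


if __name__ == "__main__":
    header = "NVIDIA-SMI 396.54 Driver Version: 396.54"
    driver = parse_driver_version(header)
    for cuda in ("9.2", "10.0"):
        print("driver", driver, "supports CUDA", cuda, ":", driver_supports(driver, cuda))
```

Under these assumed minimums, a 396.54 driver is enough for CUDA 9.2 but not for CUDA 10.0, which would explain why adding a cuda10.0 package to the environment did not help on that machine.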
@ivineetm007, if this is not your PC, I suggest you don't update it, as that might break other environments that work with CUDA 9. Do one thing: create a new environment, install tensorflow with a specific version number (pip install tensorflow==1.10.0), and then test a very simple piece of code, like the addition of 2 numbers (tf.add). See if this runs or not.
@codexponent I tried your suggestion. It worked fine. Then I tried to install tf-gpu and keras as follows:
conda install -y -c anaconda tensorflow-gpu==1.7.0
conda install -y keras
Now I am getting the error:
AttributeError: module 'tensorflow.python.training.checkpointable' has no attribute 'CheckpointableBase'
I followed the solution for this error in the link (https://github.com/tensorflow/tensorflow/issues/20499l), which suggested reinstalling. I think some other version of tensorflow-gpu will work.
@ivineetm007 , try to do the same thing with opening tf session on the gpu. This link may help Link: https://www.tensorflow.org/guide/using_gpu
Another solution: Don't install anything from conda, just install from pip
Steps:
1) Create a fresh environment
2) pip install tensorflow==1.12.0
3) pip install tensorflow-gpu==1.12.0
4) pip install keras==2.1.3
If there is anything you want to install from conda, check whether it is available from pip. If it is not, then:
Let's say that your env name is my_env_1. After activating that environment, type which conda. If this gives the path to your created environment (...\my_env_1...), then you can install other essential environments. If this gives (..\...), then type pip install conda, and then install other essential environments (be sure to check again by typing which conda).
Same problem. My CUDA version is 10.1, but the libcublas.so.10.0 file is not in the lib64 directory. I am installing tensorflow-gpu with the command 'pip install tensorflow-gpu'.
It seems that the libcublas-version is removed by the cuda 10
@lipingbj, did you update the CUDA version with a conda command or through the official NVIDIA site? I think installing from the actual site might help to get those .so files. Link: https://www.nvidia.com/Download/index.aspx?lang=en-us
@lipingbj I had a similar issue when pushing an upgrade to TensorFlow code that would call 200 SageMakers in parallel. I solved it by pinning numpy to numpy==1.14.5 and tensorflow-gpu to 1.12.0. If you would like, I can paste the Dockerfile I created to ensure it works.
After installing CUDA 10 I found libcublas.so.10 under /usr/lib/x86_64-linux-gnu/.
So you need to add /usr/lib/x86_64-linux-gnu/ to your library path by calling:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu/
And also, since TensorFlow is looking for libcublas.so.10.0 rather than libcublas.so.10 (without the last .0), you need to create a symlink:
ln -s /usr/lib/x86_64-linux-gnu/libcublas.so.10 /usr/lib/x86_64-linux-gnu/libcublas.so.10.0
Please look at the instructions here after installing CUDA 10: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#environment-setup
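To confirm whether the library path and symlink steps above took effect, without importing TensorFlow, one option is to ask the dynamic loader directly. This is a minimal sketch using Python's standard `ctypes` module; the helper name `can_load` is hypothetical.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # The first name is the one from the ImportError in this thread.
    for name in ("libcublas.so.10.0", "libcublas.so.10"):
        print(name, "loadable:", can_load(name))
```

Note that LD_LIBRARY_PATH is read when the process starts, so export it in the shell before launching Python rather than setting it from inside the script.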
hi all,
I am facing the same issue, but my problem is a little different. I am able to install and import tensorflow-gpu on my local machine as well as when building the docker container, and everything works fine. But when I build my docker image from the Dockerfile and docker-compose-up...build, I get this error. Please help me out; I really don't know why this happens when building the docker image.
After installing CUDA, you need to export $PATH and $LD_LIBRARY_PATH. TensorFlow will use these environment variables to load the packages. For example, if you installed CUDA at /usr/local/, you can add this to your .zshrc or .bashrc (depending on the shell you are using):
CUDA_VERSION=10.0
export PATH=/usr/local/cuda-$CUDA_VERSION/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-$CUDA_VERSION/lib64:$LD_LIBRARY_PATH
This trick can be used to change the version of cuda you want to use.
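After editing the shell configuration, it can be useful to check which entries of the search path actually contain the library. The following is a small sketch; `dirs_providing` is a hypothetical helper, and it only inspects the variable, not the loader's cache or default directories.

```python
import os


def dirs_providing(filename, search_path):
    """List the directories in a colon-separated search path that contain filename."""
    hits = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries produced by leading, trailing, or doubled separators.
        if directory and os.path.exists(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    ld_path = os.environ.get("LD_LIBRARY_PATH", "")
    print("LD_LIBRARY_PATH entries with libcublas.so.10.0:",
          dirs_providing("libcublas.so.10.0", ld_path))
```

An empty list means none of the exported directories holds the file, which points to a wrong CUDA_VERSION value or a different install location.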
@mostafaelhoushi I made the symlink but this does not do the trick:
ln -s /usr/lib/x86_64-linux-gnu/libcublas.so.10 /usr/lib/x86_64-linux-gnu/libcublas.so.10
root@b55736f184ff:/notebooks# python3.6 -c "import tensorflow as tf; print(tf.__version__);"
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/usr/lib/python3.6/imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "/usr/lib/python3.6/imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py", line 24, in <module>
from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/usr/lib/python3.6/imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "/usr/lib/python3.6/imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
@dattran2346 This should work with CUDA 10 already installed, but if you start from older Docker images, you may have an older version installed:
root@b55736f184ff:/notebooks# echo $CUDA_VERSION
9.0.176
Did you make sure you installed CUDA10.0? Or which version is installed?
@dattran2346 @mostafaelhoushi Do I need to install CUDA and cuDNN during the build of the Docker image as well, like this:
conda install -c fragcolor cuda10.0
@dattran2346 I have exported the paths as you suggested, but it is still not working...
Nvidia-docker only gives us access to the driver; the cudatoolkit and cudnn stuff we have to install ourselves. cudatoolkit and cudnn are just dynamic libraries, and their location depends on where we installed them.
If we install tensorflow-gpu with pip (inside conda or not), tensorflow-gpu will look at $PATH and $LD_LIBRARY_PATH to find cudatoolkit and cudnn at runtime. In this approach, we need to install cudatoolkit and cudnn beforehand and export $PATH. We also need to ensure that the versions of cudatoolkit, cudnn and tensorflow are compatible. Check version compatibility here.
If we install tensorflow-gpu with conda, conda will also install the appropriate versions of cudatoolkit and cudnn. In this approach, we do not need to install cudatoolkit and cudnn beforehand.
@priyakansal, you may need to conda uninstall cuda10.0 and run conda install tensorflow-gpu.
@loretoparisi, maybe try a lower version of tensorflow, or use conda to install, or even upgrade your CUDA version 🤔
Ps 1: For installing cudatoolkit and cudnn, I found this guide very useful.
Ps 2: Installing cudatoolkit and cudnn from the runtime file puts the libraries in /usr/local/, while installing from the .deb file puts them in /usr/lib/x86_64-linux-gnu/, so your $PATH and $LD_LIBRARY_PATH need to change accordingly. Installing cudatoolkit and cudnn with conda puts the libraries in ~/miniconda3/envs/<name>/lib, and you do not need to export anything.
Ps 3: What if I have installed cudatoolkit and cudnn myself and also installed tensorflow-gpu using conda? tensorflow-gpu will use the libraries installed within the conda environment.
Hope this helps. Correct me if I'm wrong 😄 Cheers
@dattran2346
Thank you so much for such a detailed explanation.
Even when I run conda install tensorflow-gpu, it does not work; however, I have not tried it with conda uninstall cuda10.0. The problem here is that I also want to install tensorflow-serving-api-gpu, which is not available through conda install, so I need to install it using pip, but when installing this I get the same error.
Please note that I am doing all this inside Docker. On my local machine (Ubuntu), everything works fine.
What I did was this https://gist.github.com/loretoparisi/4a096fc3625f60403c8734de9660cbcc
add-apt-repository ppa:jonathonf/python-3.6
apt-get update && apt-get install -y python3.6
curl https://bootstrap.pypa.io/get-pip.py > get-pip.py
python3.6 get-pip.py
pip3 uninstall tensorflow-gpu
pip3.6 install tensorflow-gpu==1.12.0
python3.6 -c "import tensorflow as tf; print(tf.__version__);"
Basically you will get Python 3.6, CUDA 9 and TF 1.12.0. We have to remove TF-GPU 1.13.0 and then install TF-GPU 1.12.0.
@loretoparisi
Hi ,
Sorry I am bit new to docker ... so when I am building some image ... either
Can you try to search for the missing file libcublas.so.10.0 on your file system, e.g. by using
find / -name "libcublas.so.10.0"
and then, when you find the path, add it to the LD_LIBRARY_PATH environment variable.
If you can't find it, then you probably need to install the correct version.
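The same search can be done from Python, which is handy inside a container where the environment is already scripted. This is a rough equivalent of the find command above, not a replacement for it; `find_file` and its `skip` parameter are hypothetical names chosen for this sketch.

```python
import os


def find_file(filename, root="/", skip=("/proc", "/sys")):
    """Yield every path under root whose basename equals filename.

    Directories listed in skip are not descended into.
    """
    for directory, subdirs, files in os.walk(root):
        # Prune in place so os.walk does not enter the skipped directories.
        subdirs[:] = [d for d in subdirs if os.path.join(directory, d) not in skip]
        if filename in files:
            yield os.path.join(directory, filename)


if __name__ == "__main__":
    for path in find_file("libcublas.so.10.0"):
        print(path)
```

Adding a directory such as /var/lib/docker to `skip` would keep image layers out of the results, leaving only locations the running system can load from.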
@mostafaelhoushi
When I run the command find / -name "libcublas.so.10.0", the output is:
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server.runfiles/tf_serving/external/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server.runfiles/tf_serving/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccublas___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server.runfiles/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccublas___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/genfiles/external/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/host/genfiles/external/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/97cb0c942535cde4622f53bf094251cd1aef1cfc744e8ddda1472ee691f87618/diff/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0
/var/lib/docker/overlay2/2fb234250d278545f55a004fcd436b4cba5e847c40503b990ffe800f3b440cb5/diff/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0
/var/lib/docker/overlay2/c704b6be3bc1a5d25119fa46216a4e64f872d8001d8bed6d40930f6420ffb091/diff/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0
/usr/local/cuda-10.0/lib64/libcublas.so.10.0
OK. I see libcublas.so.10.0 is found in /usr/local/cuda-10.0/lib64/.
Try running this command:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.0/lib64/
and try again.
NOTE: I see the library is also found in your Docker system. I am not familiar with Docker, so maybe someone else could help here. But try the above command and see.
It happened to me when I installed cuda-10.1 not cuda-10.0 , downgrading to 10.0 did fix it
@littlehome-eugene But I am using cuda-10.0 only. By the way, have you done it for Docker?
Same problem. My CUDA version is 10.1, but the libcublas.so.10.0 file is not in the lib64 directory. I am installing tensorflow-gpu with the command 'pip install tensorflow-gpu'.
It seems that the libcublas-version is removed by the cuda 10
After installing CUDA 10 I have found libcublas.so.10 under /usr/lib/x86_64-linux-gnu/. So you need to add /usr/lib/x86_64-linux-gnu/ to your library path by calling:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu/
And also since TensorFlow is looking for libcublas.so.10.0 rather than libcublas.so.10 (without the last .0) you need to create a symlink:
ln -s /usr/lib/x86_64-linux-gnu/libcublas.so.10 /usr/lib/x86_64-linux-gnu/libcublas.so.10
There is a typo in the last command, it should be:
ln -s /usr/lib/x86_64-linux-gnu/libcublas.so.10 /usr/lib/x86_64-linux-gnu/libcublas.so.10.0
Also consider issuing that command with root privileges (sudo) or you will get a permission denied error...
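For those scripting the workaround, here is a hedged Python sketch of the corrected symlink step. It refuses to overwrite an existing file, which also makes the typo case (link name equal to the target) a harmless no-op. The helper name `ensure_symlink` is hypothetical, and whether the linked library is binary-compatible with what TensorFlow expects is not established in this thread.

```python
import os


def ensure_symlink(target, link_path):
    """Create link_path -> target unless link_path already exists.

    Returns True if a link was created, False if nothing was done.
    Raises FileNotFoundError if the target itself is missing.
    """
    if not os.path.exists(target):
        raise FileNotFoundError(target)
    if os.path.lexists(link_path):
        return False  # never overwrite an existing file or link
    os.symlink(target, link_path)
    return True


if __name__ == "__main__":
    lib_dir = "/usr/lib/x86_64-linux-gnu"
    target = os.path.join(lib_dir, "libcublas.so.10")
    link = os.path.join(lib_dir, "libcublas.so.10.0")
    try:
        print("link created:", ensure_symlink(target, link))
    except (FileNotFoundError, PermissionError) as err:
        # PermissionError is expected when not running as root.
        print("could not create link:", err)
```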
Thanks @plche ! I fixed it
just remove everything about 10.1 and downgrade it to Cuda 10.0 and it will work. Nothing else worked for me.
@mostafaelhoushi When I run this command:
find / -name "libcublas.so.10.0"
the output is:
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server.runfiles/tf_serving/external/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server.runfiles/tf_serving/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccublas___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server.runfiles/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/bin/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccublas___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/k8-opt/genfiles/external/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/33ff618e94595ffbdc09016439dc6a469fa8adc3ec3b5231f776d6065aab7968/diff/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving/bazel-out/host/genfiles/external/local_config_cuda/cuda/cuda/lib/libcublas.so.10.0
/var/lib/docker/overlay2/97cb0c942535cde4622f53bf094251cd1aef1cfc744e8ddda1472ee691f87618/diff/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0
/var/lib/docker/overlay2/2fb234250d278545f55a004fcd436b4cba5e847c40503b990ffe800f3b440cb5/diff/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0
/var/lib/docker/overlay2/c704b6be3bc1a5d25119fa46216a4e64f872d8001d8bed6d40930f6420ffb091/diff/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0
/usr/local/cuda-10.0/lib64/libcublas.so.10.0
OK. I see libcublas.so.10.0 is found in /usr/local/cuda-10.0/lib64/. Try running this command:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.0/lib64/
and try again.
NOTE: I see the library is also found in your Docker system. I am not familiar with Docker, so maybe someone else could help here. But try the above command and see.
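Before changing LD_LIBRARY_PATH, it can help to check which directories on it actually contain the file. A minimal Python sketch, assuming a colon-separated path string as used on Linux; the helper name dirs_containing is hypothetical and not part of any tool mentioned in this thread:

```python
import os


def dirs_containing(filename, search_path):
    """Return the directories in a colon-separated path that hold `filename`."""
    hits = []
    for directory in search_path.split(":"):
        # Empty entries (for example from a leading ':') are skipped for simplicity.
        if directory and os.path.exists(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print(dirs_containing("libcublas.so.10.0", current))
```

An empty list suggests that no directory currently on LD_LIBRARY_PATH holds the file, which is the case the export command above is meant to fix.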
@mostafaelhoushi has given the best solution. Anyone who is confused, see this answer. :)
Unfortunately my underlying question is a bit unrelated to this thread - I have the wrong version installed. However I'm hoping there's someone more knowledgeable here that can answer my actual query below.
I'm running Arch Linux; I installed TensorFlow 2:
pip install tensorflow-gpu==2.0.0-alpha0
I had previously been running an older version of the cuda and cudnn packages in order to work with tensorflow 1. I removed these and installed the latest in the AUR:
[stiege@archie ~]$ sudo pacman -S cuda cudnn
[sudo] password for stiege:
warning: cuda-10.1.105-6 is up to date -- reinstalling
warning: cudnn-7.5.0.56-1 is up to date -- reinstalling
resolving dependencies...
looking for conflicting packages...
Packages (2) cuda-10.1.105-6 cudnn-7.5.0.56-1
Total Installed Size: 4390.26 MiB
Net Upgrade Size: 0.00 MiB
:: Proceed with installation? [Y/n] Y
Note the cuda version is actually 10.1; however, I get the same error as others in the thread:
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory
But -
[stiege@archie ~]$ ldconfig -p 2>/dev/null | grep libcublas.so
libcublas.so.10 (libc6,x86-64) => /opt/cuda/lib64/libcublas.so.10
libcublas.so (libc6,x86-64) => /opt/cuda/lib64/libcublas.so
I can find nothing about why only these two libcublas.so* links are created. Why is it just for the major version and not the minor and patch versions? Is this a convention or standard? Are there links or docs? I also still can't find these in the "standard place", which I assumed is what ldconfig was doing:
[stiege@archie ~]$ find /usr/lib/ -name libcublas.so*
[stiege@archie ~]$ find /lib/ -name libcublas.so*
[stiege@archie ~]$
And this is what makes me concerned about the issue of the actual thread: it appears that even libcublas.so.10.1 isn't available:
In [38]: l = ctypes.cdll.LoadLibrary("libcublas.so")
In [39]: l = ctypes.cdll.LoadLibrary("libcublas.so.10")
In [40]: l = ctypes.cdll.LoadLibrary("libcublas.so.10.1")
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-40-9eb0347ef2f9> in <module>
----> 1 l = ctypes.cdll.LoadLibrary("libcublas.so.10.1")
[stiege@archie ~]$ cat /etc/ld.so.conf.d/cuda.conf
/opt/cuda/lib64
/opt/cuda/nvvm/lib64
/opt/cuda/extras/CUPTI/lib64
^ Again, there are lots of shared objects in these directories; I'm not sure why only the two mentioned above end up being processed by ldconfig. Is this basically all down to the underlying convention? It seems reasonable to me to ask for a specific minor version; however, much of the guidance (that I could find at short notice) seems to really push that only the MAJOR version is important - https://unix.stackexchange.com/questions/475/how-do-so-shared-object-numbers-work
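A likely explanation, not confirmed in this thread: ldconfig creates and caches links named after the SONAME recorded inside each library rather than after the file name, and libcublas from CUDA 10.1 appears to record libcublas.so.10. The sketch below only parses `ldconfig -p` text of the shape quoted above into a name-to-path mapping, so that a missing entry such as libcublas.so.10.0 is easy to spot. The helper name parse_ldconfig is hypothetical.

```python
import re

# One cache line looks like: "libfoo.so.1 (libc6,x86-64) => /path/libfoo.so.1"
LINE = re.compile(r"^\s*(\S+)\s+\(([^)]*)\)\s+=>\s+(\S+)\s*$")

# Sample taken from the `ldconfig -p | grep libcublas.so` output in this thread.
SAMPLE = """\
libcublas.so.10 (libc6,x86-64) => /opt/cuda/lib64/libcublas.so.10
libcublas.so (libc6,x86-64) => /opt/cuda/lib64/libcublas.so
"""


def parse_ldconfig(text):
    """Map each cached library name to the first path listed for it."""
    entries = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            name, _flags, path = match.groups()
            entries.setdefault(name, path)  # keep the first entry listed
    return entries


if __name__ == "__main__":
    cache = parse_ldconfig(SAMPLE)
    for wanted in ("libcublas.so.10", "libcublas.so.10.0", "libcublas.so.10.1"):
        print(wanted, "->", cache.get(wanted, "not in cache"))
```

With the sample above, only the name libcublas.so.10 resolves, which matches the ImportError for libcublas.so.10.0.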
Found libcrypto as a counter-example to the convention I inferred. It links to a major.minor version; the major version alone is actually not provided.
In [50]: l = ctypes.cdll.LoadLibrary("libcrypto.so.1.0.0")
In [51]: l = ctypes.cdll.LoadLibrary("libcrypto.so.1.1")
In [52]: l = ctypes.cdll.LoadLibrary("libcrypto.so.1")
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-52-3d67cbbd3826> in <module>
----> 1 l = ctypes.cdll.LoadLibrary("libcrypto.so.1")
But this is exactly what I'd expect from the listing in /lib/:
[stiege@archie tensorflow]$ find /lib/ -name libcrypto.so*
/lib/libcrypto.so
/lib/libcrypto.so.1.1
/lib/libcrypto.so.1.0.0
So my main question appears to be: even though /opt/cuda/lib64/libcublas.so.10.1 seems to be available and configured via the ldconfig system, why is it unavailable for import with Python?
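One possible reason, offered as an assumption rather than a confirmed diagnosis: a bare name passed to the dynamic loader is looked up through LD_LIBRARY_PATH, the ldconfig cache and the default directories, so a file in /opt/cuda/lib64 whose name has no cache entry is not found, whereas a name containing a slash is opened directly as a path. The sketch below probes a few names with ctypes.CDLL and reports which ones load; can_load is a hypothetical helper.

```python
import ctypes


def can_load(name):
    """Return True if the dynamic loader can open `name`, False otherwise."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    candidates = (
        "libcublas.so",
        "libcublas.so.10",
        "libcublas.so.10.1",                  # bare name, needs a cache entry
        "/opt/cuda/lib64/libcublas.so.10.1",  # absolute path, opened directly
    )
    for candidate in candidates:
        print(candidate, can_load(candidate))
```

If the absolute path loads while the bare libcublas.so.10.1 name does not, the file itself is fine and only the name lookup is failing.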
Weird
[stiege@archie tensorflow]$ sudo cp /opt/cuda/lib64/libcublas.so.10.1 /opt/cuda/lib64/libcublas.so.10.2
[stiege@archie tensorflow]$ ldconfig -v | grep libcublas
ldconfig: Can't unlink /opt/cuda/lib64/libcublas.so.10
libcublas.so.10 -> libcublas.so.10.2 (SKIPPED)
libcublasLt.so.10 -> libcublasLt.so.10.1.0.105
[stiege@archie tensorflow]$ sudo cp /opt/cuda/lib64/libcublas.so.10.1 /opt/cuda/lib64/libcublas.so.11
[stiege@archie tensorflow]$ sudo cp /opt/cuda/lib64/libcublas.so.10.1 /opt/cuda/lib64/libcublas.so.11.2
[stiege@archie tensorflow]$ ldconfig -v | grep libcublas
ldconfig: Can't unlink /opt/cuda/lib64/libcublas.so.10
libcublas.so.10 -> libcublas.so.11.2 (SKIPPED)
libcublasLt.so.10 -> libcublasLt.so.10.1.0.105
I was expecting a new key "libcublas.so.11" to be created, but instead ldconfig seems to be trying to link 10 to 11.2 - no idea how this works.
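A plausible reading of this output, stated as an assumption: copying the file does not change the SONAME stored inside it, so every copy still declares libcublas.so.10, and ldconfig tries to point that single link at whichever copy it ranks newest. The SONAME of a file can be inspected with `readelf -d <file>`. The sketch below extracts it from that command's text output; the helper name soname_from_readelf and the sample line are illustrative and not taken from the thread.

```python
import re


def soname_from_readelf(dynamic_section_text):
    """Return the SONAME found in `readelf -d` output, or None if absent."""
    match = re.search(
        r"\(SONAME\)\s+Library soname:\s+\[([^\]]+)\]", dynamic_section_text
    )
    return match.group(1) if match else None


# Illustrative line in the format printed by `readelf -d`.
SAMPLE = " 0x000000000000000e (SONAME)             Library soname: [libcublas.so.10]\n"

if __name__ == "__main__":
    print(soname_from_readelf(SAMPLE))
```

Running `readelf -d` on the copies named libcublas.so.11 and libcublas.so.11.2 and feeding the output to this helper would show whether they still declare the same SONAME.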
I had the same problem. After removing tensorflow 1.13 and installing 1.12, the problem was solved!
pip install tensorflow-gpu==1.12.0
My environment is nvidia-driver-390 with CUDA 9.0.
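The pairing reported earlier in this thread (official binaries built against CUDA 9.0 since TF 1.5 and CUDA 10.0 since TF 1.13) can be written down as a small lookup. This is a sketch only: expected_cuda is a hypothetical helper, it covers just the versions named in this thread, and it returns None for anything newer than 2.0 because later releases may differ.

```python
def expected_cuda(tf_version):
    """Return the CUDA version the official binary expects, per this thread."""
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    if (major, minor) > (2, 0):
        return None  # not covered by the statements in this thread
    if (major, minor) >= (1, 13):
        return "10.0"
    if (major, minor) >= (1, 5):
        return "9.0"
    return None  # older than the versions discussed here


if __name__ == "__main__":
    for version in ("1.12.0", "1.13.1", "2.0.0-alpha0"):
        print(version, "->", expected_cuda(version))
```

This is consistent with the reports above: 1.12 works with CUDA 9.0, while 1.13 and 2.0.0-alpha0 look for libcublas.so.10.0.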
System information
I was using tensorflow gpu last year. I wanted to set it up again. I got it running on my Windows 10 partition. Now I have tried to set it up again on my Mint partition. I always get the following error. ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory. I thought TF needs cuda 9.0 and not 10.0?
The error occurs if I execute the following code.
import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))