mind / wheels

Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)

GPU version 1.4.1 for Linux doesn't work #22

Open DavidNemeskey opened 6 years ago

DavidNemeskey commented 6 years ago

I installed the wheel for Python 3.6 from here. However, when trying to import tensorflow, I see the error below. Even if I install mkl-dnn by hand from the repo, I get the same error.

$ ipython
Python 3.6.3 |Anaconda, Inc.| (default, Oct 13 2017, 12:02:49) 
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import tensorflow as tf
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
~/venvs/test/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
     57 
---> 58   from tensorflow.python.pywrap_tensorflow_internal import *
     59   from tensorflow.python.pywrap_tensorflow_internal import __version__

~/venvs/test/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in <module>()
     27             return _mod
---> 28     _pywrap_tensorflow_internal = swig_import_helper()
     29     del swig_import_helper

~/venvs/test/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in swig_import_helper()
     23             try:
---> 24                 _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
     25             finally:

~/venvs/test/lib/python3.6/imp.py in load_module(name, file, filename, details)
    242         else:
--> 243             return load_dynamic(name, filename, file)
    244     elif type_ == PKG_DIRECTORY:

~/venvs/test/lib/python3.6/imp.py in load_dynamic(name, path, file)
    342             name=name, loader=loader, origin=path)
--> 343         return _load(spec)
    344 

ImportError: libmklml_intel.so: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

ImportError                               Traceback (most recent call last)
<ipython-input-1-64156d691fe5> in <module>()
----> 1 import tensorflow as tf

~/venvs/test/lib/python3.6/site-packages/tensorflow/__init__.py in <module>()
     22 
     23 # pylint: disable=wildcard-import
---> 24 from tensorflow.python import *
     25 # pylint: enable=wildcard-import
     26 

~/venvs/test/lib/python3.6/site-packages/tensorflow/python/__init__.py in <module>()
     47 import numpy as np
     48 
---> 49 from tensorflow.python import pywrap_tensorflow
     50 
     51 # Protocol buffers

~/venvs/test/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
     70 for some common reasons and solutions.  Include the entire stack trace
     71 above this error message when asking for help.""" % traceback.format_exc()
---> 72   raise ImportError(msg)
     73 
     74 # pylint: enable=wildcard-import,g-import-not-at-top,unused-import,line-too-long

ImportError: Traceback (most recent call last):
  File "~/venvs/test/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "~/venvs/test/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "~/venvs/test/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "~/venvs/test/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "~/venvs/test/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: libmklml_intel.so: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

Moulick commented 6 years ago

Specifically, check that LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib is present in your .bashrc if you use bash, or in your .zshrc if you use zsh.
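A quick way to confirm that the setting actually reached the Python process is sketched below. It only inspects the environment variable; it does not prove that the loader will find the library. The helper name dir_in_ld_library_path is made up for this illustration.

```python
import os


def dir_in_ld_library_path(directory, value=None):
    """Return True if `directory` is one of the entries in LD_LIBRARY_PATH.

    `value` defaults to the LD_LIBRARY_PATH of the current process, so the
    variable must have been exported in the shell that started Python.
    """
    if value is None:
        value = os.environ.get("LD_LIBRARY_PATH", "")
    wanted = os.path.normpath(directory)
    # Entries are colon-separated on Linux; empty entries are skipped here.
    entries = [entry for entry in value.split(":") if entry]
    return any(os.path.normpath(entry) == wanted for entry in entries)


if __name__ == "__main__":
    print(dir_in_ld_library_path("/usr/local/lib"))
```

If this prints False inside the same virtual env where the import fails, the export line was probably not picked up by the current shell session.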

DavidNemeskey commented 6 years ago

Should /usr/local/lib be checked for libraries by default? Anyway, I exported it and still no cookie. :(

Actually, in the virtual env:

$ ldd ./lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
./lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.23' not found (required by ./lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so)
./lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by ./lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so)
./lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /mnt/permanent/Priv/ndavid/venvs/test/./lib/python3.6/site-packages/tensorflow/python/../libtensorflow_framework.so)
    linux-vdso.so.1 (0x00007ffcab377000)
    /usr/local/lib/libsystat.so (0x00007f087edf1000)
    /usr/local/lib/libproch.so (0x00007f087ebef000)
    libtensorflow_framework.so => /mnt/permanent/Priv/ndavid/venvs/test/./lib/python3.6/site-packages/tensorflow/python/../libtensorflow_framework.so (0x00007f087da0d000)
    libcublas.so.8.0 => /usr/local/cuda/lib64/libcublas.so.8.0 (0x00007f087b05d000)
    libcusolver.so.8.0 => /usr/local/cuda/lib64/libcusolver.so.8.0 (0x00007f0877aed000)
    libmklml_intel.so => /usr/local/lib/libmklml_intel.so (0x00007f086ecaa000)
    libiomp5.so => /usr/local/lib/libiomp5.so (0x00007f086e906000)
    libcudart.so.8.0 => /usr/local/cuda/lib64/libcudart.so.8.0 (0x00007f086e6a0000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f086e39f000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f086e19b000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f086df7e000)
    libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f086dd68000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f086db60000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f086d855000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f086d63f000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f086d294000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f0889bd2000)
    libcuda.so.1 => /usr/lib/x86_64-linux-gnu/libcuda.so.1 (0x00007f086c8b0000)
    libcudnn.so.6 => /usr/local/cuda/lib64/libcudnn.so.6 (0x00007f086334e000)
    libcufft.so.8.0 => /usr/local/cuda/lib64/libcufft.so.8.0 (0x00007f085a500000)
    libcurand.so.8.0 => /usr/local/cuda/lib64/libcurand.so.8.0 (0x00007f0856597000)
    libnvidia-fatbinaryloader.so.367.48 => /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.367.48 (0x00007f0856349000)

So it seems the library was actually found and linked against. Why the error message, then?
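To make long ldd output like the one above easier to read, here is a minimal sketch that pulls out only the problems ldd reports: libraries it could not resolve ("=> not found") and symbol versions it could not find. The function name summarize_ldd is invented for this example, and the regular expressions assume the glibc ldd message format shown above.

```python
import re

# Matches e.g.: version `GLIBC_2.23' not found (required by ...)
VERSION_RE = re.compile(r"version `([^']+)' not found")
# Matches e.g.:     libfoo.so => not found
MISSING_RE = re.compile(r"^\s*(\S+) => not found")


def summarize_ldd(output):
    """Return (missing_versions, missing_libs) parsed from ldd output text."""
    missing_versions = set()
    missing_libs = []
    for line in output.splitlines():
        match = VERSION_RE.search(line)
        if match:
            missing_versions.add(match.group(1))
            continue
        match = MISSING_RE.match(line)
        if match:
            missing_libs.append(match.group(1))
    return sorted(missing_versions), missing_libs
```

Fed the output above, it would list GLIBC_2.23 and GLIBCXX_3.4.21 as missing versions and no missing libraries. That may mean the system glibc and libstdc++ are older than what the wheel was built against, although I can't tell from this alone whether that is related to the ImportError.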

danqing commented 6 years ago

What is your OS, and did you follow the instructions in the README when installing mkl?

# If you don't have cmake
sudo apt install cmake

git clone https://github.com/01org/mkl-dnn.git
cd mkl-dnn/scripts && ./prepare_mkl.sh && cd ..
mkdir -p build && cd build && cmake .. && make
sudo make install

echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib' >> ~/.bashrc

DavidNemeskey commented 6 years ago

@danqing I followed all instructions. As for LD_LIBRARY_PATH, see my previous reply.

jkterry1 commented 6 years ago

Hey I'm having this problem too, on plain Ubuntu 18.04.

dzhelonkin commented 6 years ago

I faced the same problem. Running sudo ldconfig after following all the instructions fixed the problem with the package https://github.com/mind/wheels/releases/download/tf1.4.1-gpu/tensorflow-1.4.1-cp35-cp35m-linux_x86_64.whl
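For anyone who wants to see whether the linker cache knows about the library before and after running ldconfig, a minimal sketch follows. It assumes ldconfig is on your PATH (on some systems it lives in /sbin) and that `ldconfig -p` prints lines of the form "name (flags) => path". The helper names cached_paths and read_ldconfig_cache are made up for this example.

```python
import subprocess


def cached_paths(ldconfig_output, soname):
    """Return the paths that `ldconfig -p` output lists for `soname`."""
    paths = []
    for line in ldconfig_output.splitlines():
        left, sep, right = line.partition(" => ")
        fields = left.split()
        # The header line has no " => " and is skipped by the `sep` check.
        if sep and fields and fields[0] == soname:
            paths.append(right.strip())
    return paths


def read_ldconfig_cache():
    """Run `ldconfig -p` and return its output as text."""
    result = subprocess.run(
        ["ldconfig", "-p"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(cached_paths(read_ldconfig_cache(), "libmklml_intel.so"))
```

An empty list before sudo ldconfig and a path under /usr/local/lib afterwards would suggest the cache was simply stale after make install, though I have not verified that this is what happened on my machine.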