Ceruleanacg / Personae

📈 Personae is a repo of implementations and environments of Deep Reinforcement Learning & Supervised Learning for Quantitative Trading.

ImportError: /usr/lib/x86_64-linux-gnu/libcuda.so.1: file too short #14

JS00000 closed this issue 6 years ago

JS00000 commented 6 years ago

I am running this project with Docker, but I have run into a problem.

The steps I followed are below:

$ docker image build -t ppdemo .
$ docker run --name my_mongo -p 27017:27017 -d mongo
$ docker run -t --link my_mongo:mongo -v $PWD:/app/Personae ppdemo spider/stock_spider.py

Everything works up to this point, but an error occurs when Python imports TensorFlow.

$ docker run -t --link my_mongo:mongo -v $PWD:/app/Personae ppdemo algorithm/SL/DualAttnRNN.py
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: /usr/lib/x86_64-linux-gnu/libcuda.so.1: file too short

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "algorithm/SL/DualAttnRNN.py", line 3, in <module>
    import tensorflow as tf
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import *
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 72, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: /usr/lib/x86_64-linux-gnu/libcuda.so.1: file too short

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

I am new to Docker. Could you tell me what is wrong with these steps?

Ceruleanacg commented 6 years ago

It seems that your CUDA or cuDNN version is not correct. Can you tell me which versions you are using?

JS00000 commented 6 years ago

I did not set up CUDA or cuDNN inside Docker. I just built the Docker image using this project's Dockerfile.

Outside Docker, I have CUDA 8.0 and CUDA 9.1 installed on macOS, and CUDA 9.1 is the linked version.

$ ls /Developer/NVIDIA
CUDA-8.0 CUDA-9.1
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Tue_Dec_19_21:36:29_CST_2017
Cuda compilation tools, release 9.1, V9.1.128

My cuDNN version appears to be 5, but it does not work with CUDA 9.1.

$ ls /usr/local/cuda/lib | grep cudnn
libcudnn.5.dylib
libcudnn.dylib
libcudnn_static.a
$ cat /usr/local/cuda/include/cudnn.h
cat: /usr/local/cuda/include/cudnn.h: No such file or directory

Should I install the right versions of CUDA and cuDNN outside Docker, or install CUDA 8.0 and cuDNN 6 inside Docker manually? Thank you very much for clearing up my confusion.

Ceruleanacg commented 6 years ago

It is recommended to run Docker on Ubuntu with an NVIDIA GPU.

wangzhangup commented 6 years ago

@JS00000 You should use nvidia-docker2.
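
To confirm that the driver library is the problem, it can help to look at the file itself. The loader reports "file too short" when it cannot read a complete ELF header from a shared object, which usually means the file is empty or truncated. Inside a container started without the NVIDIA runtime, that may be because the real driver library was never mounted from the host; this is an assumption, not something confirmed in this thread. Below is a minimal Python sketch to run inside the container. The path comes from the traceback above, and the helper name `inspect_shared_lib` is hypothetical.

```python
import os

ELF_MAGIC = b"\x7fELF"
# Path taken from the traceback in this thread.
LIBCUDA_PATH = "/usr/lib/x86_64-linux-gnu/libcuda.so.1"


def inspect_shared_lib(path):
    """Return a short diagnosis string for a shared library file."""
    # os.path.exists follows symlinks, so a dangling symlink also counts as missing.
    if not os.path.exists(path):
        return "missing"
    size = os.path.getsize(path)
    if size == 0:
        return "empty"
    # Every ELF shared object starts with the 4-byte magic number 0x7f 'E' 'L' 'F'.
    with open(path, "rb") as f:
        header = f.read(4)
    if header != ELF_MAGIC:
        return "not an ELF file"
    return "looks like an ELF file ({} bytes)".format(size)


if __name__ == "__main__":
    print(LIBCUDA_PATH, "->", inspect_shared_lib(LIBCUDA_PATH))
```

If this reports "empty" or "not an ELF file", the import error is coming from the library file in the container rather than from the TensorFlow code. A result of "looks like an ELF file" only rules out the simplest case, since the check reads just the first four bytes.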

Yangget commented 5 years ago

You should use nvidia-docker to run the container, as I do:

$ nvidia-docker run -dit -p 8888:8888 -p 6006:6006 tensorflow/newbuild:2.0 /bin/bash
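
Applied to the commands from the original question, the same idea can be sketched in Python as follows. This assumes nvidia-docker2 is installed on a Linux host, in which case the `nvidia` runtime can be selected with `docker run --runtime=nvidia`. The image name `ppdemo`, the `my_mongo` container, and the `/app/Personae` mount point come from the question; the helper name `gpu_run_command` is hypothetical.

```python
import os


def gpu_run_command(script, image="ppdemo", mongo_container="my_mongo", project_dir=None):
    """Build the argument list for running a project script with the NVIDIA runtime."""
    # Default to the current directory, mirroring $PWD in the original command.
    project_dir = project_dir or os.getcwd()
    return [
        "docker", "run",
        "--runtime=nvidia",  # assumption: nvidia-docker2 is installed on the host
        "-t",
        "--link", "{}:mongo".format(mongo_container),
        "-v", "{}:/app/Personae".format(project_dir),
        image,
        script,
    ]


if __name__ == "__main__":
    # Print the command for inspection instead of executing it.
    print(" ".join(gpu_run_command("algorithm/SL/DualAttnRNN.py")))
```

The function only builds the command so it can be reviewed first. Passing the returned list to `subprocess.run` would execute it.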