BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

caffe installation error (cuda and nvidia) ubuntu 14.04 #4791

Closed OzkanCigdem closed 7 years ago

OzkanCigdem commented 8 years ago

Hi all, I am trying to install Caffe with GPU support using the NVIDIA driver. I have an NVIDIA GTX 860M graphics card (MSI GE70 2PE Apache Pro laptop). When I run 'make runtest', I get the error below (I compiled with Make, and I also get an error with the CMake build). Software & Updates shows: "NVIDIA Corporation: Unknown. This device is using the recommended driver. It uses NVIDIA binary driver - version 352.63 from nvidia-352 (proprietary, supported)". I installed CUDA as follows:

 1. wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb

 2. sudo dpkg -i cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb

 3. sudo apt-get update

 4. sudo apt-get install cuda

For the OpenCV installation, I used the command: conda install opencv.

I would appreciate your help in resolving the following error. Thank you in advance.

The error from the make runtest command for the Caffe installation:

   *[100%] Built target test.testbin
    Scanning dependencies of target runtest

    modprobe: FATAL: Module nvidia not found.
    Cuda number of devices: 0 
    Current device id: 0

    Current device name: 
    Note: Randomizing tests' orders with a seed of 445 .
    [==========] Running 2021 tests from 267 test cases.
    [----------] Global test environment set-up.
    [----------] 5 tests from EmbedLayerTest/1, where TypeParam = caffe::CPUDevice<double>
    [ RUN      ] EmbedLayerTest/1.TestSetUp
    modprobe: FATAL: Module nvidia not found.
    E0930 17:27:42.258497 14590 common.cpp:113] Cannot create Cublas handle. Cublas won't be available.
    modprobe: FATAL: Module nvidia not found.
    E0930 17:27:42.263166 14590 common.cpp:120] Cannot create Curand generator. Curand won't be available.
    [       OK ] EmbedLayerTest/1.TestSetUp (9 ms)
    [ RUN      ] EmbedLayerTest/1.TestForward
    [       OK ] EmbedLayerTest/1.TestForward (0 ms)
    [ RUN      ] EmbedLayerTest/1.TestForwardWithBias
    [       OK ] EmbedLayerTest/1.TestForwardWithBias (111 ms)
    [ RUN      ] EmbedLayerTest/1.TestGradient
    E0930 17:27:42.374208 14590 common.cpp:140] Curand not available. Skipping setting the curand seed.
    [       OK ] EmbedLayerTest/1.TestGradient (7 ms)
    [ RUN      ] EmbedLayerTest/1.TestGradientWithBias
    [       OK ] EmbedLayerTest/1.TestGradientWithBias (11 ms)
    [----------] 5 tests from EmbedLayerTest/1 (138 ms total)
    [----------] 8 tests from SliceLayerTest/3, where TypeParam = caffe::GPUDevice<double>
    [ RUN      ] SliceLayerTest/3.TestSetupNum
    F0930 17:27:42.392833 14590 syncedmem.hpp:18] Check failed: error == cudaSuccess (38 vs. 0)  no CUDA-capable device is detected
    *** Check failure stack trace: ***
        @     0x2ac73c075daa  (unknown)
        @     0x2ac73c075ce4  (unknown)
        @     0x2ac73c0756e6  (unknown)
        @     0x2ac73c078687  (unknown)
        @     0x2ac73b01b145  caffe::SyncedMemory::mutable_cpu_data()
        @     0x2ac73b091c48  caffe::Blob<>::Reshape()
        @     0x2ac73b0920a9  caffe::Blob<>::Reshape()
        @     0x2ac73b09213c  caffe::Blob<>::Blob()
        @           0x8f2345  caffe::SliceLayerTest<>::SliceLayerTest()
        @           0x8f275b  testing::internal::TestFactoryImpl<>::CreateTest()
        @           0xd3ede3  testing::internal::HandleExceptionsInMethodIfSupported<>()
        @           0xd35785  testing::TestInfo::Run()
        @           0xd358a5  testing::TestCase::Run()
        @           0xd385f8  testing::internal::UnitTestImpl::RunAllTests()
        @           0xd38897  testing::UnitTest::Run()
        @           0x86e2ff  main
        @     0x2ac741a75f45  (unknown)
        @           0x874ba2  (unknown)
        @              (nil)  (unknown)

    make[3]:***[src/caffe/test/CMakeFiles/runtest] Aborted (core dumped)

    make[2]:*** [src/caffe/test/CMakeFiles/runtest.dir/all] Error 2

    make[1]:*** [src/caffe/test
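
The repeated "modprobe: FATAL: Module nvidia not found" line suggests the NVIDIA kernel module is not available to the running kernel, which would explain why CUDA reports 0 devices. A minimal sketch for checking this is below. The function name nvidia_module_status is hypothetical, and matching any module whose name starts with "nvidia" is an assumption, since the exact module name can vary with how the driver was packaged.

```shell
# Report whether an NVIDIA kernel module appears in a module listing.
# Reads /proc/modules-style text on stdin (module name in the first column).
# Assumption: any module whose name starts with "nvidia" counts as loaded.
nvidia_module_status() {
    if awk '$1 ~ /^nvidia/ { found = 1 } END { exit !found }'; then
        echo "loaded"
    else
        echo "missing"
    fi
}

# Usage on a real machine (not executed here):
#   nvidia_module_status < /proc/modules
#   nvidia-smi    # should list the GPU once the driver is working
```

If the result is "missing", the Caffe build itself is probably not at fault and the driver installation is the place to look.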
tharindu-b-hewage commented 8 years ago

Try uninstalling all CUDA packages and the NVIDIA driver. Then download the CUDA deb package, which includes the NVIDIA driver itself, and install that. Hope this helps.

OzkanCigdem commented 8 years ago

Hi @LorddBaelish, thanks for your reply. I did that, but it didn't work. I did the installation as follows. Am I doing something wrong?

1. sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
2. sudo apt-get install --no-install-recommends libboost-all-dev
3. wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb
4. sudo dpkg -i cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb
5. sudo apt-get update
6. sudo apt-get install cuda
7. sudo apt-get install libatlas-base-dev
8. wget https://repo.continuum.io/archive/Anaconda2-4.1.1-Linux-x86_64.sh
9. bash Anaconda2-4.1.1-Linux-x86_64.sh
10. export PATH=<path of anaconda>:$PATH              (probably, usr/anaconda2)
11. source ~/.bashrc
12. conda install opencv
13. sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev
14. wget https://github.com/BVLC/caffe/archive/master.zip
15. unzip master.zip
16. Go to caffe-master/python
17. Run 'for req in $(cat requirements.txt); do pip install $req; done'
18. Go back to the caffe-master directory (cd ..)
19. cp Makefile.config.example Makefile.config
 # Adjust Makefile.config (for example, if using Anaconda Python, or if cuDNN is desired)
20. make all
21. make test
22. make runtest
tharindu-b-hewage commented 8 years ago

In the build directory, open a terminal and type "cmake ..". This will give you a list of installed dependencies and some more information. Check whether it detects CUDA and your NVIDIA GPU. This looks like a graphics driver issue, but I am not sure.

OzkanCigdem commented 8 years ago

Hello @LorddBaelish, thanks again for your help. I ran cmake before, but it still gave an error in the test step. I also believe the problem is with the graphics card. Thank you in advance.

~/caffe-master/build$ cmake ..
-- Boost version: 1.54.0
-- Found the following Boost libraries:
--   system
--   thread
--   filesystem
-- Found gflags  (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libgflags.so)
-- Found glog    (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libglog.so)
-- Found PROTOBUF Compiler: /usr/bin/protoc
-- Found lmdb    (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/liblmdb.so)
-- Found LevelDB (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libleveldb.so)
-- Found Snappy  (include: /usr/include, library: /usr/lib/libsnappy.so)
-- CUDA detected: 7.5
-- Automatic GPU detection failed. Building for all known architectures.
-- Added CUDA NVCC flags for: sm_20 sm_21 sm_30 sm_35 sm_50
-- OpenCV found (/usr/share/OpenCV)
-- Found Atlas (include: /usr/include, library: /usr/lib/libatlas.so)
-- NumPy ver. 1.10.4 found (include: /home/ozkan/anaconda2/lib/python2.7/site-packages/numpy/core/include)
-- Boost version: 1.54.0
-- Found the following Boost libraries:
--   python
-- Could NOT find Doxygen (missing:  DOXYGEN_EXECUTABLE) 
-- Could NOT find Git (missing:  GIT_EXECUTABLE) 
-- 
-- ******************* Caffe Configuration Summary *******************
-- General:
--   Version           :   1.0.0-rc3
--   Git               :   unknown
--   System            :   Linux
--   C++ compiler      :   /usr/bin/c++
--   Release CXX flags :   -O3 -DNDEBUG -fPIC -Wall -Wno-sign-compare -Wno-uninitialized
--   Debug CXX flags   :   -g -fPIC -Wall -Wno-sign-compare -Wno-uninitialized
--   Build type        :   Release
-- 
--   BUILD_SHARED_LIBS :   ON
--   BUILD_python      :   ON
--   BUILD_matlab      :   OFF
--   BUILD_docs        :   ON
--   CPU_ONLY          :   OFF
--   USE_OPENCV        :   ON
--   USE_LEVELDB       :   ON
--   USE_LMDB          :   ON
--   ALLOW_LMDB_NOLOCK :   OFF
-- 
-- Dependencies:
--   BLAS              :   Yes (Atlas)
--   Boost             :   Yes (ver. 1.54)
--   glog              :   Yes
--   gflags            :   Yes
--   protobuf          :   Yes (ver. 2.5.0)
--   lmdb              :   Yes (ver. 0.9.10)
--   LevelDB           :   Yes (ver. 1.15)
--   Snappy            :   Yes (ver. 1.1.0)
--   OpenCV            :   Yes (ver. 2.4.8)
--   CUDA              :   Yes (ver. 7.5)
-- 
-- NVIDIA CUDA:
--   Target GPU(s)     :   Auto
--   GPU arch(s)       :   sm_20 sm_21 sm_30 sm_35 sm_50
--   cuDNN             :   Not found
-- 
-- Python:
--   Interpreter       :   /home/ozkan/anaconda2/bin/python2.7 (ver. 2.7.12)
--   Libraries         :   /usr/lib/x86_64-linux-gnu/libpython2.7.so (ver 2.7.6)
--   NumPy             :   /home/ozkan/anaconda2/lib/python2.7/site-packages/numpy/core/include (ver 1.10.4)
-- 
-- Documentaion:
--   Doxygen           :   No
--   config_file       :   
-- 
-- Install:
--   Install path      :   /home/ozkan/caffe-master/install
-- 
-- Configuring done
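
The line to look for in output like this is "Automatic GPU detection failed": CUDA 7.5 was found, but no device could be queried at configure time. A small sketch for flagging that line in a saved or piped configure log is below. The function name cmake_gpu_verdict and its two output labels are hypothetical, and the absence of the message does not by itself prove that a GPU was detected.

```shell
# Print a one-line verdict for cmake configure output read from stdin.
cmake_gpu_verdict() {
    if grep -q 'Automatic GPU detection failed'; then
        echo "gpu-detection-failed"
    else
        echo "no-failure-reported"
    fi
}

# Usage from the build directory (not executed here):
#   cmake .. 2>&1 | cmake_gpu_verdict
```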
tharindu-b-hewage commented 8 years ago

So automatic GPU detection fails. Find out which driver version is best for your graphics card on your OS and works with Caffe, and try that driver. I guess you have no problem with the other dependencies. To confirm that, you could rebuild in CPU-only mode (set in Makefile.config) and run runtest again. Are you sure your OS detects your graphics card?
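
The CPU-only rebuild suggested above can be sketched as follows, assuming Makefile.config was copied from Makefile.config.example and still contains the commented-out "# CPU_ONLY := 1" line. The helper name enable_cpu_only is hypothetical.

```shell
# Uncomment the CPU_ONLY switch in a Caffe Makefile.config (path given as $1).
# A backup of the original file is kept next to it with a .bak suffix.
enable_cpu_only() {
    sed -i.bak 's/^#[[:space:]]*CPU_ONLY[[:space:]]*:=[[:space:]]*1/CPU_ONLY := 1/' "$1"
}

# Usage from the Caffe source directory (not executed here):
#   enable_cpu_only Makefile.config
#   make clean && make all && make test && make runtest
```

If the CPU-only tests pass, the remaining dependencies are likely fine and the failure is isolated to the driver or GPU setup.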

OzkanCigdem commented 8 years ago

Hi @LorddBaelish, I ran lspci -vnn | grep '\[030[02]\]' and got the following output:

00:02.0 VGA compatible controller [0300]: Intel Corporation 4th Gen Core Processor Integrated graphics Controller [8086:0416] (rev 06) (prog-if 00 [VGA controller])
01:00.0 3D controller [0302]: NVIDIA Corporation GM107M [GeForce GTX 860M] [10de:1392] (rev a2)
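
This output shows two graphics devices: an Intel VGA controller (PCI class 0300) and the NVIDIA GPU listed as a 3D controller (class 0302), which is the usual layout of a hybrid-graphics laptop. A rough sketch for classifying such output is below. The function name gpu_layout and its labels are hypothetical.

```shell
# Summarize which GPU vendors appear in `lspci -nn` output read from stdin.
# PCI class 0300 is a VGA controller, 0302 is a 3D controller.
gpu_layout() {
    awk '
        /\[030[02]\]/ && /NVIDIA/ { nvidia = 1 }
        /\[030[02]\]/ && /Intel/  { intel = 1 }
        END {
            if (nvidia && intel) print "hybrid"
            else if (nvidia)     print "nvidia-only"
            else if (intel)      print "intel-only"
            else                 print "none"
        }'
}

# Usage (not executed here):
#   lspci -nn | gpu_layout
```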
tharindu-b-hewage commented 8 years ago

Well, maybe the system is using your Intel graphics? I don't know how to tell for sure from the above message. If there is a way to disable the Intel graphics, try it and use the NVIDIA GPU only. Run runtest in that mode (you may want to build again).
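
On Ubuntu, one way to see which GPU is active on a hybrid laptop is the prime-select tool from the nvidia-prime package. Whether that package is installed on this machine is an assumption; the thread does not say. The wrapper below (active_gpu_profile, a hypothetical name) prints "unknown" when the tool is absent instead of failing.

```shell
# Print the active GPU profile reported by a PRIME switching tool.
# $1 is the tool name; the default assumes Ubuntu's nvidia-prime package.
active_gpu_profile() {
    tool="${1:-prime-select}"
    if command -v "$tool" >/dev/null 2>&1; then
        "$tool" query
    else
        echo "unknown"
    fi
}

# Usage (not executed here):
#   active_gpu_profile
#   sudo prime-select nvidia    # switch to the NVIDIA GPU, then log out and back in
```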

shelhamer commented 7 years ago

From https://github.com/BVLC/caffe/blob/master/CONTRIBUTING.md:

Please do not post usage, installation, or modeling questions, or other requests for help to Issues. Use the caffe-users list instead. This helps developers maintain a clear, uncluttered, and efficient view of the state of Caffe.