facebookarchive / caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework.
https://caffe2.ai
Apache License 2.0
8.42k stars 1.95k forks

Installed by anaconda but does not have GPU support #2264

Open xqcn opened 6 years ago

xqcn commented 6 years ago

I followed the Anaconda install steps, but Caffe2 reports that it does not have GPU support. When I checked the list of Python packages, I found that the installed Caffe2 package is named caffe2-cuda8.0-cudnn7.

System information

    (caffe2) zyy@zyy-All-Series:~/workspace/cxq/anaconda2/bin$ python
    Python 2.7.13 |Continuum Analytics, Inc.| (default, Dec 20 2016, 23:09:15)
    [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    Anaconda is brought to you by Continuum Analytics.
    Please check out: http://continuum.io/thanks and https://anaconda.org
    >>> from caffe2.python import core
    WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
    WARNING:root:Debug message: libnccl.so.2: cannot open shared object file: No such file or directory
    CRITICAL:root:Cannot load caffe2.python. Error: /home/zyy/workspace/cxq/anaconda2/envs/caffe2/lib/python2.7/site-packages/caffe2/python/.so: undefined symbol: _ZNK6google8protobuf7Message11GetTypeNameB5cxx11Ev

    (caffe2) zyy@zyy-All-Series:~/workspace/cxq/anaconda2/bin$ conda list
    # packages in environment at /home/zyy/workspace/cxq/anaconda2/envs/caffe2:
    #
    backports 1.0 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    backports_abc 0.5 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    bzip2 1.0.6 3 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    caffe2-cuda8.0-cudnn7 0.8.dev py27hf88ba63_0 file:///home/zyy/workspace/cxq
    cairo 1.14.12 h77bcde2_0 defaults
    certifi 2016.2.28 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    decorator 4.1.2 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    enum34 1.1.6 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    ffmpeg 3.4 h7264315_0 defaults
    fontconfig 2.12.4 h88586e7_1 defaults
    freetype 2.8 hab7d2ae_1 defaults
    future 0.16.0 py27_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    get_terminal_size 1.0.0 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    gflags 2.2.1 hf484d3e_0 defaults
    glib 2.53.6 h5d9569c_2 defaults
    glog 0.3.5 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    graphite2 1.3.10 hf63cedd_1 defaults
    harfbuzz 1.7.4 hc5b324e_0 defaults
    hdf5 1.10.1 h9caa474_1 defaults
    icu 58.2 h9c2bf20_1 defaults
    ipykernel 4.6.1 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    ipython 5.3.0 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    ipython_genutils 0.2.0 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    jasper 1.900.1 hd497a04_4 defaults
    jpeg 9b 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    jupyter_client 5.1.0 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    jupyter_core 4.3.0 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    libffi 3.2.1 1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    libgcc-ng 7.2.0 hdf63c60_3 defaults
    libgfortran-ng 7.2.0 hdf63c60_3 defaults
    libiconv 1.14 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    libopus 1.2.1 hb9ed12e_0 defaults
    libpng 1.6.34 hb9fc6fc_0 defaults
    libprotobuf 3.4.0 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    libsodium 1.0.10 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    libstdcxx-ng 7.2.0 hdf63c60_3 defaults
    libtiff 4.0.9 h28f6b97_0 defaults
    libvpx 1.6.1 h888fd40_0 defaults
    libxcb 1.12 1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    libxml2 2.9.4 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    lmdb 0.9.21 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    mkl 2017.0.3 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    numpy 1.13.1 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    opencv 3.3.1 py27hdcf4849_0 defaults
    openssl 1.0.2l 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    path.py 10.3.1 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    pathlib2 2.3.0 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    pcre 8.39 1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    pexpect 4.2.1 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    pickleshare 0.7.4 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    pip 9.0.1 py27_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    pixman 0.34.0 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    prompt_toolkit 1.0.15 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    protobuf 3.4.0 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    ptyprocess 0.5.2 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    pygments 2.2.0 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    python 2.7.13 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    python-dateutil 2.6.1 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    pyzmq 16.0.2 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    readline 6.2 2 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    scandir 1.5 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    setuptools 36.4.0 py27_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    simplegeneric 0.8.1 py27_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    singledispatch 3.4.0.3 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    six 1.10.0 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    sqlite 3.13.0 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    ssl_match_hostname 3.5.0.1 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    tk 8.5.18 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    tornado 4.5.2 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    traitlets 4.3.2 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    wcwidth 0.1.7 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    wheel 0.29.0 py27_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    xz 5.2.3 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    zeromq 4.1.5 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    zlib 1.2.11 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free

pjh5 commented 6 years ago

You have two problems:

  1. CUDA support requires NCCL, and this package requires exactly NCCL 2.1. See Nvidia's installation guide at http://docs.nvidia.com/deeplearning/sdk/nccl-install-guide/index.html for instructions.
  2. These packages are built against gcc 5.0.4, which means they are not compatible with libraries built against gcc < 5. The default gcc on Ubuntu 14.04 is 4.8.5. You can check your gcc version with gcc --version. If it is 4.8.5, then you must do one of the following:
     a. Install a more recent gcc. This is a little dangerous.
     b. Wait for us to build gcc4-specific packages (no timeline on this at the moment).
     c. Build from source.
        i. If you still want to use Anaconda, then follow https://caffe2.ai/docs/getting-started.html?platform=mac&configuration=compile#anaconda-install-path . You will have to call conda build conda/cuda -c conda-forge instead of conda build conda/no_cuda
        ii. This https://caffe2.ai/docs/getting-started.html?platform=ubuntu&configuration=compile#anaconda-install-path should work too

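
Both problems described above can be checked from Python before changing anything on the machine. The following is a minimal sketch, not part of Caffe2: the helper names `can_load_library`, `has_cxx11_abi_tag`, and `gcc_major_version` are hypothetical. It tests whether the dynamic loader can open `libnccl.so.2`, and whether the missing symbol from the original report carries the `cxx11` ABI tag that GCC 5 and later add under the newer libstdc++ ABI. The tag check is a plain substring heuristic, and a positive result only suggests, rather than proves, that the extension expects a protobuf library built with the newer ABI.

```python
import ctypes
import re
import subprocess


def can_load_library(name):
    """Return True if the dynamic loader can open the shared library `name`."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


def has_cxx11_abi_tag(mangled_symbol):
    """Heuristic: True if a mangled C++ symbol carries the `cxx11` ABI tag."""
    # ABI tags are mangled as 'B' + <length> + <tag>, so [abi:cxx11] is 'B5cxx11'.
    return "B5cxx11" in mangled_symbol


def gcc_major_version():
    """Return the major version of the `gcc` found on PATH, or None."""
    try:
        out = subprocess.check_output(["gcc", "-dumpversion"])
    except (OSError, subprocess.CalledProcessError):
        return None  # gcc is missing or could not be run
    match = re.match(r"(\d+)", out.decode("utf-8", "replace").strip())
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Symbol copied from the "undefined symbol" error in the original report.
    symbol = "_ZNK6google8protobuf7Message11GetTypeNameB5cxx11Ev"
    print("libnccl.so.2 loadable: %s" % can_load_library("libnccl.so.2"))
    print("missing symbol uses cxx11 ABI: %s" % has_cxx11_abi_tag(symbol))
    print("gcc major version: %s" % gcc_major_version())
```

If `libnccl.so.2` is not loadable, the NCCL problem applies. If the symbol carries the tag while the system gcc is older than 5, the compiler mismatch is a likely cause of the import failure.
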
xqcn commented 6 years ago

@pjh5 Thank you for your help! If I build from source, do I still need to upgrade NCCL and gcc? The server I use is shared with others, and I fear that upgrading would affect their environments, such as Caffe and PyTorch.

pjh5 commented 6 years ago

You will not need to upgrade packages if you build from source.

kaisark commented 6 years ago

@DradonFlying I installed NCCL v2.1.15 for CUDA 8.0, and that seemed to fix the issue.


    (caffe2) ubuntu@ip-172-31-0-76:~/Downloads$ sudo dpkg -i nccl-repo-ubuntu1604-2.1.15-ga-cuda8.0_1-1_amd64.deb
    Selecting previously unselected package nccl-repo-ubuntu1604-2.1.15-ga-cuda8.0.
    (Reading database ... 320057 files and directories currently installed.)
    Preparing to unpack nccl-repo-ubuntu1604-2.1.15-ga-cuda8.0_1-1_amd64.deb ...
    Unpacking nccl-repo-ubuntu1604-2.1.15-ga-cuda8.0 (1-1) ...
    Setting up nccl-repo-ubuntu1604-2.1.15-ga-cuda8.0 (1-1) ...
    (caffe2) ubuntu@ip-172-31-0-76:~/Downloads$ sudo apt-get install libnccl2=2.1.15-1+cuda8.0 libnccl-dev=2.1.15-1+cuda8.0
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    E: Unable to locate package libnccl2
    E: Unable to locate package libnccl-dev
    (caffe2) ubuntu@ip-172-31-0-76:~/Downloads$ sudo dpkg -i nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
    (Reading database ... 320067 files and directories currently installed.)
    Preparing to unpack nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb ...
    Unpacking nvidia-machine-learning-repo-ubuntu1604 (1.0.0-1) over (1.0.0-1) ...
    Setting up nvidia-machine-learning-repo-ubuntu1604 (1.0.0-1) ...
    OK

    (caffe2) ubuntu@ip-172-31-0-76:~/Downloads$ sudo apt update
    Get:1 file:/var/nccl-repo-2.1.15-ga-cuda8.0 InRelease
    Ign:1 file:/var/nccl-repo-2.1.15-ga-cuda8.0 InRelease
    Get:2 file:/var/nccl-repo-2.1.15-ga-cuda8.0 Release [574 B]
    Hit:3 http://us-east-1.ec2.archive.ubuntu.com/ubuntu xenial InRelease
    Get:2 file:/var/nccl-repo-2.1.15-ga-cuda8.0 Release [574 B]
    Get:4 http://us-east-1.ec2.archive.ubuntu.com/ubuntu xenial-updates InRelease [109 kB]
    Get:5 file:/var/nccl-repo-2.1.15-ga-cuda8.0 Release.gpg [801 B]
    Get:5 file:/var/nccl-repo-2.1.15-ga-cuda8.0 Release.gpg [801 B]
    Get:6 http://us-east-1.ec2.archive.ubuntu.com/ubuntu xenial-backports InRelease [107 kB]
    Ign:7 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 InRelease
    Get:8 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 Release [564 B]
    Get:9 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 Release.gpg [801 B]
    Get:10 file:/var/nccl-repo-2.1.15-ga-cuda8.0 Packages [937 B]
    Get:11 http://security.ubuntu.com/ubuntu xenial-security InRelease [107 kB]
    Get:12 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 Packages [20.8 kB]
    Get:13 https://packages.microsoft.com/repos/vscode stable InRelease [2802 B]
    Get:14 https://packages.microsoft.com/repos/vscode stable/main amd64 Packages [49.5 kB]
    Fetched 397 kB in 0s (822 kB/s)
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    168 packages can be upgraded. Run 'apt list --upgradable' to see them.


    (caffe2) ubuntu@ip-172-31-0-76:~/Downloads$ sudo apt-get install libnccl2=2.1.15-1+cuda8.0 libnccl-dev=2.1.15-1+cuda8.0
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    The following packages were automatically installed and are no longer required:
      linux-aws-headers-4.4.0-1022 linux-aws-headers-4.4.0-1032 linux-aws-headers-4.4.0-1035
      linux-aws-headers-4.4.0-1038 linux-aws-headers-4.4.0-1039 linux-aws-headers-4.4.0-1044
      linux-headers-4.4.0-1022-aws linux-headers-4.4.0-1032-aws linux-headers-4.4.0-1035-aws
      linux-headers-4.4.0-1038-aws linux-headers-4.4.0-1039-aws linux-headers-4.4.0-1044-aws
      linux-headers-4.4.0-116 linux-headers-4.4.0-116-generic linux-image-4.4.0-1022-aws
      linux-image-4.4.0-1032-aws linux-image-4.4.0-1035-aws linux-image-4.4.0-1038-aws
      linux-image-4.4.0-1039-aws linux-image-4.4.0-1044-aws linux-image-4.4.0-116-generic
      linux-image-extra-4.4.0-116-generic
    Use 'sudo apt autoremove' to remove them.
    The following NEW packages will be installed:
      libnccl-dev libnccl2
    0 upgraded, 2 newly installed, 0 to remove and 166 not upgraded.
    Need to get 0 B/20.7 MB of archives.
    After this operation, 339 MB of additional disk space will be used.
    Get:1 file:/var/nccl-repo-2.1.15-ga-cuda8.0 libnccl2 2.1.15-1+cuda8.0 [10.4 MB]
    Get:2 file:/var/nccl-repo-2.1.15-ga-cuda8.0 libnccl-dev 2.1.15-1+cuda8.0 [10.3 MB]
    Selecting previously unselected package libnccl2.
    (Reading database ... 359613 files and directories currently installed.)
    Preparing to unpack .../libnccl2_2.1.15-1+cuda8.0_amd64.deb ...
    Unpacking libnccl2 (2.1.15-1+cuda8.0) ...
    Selecting previously unselected package libnccl-dev.
    Preparing to unpack .../libnccl-dev_2.1.15-1+cuda8.0_amd64.deb ...
    Unpacking libnccl-dev (2.1.15-1+cuda8.0) ...
    Processing triggers for libc-bin (2.23-0ubuntu10) ...
    Setting up libnccl2 (2.1.15-1+cuda8.0) ...
    Setting up libnccl-dev (2.1.15-1+cuda8.0) ...
    Processing triggers for libc-bin (2.23-0ubuntu10) ...

    (caffe2) ubuntu@ip-172-31-0-76:~/Downloads$ python -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
    1
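
The one-line check at the end of the session above can be wrapped in a small script for reuse, for example after each install attempt. This is a minimal sketch: the helper name `cuda_device_count` is hypothetical, and it assumes the script is started with the interpreter of the environment being checked. The only Caffe2 call it relies on is `workspace.NumCudaDevices()`, taken from the command above.

```python
import subprocess
import sys

# One-liner taken from the thread; it prints the number of visible CUDA devices.
CHECK_SNIPPET = "from caffe2.python import workspace; print(workspace.NumCudaDevices())"


def cuda_device_count(python_exe=sys.executable):
    """Return the CUDA device count reported by Caffe2, or None if the check fails."""
    try:
        out = subprocess.check_output(
            [python_exe, "-c", CHECK_SNIPPET], stderr=subprocess.STDOUT
        )
    except (subprocess.CalledProcessError, OSError):
        # The child exited with an error or the interpreter could not be started.
        return None
    lines = out.decode("utf-8", "replace").strip().splitlines()
    # Warnings may precede the number, so only the last line is parsed.
    try:
        return int(lines[-1])
    except (IndexError, ValueError):
        return None


if __name__ == "__main__":
    count = cuda_device_count()
    if count is None:
        print("Caffe2 could not be imported or did not report a device count")
    elif count == 0:
        print("Caffe2 imported, but no CUDA devices are visible")
    else:
        print("Caffe2 sees %d CUDA device(s)" % count)
```

Running the check in a child process is a design choice: it keeps an import failure, such as the undefined-symbol error earlier in the thread, from affecting the calling script.
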