tmcdonell / cuda

Haskell FFI bindings to CUDA

Ubuntu 16.04 + nvidia-cuda-toolkit "Found CUDA toolkit at: /usr" but 'Could not find path: ["/usr/lib64"]' #58

Closed lestephane closed 4 years ago

lestephane commented 5 years ago

I'd like to avoid installing CUDA using NVIDIA's installer if I can help it, as I have a carefully crafted, working Bumblebee setup which I don't want to mess up.

Alas, no luck with the stock nvidia-cuda-toolkit package in Ubuntu 16.04. Can someone who got it working suggest a workaround? While this issue looks similar to #54, I do not have any /usr/lib/cuda directory, so I'm filing it as a separate issue in case it helps someone else.

$ dpkg -l "nvidia-cuda-toolkit" "llvm*" | grep "^ii"
ii  llvm-3.8            1:3.8-2ubuntu1                             amd64        Modular compiler and toolchain technologies
ii  llvm-3.8-dev        1:3.8-2ubuntu1                             amd64        Modular compiler and toolchain technologies, libraries and headers
ii  llvm-3.8-runtime    1:3.8-2ubuntu1                             amd64        Modular compiler and toolchain technologies, IR interpreter
ii  nvidia-cuda-toolkit 7.5.18-0ubuntu1                            amd64        NVIDIA CUDA development toolkit
$ cd accelerate-examples/
$ readlink stack.yaml
stack-8.6.yaml
$ git pull
Already up to date.
$ stack clean 
$ stack build
cuda               > configure
cuda               > [1 of 2] Compiling Main             ( /tmp/stack7469/cuda-0.10.1.0/Setup.hs, /tmp/stack7469/cuda-0.10.1.0/.stack-work/dist/x86_64-linux/Cabal-2.4.0.1/setup/Main.o )
cuda               > [2 of 2] Compiling StackSetupShim   ( /home/lestephane/.stack/setup-exe-src/setup-shim-mPHDZzAJ.hs, /tmp/stack7469/cuda-0.10.1.0/.stack-work/dist/x86_64-linux/Cabal-2.4.0.1/setup/StackSetupShim.o )
cuda               > Linking /tmp/stack7469/cuda-0.10.1.0/.stack-work/dist/x86_64-linux/Cabal-2.4.0.1/setup/setup ...
cuda               > Configuring cuda-0.10.1.0...
cuda               > Found CUDA toolkit at: /usr
cuda               > setup: Could not find path: ["/usr/lib64"]
cuda               > 

--  While building package cuda-0.10.1.0 using:
      /tmp/stack7469/cuda-0.10.1.0/.stack-work/dist/x86_64-linux/Cabal-2.4.0.1/setup/setup --builddir=.stack-work/dist/x86_64-linux/Cabal-2.4.0.1 configure --user --package-db=clear --package-db=global --package-db=/home/lestephane/.stack/snapshots/x86_64-linux/aa272c68412dcda5f988dca48757e9c971bc29607f65b0a3e16436e1989f584d/8.6.5/pkgdb --libdir=/home/lestephane/.stack/snapshots/x86_64-linux/aa272c68412dcda5f988dca48757e9c971bc29607f65b0a3e16436e1989f584d/8.6.5/lib --bindir=/home/lestephane/.stack/snapshots/x86_64-linux/aa272c68412dcda5f988dca48757e9c971bc29607f65b0a3e16436e1989f584d/8.6.5/bin --datadir=/home/lestephane/.stack/snapshots/x86_64-linux/aa272c68412dcda5f988dca48757e9c971bc29607f65b0a3e16436e1989f584d/8.6.5/share --libexecdir=/home/lestephane/.stack/snapshots/x86_64-linux/aa272c68412dcda5f988dca48757e9c971bc29607f65b0a3e16436e1989f584d/8.6.5/libexec --sysconfdir=/home/lestephane/.stack/snapshots/x86_64-linux/aa272c68412dcda5f988dca48757e9c971bc29607f65b0a3e16436e1989f584d/8.6.5/etc --docdir=/home/lestephane/.stack/snapshots/x86_64-linux/aa272c68412dcda5f988dca48757e9c971bc29607f65b0a3e16436e1989f584d/8.6.5/doc/cuda-0.10.1.0 --htmldir=/home/lestephane/.stack/snapshots/x86_64-linux/aa272c68412dcda5f988dca48757e9c971bc29607f65b0a3e16436e1989f584d/8.6.5/doc/cuda-0.10.1.0 --haddockdir=/home/lestephane/.stack/snapshots/x86_64-linux/aa272c68412dcda5f988dca48757e9c971bc29607f65b0a3e16436e1989f584d/8.6.5/doc/cuda-0.10.1.0 --dependency=Cabal=Cabal-2.4.1.0-4t2ut7bCQNuEj8DDES6BZk --dependency=base=base-4.12.0.0 --dependency=bytestring=bytestring-0.10.8.2 --dependency=directory=directory-1.3.3.0 --dependency=filepath=filepath-1.4.2.1 --dependency=pretty=pretty-1.1.3.6 --dependency=template-haskell=template-haskell-2.14.0.0 --dependency=uuid-types=uuid-types-1.0.3-Autqzm2g4auIYSV6nkCRLV --exact-configuration --ghc-option=-fhide-source-paths
    Process exited with code: ExitFailure 1
Progress 1/6
$ dpkg -L nvidia-cuda-toolkit 
/.
/etc
/etc/nvcc.profile
/usr
/usr/bin
/usr/bin/nvdisasm
/usr/bin/nvcc
/usr/bin/nvlink
/usr/bin/bin2c
/usr/bin/filehash
/usr/bin/fatbinary
/usr/bin/cudafe
/usr/bin/cuobjdump
/usr/bin/cudafe++
/usr/bin/cuda-memcheck
/usr/bin/nvprune
/usr/bin/ptxas
/usr/lib
/usr/lib/nvidia-cuda-toolkit
/usr/lib/nvidia-cuda-toolkit/bin
/usr/lib/nvidia-cuda-toolkit/bin/g++
/usr/lib/nvidia-cuda-toolkit/bin/crt
/usr/lib/nvidia-cuda-toolkit/bin/crt/prelink.stub
/usr/lib/nvidia-cuda-toolkit/bin/crt/link.stub
/usr/lib/nvidia-cuda-toolkit/bin/nvcc
/usr/lib/nvidia-cuda-toolkit/bin/gcc
/usr/lib/nvidia-cuda-toolkit/bin/cicc
/usr/lib/nvidia-cuda-toolkit/libdevice
/usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.compute_30.10.bc
/usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.compute_35.10.bc
/usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.compute_20.10.bc
/usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.compute_50.10.bc
/usr/include
/usr/include/nvvm.h
/usr/share
/usr/share/lintian
/usr/share/lintian/overrides
/usr/share/lintian/overrides/nvidia-cuda-toolkit
/usr/share/doc
/usr/share/doc/nvidia-cuda-toolkit
/usr/share/doc/nvidia-cuda-toolkit/copyright
/usr/share/doc/nvidia-cuda-toolkit/README.Debian
/usr/share/man
/usr/share/man/man1
/usr/share/man/man1/cuda-binaries.1.gz
/usr/lib/nvidia-cuda-toolkit/bin/nvcc.profile
/usr/share/doc/nvidia-cuda-toolkit/changelog.Debian.gz
/usr/share/man/man1/cuobjdump.1.gz
/usr/share/man/man1/nvdisasm.1.gz
/usr/share/man/man1/cuda-memcheck.1.gz
/usr/share/man/man1/nvcc.1.gz
/usr/share/man/man1/nvprune.1.gz
tmcdonell commented 5 years ago

Do you know where it is installing the cuda libraries, libcuda.so for example? That would be the first step in figuring out the right paths to add. You might be able to use pkg-config --libs cuda-whatever.

Is this installing version 7.5 of the toolkit? That should work, but note that this version is extremely old.
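
As a rough illustration of the first step (finding where libcuda.so lives), the dynamic linker cache can be queried instead of relying on a locate database. This is only a sketch: the function name libcuda_dirs_from_cache is made up for this example, and it assumes library paths contain no spaces.

```shell
# Hypothetical helper (not part of the cuda package): read `ldconfig -p`
# style lines on stdin and print the unique directories that hold a
# libcuda.so* entry. Assumes the path is the last whitespace-separated field.
libcuda_dirs_from_cache() {
    awk '/libcuda\.so/ { print $NF }' | while IFS= read -r path; do
        dirname "$path"
    done | sort -u
}

# Query the live linker cache, if ldconfig is on PATH
# (on some systems it lives in /sbin and is not visible to normal users).
if command -v ldconfig >/dev/null 2>&1; then
    ldconfig -p | libcuda_dirs_from_cache
fi
```

Any directory this prints is a candidate for the library path the configure step needs to be told about.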

lestephane commented 5 years ago

Here is the location of libcuda.so

$ locate libcuda.so | xargs dpkg -S
libcuda1-384: /usr/lib/i386-linux-gnu/libcuda.so
libcuda1-384: /usr/lib/i386-linux-gnu/libcuda.so.1
libcuda1-384: /usr/lib/i386-linux-gnu/libcuda.so.384.130
libcuda1-384: /usr/lib/x86_64-linux-gnu/libcuda.so
libcuda1-384: /usr/lib/x86_64-linux-gnu/libcuda.so.1
libcuda1-384: /usr/lib/x86_64-linux-gnu/libcuda.so.384.130

And here is where that package comes from:

$ dpkg -l libcuda1-384 
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                  Version                 Architecture            Description
+++-=====================================-=======================-=======================-===============================================================================
ii  libcuda1-384                          384.130-0ubuntu0.16.04. amd64                   NVIDIA CUDA runtime library
$ apt-cache madison libcuda1-384
libcuda1-384 | 384.130-0ubuntu0.16.04.1 | http://security.ubuntu.com/ubuntu xenial-security/restricted amd64 Packages
libcuda1-384 | 384.130-0ubuntu0.16.04.1 | http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu xenial/main amd64 Packages

I'm just scared to use an installer outside of the apt machinery. The NVIDIA installer will modify all sorts of things, and I only have control over /etc through etckeeper.

tmcdonell commented 5 years ago

It is possible to get CUDA from NVIDIA via the apt system, but you do have to add it as an external repository source first: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1604&target_type=debnetwork

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-ubuntu1604.pin
sudo mv cuda-ubuntu1604.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/ /"
sudo apt-get update
sudo apt-get -y install cuda

I'm not sure if this would interfere with your bumblebee setup in any way. I don't mind supporting your setup (the package directly from Ubuntu, not NVIDIA) if I can; we just need to figure out the correct paths where everything is located.

lestephane commented 5 years ago

I'll add the apt source you gave me, install cuda that way, and try the detection again. I'm more comfortable working with the apt package manager than bypassing it with an installer.

trevortknguyen commented 4 years ago

I think the problem is that the library only looks in lib64. Nixpkgs installs CUDA by downloading NVIDIA's .run file and installing it to a specific directory, and evidently there is no lib64 directory there.

Found CUDA toolkit at:
/nix/store/zpy06m2mgd47zzwy4cr7d75ajx3p3614-cudatoolkit-10.1.243
setup: Could not find path:
["/nix/store/zpy06m2mgd47zzwy4cr7d75ajx3p3614-cudatoolkit-10.1.243/lib64"]

cabal: Failed to build cuda-0.10.1.0 (which is required by
accelerate-llvm-ptx-1.2.0.1). See the build log above for details.
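
In both logs in this thread the configure step reports a single candidate path, <prefix>/lib64. One way to cope with layouts like these would be to probe several subdirectories of the toolkit prefix for a known library file. The sketch below is not what the cuda package's Setup.hs does; the function name first_libdir_with and the list of candidate subdirectories are assumptions made for illustration.

```shell
# Hypothetical helper: given a toolkit prefix and a library file name,
# print the first candidate subdirectory that contains that file.
# The candidate list (lib64, lib/x86_64-linux-gnu, lib) is an assumption.
first_libdir_with() {
    prefix=$1
    libname=$2
    for sub in lib64 lib/x86_64-linux-gnu lib; do
        if [ -e "$prefix/$sub/$libname" ]; then
            printf '%s\n' "$prefix/$sub"
            return 0
        fi
    done
    printf 'no %s under %s\n' "$libname" "$prefix" >&2
    return 1
}

# Example: look for the driver library under the prefix reported above.
first_libdir_with /usr libcuda.so || true
```

Checking for a specific file rather than just the directory avoids picking a directory such as /usr/lib that exists but does not hold the CUDA libraries.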
trevortknguyen commented 4 years ago

The issue is partially documented here: https://github.com/NixOS/nixpkgs/issues/6562

lestephane commented 4 years ago

I'm no longer working on fixing this issue on my side.