Rust-GPU / Rust-CUDA

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.
Apache License 2.0
3.16k stars 120 forks

Can't install on WSL #91

Open VictorTaelin opened 2 years ago

VictorTaelin commented 2 years ago

I've spent the last several hours trying to install Rust-CUDA on Windows 11 WSL (Ubuntu 20.04) on a Razer Blade 14 notebook, without success. The install instructions seem to be outdated and/or unclear. I've installed everything following the GUIDE, which was particularly hard because: 1. I'm on WSL; 2. it assumes I know what LLVM_CONFIG is, which I don't; 3. it assumes I know what libnvvm is, which I don't. Regardless, after a lot of struggle, I've managed to install NVIDIA's drivers and CUDA (on the Windows side) and the CUDA Toolkit (on WSL), following this guide. Right now, when I run cargo run --bin add on Rust-CUDA, I get the following error:

v@MaiaRazerBlade14:~/Rust-CUDA$ cargo run --bin add
   Compiling nvvm v0.1.1 (/home/v/Rust-CUDA/crates/nvvm)
   Compiling cust_raw v0.11.3 (/home/v/Rust-CUDA/crates/cust_raw)
   Compiling curl-sys v0.4.56+curl-7.83.1
   Compiling curl v0.4.44
   Compiling xz2 v0.1.7
error: failed to run custom build command for `nvvm v0.1.1 (/home/v/Rust-CUDA/crates/nvvm)`

Caused by:
  process didn't exit successfully: `/home/v/Rust-CUDA/target/debug/build/nvvm-f71aa668ff49a3a5/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at 'Failed to find CUDA ROOT, make sure the CUDA SDK is installed and CUDA_PATH or CUDA_ROOT are set!', crates/find_cuda_helper/src/lib.rs:198:10
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...
error: build failed

After inspecting the code of find_cuda_helper, I've learned it is looking for the following file: /usr/local/cuda/lib/cuda.h. That file is not present. Instead, the contents of /usr/local/cuda-11.8 are:

v@MaiaRazerBlade14:/usr/local/cuda$ ls
cuda-keyring_1.0-1_all.deb  doc  gds  nsight-systems-2022.4.2  targets  version.json

There is an include directory in /usr/local/cuda/targets/x86_64-linux, but it does not contain a cuda.h file. Instead, cuda.h can be found at /usr/include/cuda.h, but that is just the isolated file, not part of a cuda directory. I also have no idea what libnvvm is, but the following file is present: ./usr/lib/x86_64-linux-gnu/libnvvm.so. Perhaps the way I installed the toolkit results in a different layout from what Rust-CUDA expects? I've managed to get the demo working on WSL via Docker, which, in theory, implies I should be able to install it directly, but I'm not sure how to proceed.
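One way to see where a particular install actually put the files mentioned above is to search the usual prefixes for them. The sketch below is only a diagnostic aid and is not part of Rust-CUDA; the helper name `locate_cuda_file` and the list of search roots are assumptions.

```shell
# Hypothetical helper: print every file matching NAME found under the given roots.
# Usage: locate_cuda_file NAME ROOT...
locate_cuda_file() {
    name=$1
    shift
    for root in "$@"; do
        # Skip empty arguments (unset variables) and directories that do not exist.
        [ -d "$root" ] || continue
        # -L follows symlinks, which matters when the root itself is a symlink.
        find -L "$root" -maxdepth 4 -name "$name" 2>/dev/null
    done
    return 0
}

# Assumed search roots; adjust them to your system.
locate_cuda_file cuda.h "${CUDA_PATH:-}" "${CUDA_ROOT:-}" /usr/local/cuda /usr/include
locate_cuda_file 'libnvvm.so*' "${CUDA_PATH:-}" "${CUDA_ROOT:-}" /usr/local/cuda /usr/lib
```

If the header and library turn up under different prefixes, as described above, the toolkit was probably installed from split distribution packages rather than as a single tree.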

VictorTaelin commented 2 years ago

I finally managed to make it run. Sadly I'm too tired to fully document the process, but here is what I did:

  1. Installed Ubuntu 18.04 with wsl --install

  2. Followed the CUDA on WSL guide:

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-wsl-ubuntu-11-8-local_11.8.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-11-8-local_11.8.0-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda

For some reason, now the contents of /usr/local/cuda are what I'd expected:

v@MaiaRazerBlade14:~/Rust-CUDA$ cd /usr/local/cuda
v@MaiaRazerBlade14:/usr/local/cuda$ ls
DOCS  EULA.txt  README  bin  compute-sanitizer  doc  extras  gds  include  lib64  libnvvp  nsightee_plugins  nvml  nvvm  share  src  targets  tools  version.json
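
The listing above shows a complete toolkit tree. A quick sanity check for other installs might look like the sketch below. It assumes that the `include`, `lib64` and `nvvm` entries from that listing are the ones that matter, which the thread does not confirm, and `check_cuda_root` is a made-up name.

```shell
# Hypothetical helper: report whether ROOT has the subdirectories seen in the listing.
check_cuda_root() {
    root=$1
    status=0
    for sub in include lib64 nvvm; do
        if [ -d "$root/$sub" ]; then
            echo "ok:      $root/$sub"
        else
            echo "missing: $root/$sub"
            status=1
        fi
    done
    return "$status"
}

# Falls back to /usr/local/cuda when CUDA_ROOT is not set.
check_cuda_root "${CUDA_ROOT:-/usr/local/cuda}" || echo "toolkit layout looks incomplete"
```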

  3. Installed Rust with the command here, but changed the default options to install the nightly version instead.

  4. Followed the Dockerfile commands to install LLVM, slightly modified:

curl -O https://releases.llvm.org/7.0.1/clang+llvm-7.0.1-x86_64-linux-gnu-ubuntu-18.04.tar.xz &&\
    xz -d clang+llvm-7.0.1-x86_64-linux-gnu-ubuntu-18.04.tar.xz &&\
    tar xf clang+llvm-7.0.1-x86_64-linux-gnu-ubuntu-18.04.tar &&\
    rm clang+llvm-7.0.1-x86_64-linux-gnu-ubuntu-18.04.tar &&\
    mv clang+llvm-7.0.1-x86_64-linux-gnu-ubuntu-18.04 ~/llvm

  5. Edited my ~/.bashrc to export the env variables from the Dockerfile:

# The Dockerfile's paths point at /root; $HOME matches the ~/llvm location used in the previous step.
export LLVM_CONFIG=$HOME/llvm/bin/llvm-config
export CUDA_ROOT=/usr/local/cuda
export CUDA_PATH=$CUDA_ROOT
export LLVM_LINK_STATIC=1
export RUST_LOG=info
export PATH=$CUDA_ROOT/nvvm/lib64:$HOME/.cargo/bin:$PATH
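
For anyone unsure what `LLVM_CONFIG` should contain: in the exports above it is the full path to the `llvm-config` executable inside the unpacked LLVM archive, not a directory. A hedged way to confirm the variable points at something usable is sketched below. It assumes the 7.0.1 archive from the previous step, so a version starting with `7.` is expected, and `check_llvm_config` is an illustrative name.

```shell
# Hypothetical helper: check that CFG is an executable llvm-config reporting version 7.x.
check_llvm_config() {
    cfg=$1
    if [ ! -f "$cfg" ] || [ ! -x "$cfg" ]; then
        echo "not an executable file: $cfg"
        return 1
    fi
    # llvm-config --version prints the LLVM version string, e.g. 7.0.1.
    version=$("$cfg" --version)
    case $version in
        7.*) echo "found LLVM $version" ;;
        *)   echo "found LLVM $version, expected 7.x"; return 1 ;;
    esac
}

# Falls back to the ~/llvm location used when unpacking the archive.
check_llvm_config "${LLVM_CONFIG:-$HOME/llvm/bin/llvm-config}" || true
```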

  6. Attempted to do this, also from the Dockerfile, but it didn't fully work. No idea what this is:

    echo $CUDA_ROOT/lib64 >> /etc/ld.so.conf &&\
    echo $CUDA_ROOT/compat >> /etc/ld.so.conf &&\
    echo $CUDA_ROOT/nvvm/lib64 >> /etc/ld.so.conf &&\
    ldconfig

I had to change it to run with sudo, in a way that I do not remember.
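
Those Dockerfile lines add directories to /etc/ld.so.conf so the dynamic linker can find the CUDA shared libraries at run time. They fail for a normal user because the `>>` redirection is performed by the unprivileged shell, even when `echo` itself is prefixed with `sudo`. The exact change made here was not recorded. One common workaround, shown only as a sketch, pipes the text through `sudo tee -a`; the helper name `append_line` and its optional third argument are illustrative.

```shell
# Hypothetical helper: append LINE to FILE unless it is already present.
# Pass "sudo" as the third argument when FILE is owned by root.
append_line() {
    line=$1
    file=$2
    runner=${3:-}
    if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
        return 0
    fi
    # tee performs the write, so the privilege of $runner applies to the file.
    printf '%s\n' "$line" | $runner tee -a "$file" >/dev/null
}

# Uncomment on a real system (this modifies /etc/ld.so.conf):
# for dir in lib64 compat nvvm/lib64; do
#     append_line "$CUDA_ROOT/$dir" /etc/ld.so.conf sudo
# done
# sudo ldconfig
```

Skipping lines that already exist keeps the file clean if the step is run more than once.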

  7. Attempted cargo run --bin add again, but it complained about OpenSSL. Googled the command to install it. It then complained about lz, something else I do not remember, and xml. Googled the command to install these. sudo apt-get something.

  8. Attempted cargo run --bin add again and it worked.
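
The package names for the missing libraries were not recorded. On Ubuntu, errors about OpenSSL, `-lz` (zlib) and XML are usually resolved by the development packages `libssl-dev`, `zlib1g-dev` and `libxml2-dev`. That mapping is an assumption, and the forgotten third dependency is left out. The sketch below prints an install command instead of running it; `suggest_install` is a made-up name.

```shell
# Hypothetical helper: print an apt-get command for the packages dpkg does not know about.
suggest_install() {
    missing=""
    for pkg in "$@"; do
        # dpkg -s exits non-zero when the package is not installed.
        dpkg -s "$pkg" >/dev/null 2>&1 || missing="$missing $pkg"
    done
    if [ -n "$missing" ]; then
        echo "sudo apt-get install -y$missing"
    else
        echo "nothing to install"
    fi
}

# Assumed package names for the OpenSSL, zlib and XML errors.
suggest_install libssl-dev zlib1g-dev libxml2-dev
```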

I'll leave this thread open in case anyone else finds themselves in a similar situation, but we definitely need more ELI5-ish install instructions. Life as a programmer is hard. It is 4:18am now. Good night.

locadani commented 1 year ago

Hello, I am having a similar problem. So far I did everything on Windows 11, without WSL, and cargo build works on the GPU crate. The problem comes when running cargo build on the CPU code. Specifically, I have llvm-config.exe and the whole lib, but when building, Rust doesn't find it even if I add the path to the env variables. Can you please tell me what is inside the folders you set using export ...? In particular I need LLVM_CONFIG now. Thank you very much.