Open kayween opened 1 year ago
Hi @kayween ,
Thanks for your interest in this library!
As detailed in this error message, it is likely that setting the environment variable CUDA_PATH
to the location of the folder that contains your CUDA installation will solve your issue.
Indeed, KeOps will look for the CUDA header files at locations $CUDA_PATH/include/cuda.h and $CUDA_PATH/include/nvrtc.h.
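A quick way to check this before importing pykeops is sketched below. It only tests that the two header files named above exist under $CUDA_PATH/include; the helper name missing_cuda_headers is made up for illustration and is not part of KeOps.

```python
import os
from pathlib import Path

# Header files that, per the comment above, KeOps expects under $CUDA_PATH/include.
REQUIRED_HEADERS = ("cuda.h", "nvrtc.h")


def missing_cuda_headers(cuda_path):
    """Return the required headers that are NOT found under cuda_path/include."""
    include_dir = Path(cuda_path) / "include"
    return [name for name in REQUIRED_HEADERS if not (include_dir / name).is_file()]


if __name__ == "__main__":
    cuda_path = os.environ.get("CUDA_PATH")
    if cuda_path is None:
        print("CUDA_PATH is not set")
    else:
        missing = missing_cuda_headers(cuda_path)
        print("missing headers:", missing or "none")
```

An empty list means both headers were found. It says nothing about whether the linker can find the libraries.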
For reference, our main Docker image is based on Ubuntu and documents how to install CUDA from the official nvidia channel, etc. You may find it helpful.
What do you think?
P.S.: I do not know why the full error description appears on some configurations and not on others (such as yours). This is certainly something that we should fix.
Hi @jeanfeydy ,
My CUDA is installed in /usr/local/cuda:
~$ ls /usr/local/cuda/include/cuda.h
/usr/local/cuda/include/cuda.h
~$ ls /usr/local/cuda/include/nvrtc.h
/usr/local/cuda/include/nvrtc.h
The cuda path is configured correctly as '/usr/local/cuda', but keops still has trouble finding NVRTC.
Python 3.9.17 (main, Jul 5 2023, 20:41:20)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.environ['CUDA_PATH']
'/usr/local/cuda'
>>> import pykeops
[KeOps] Compiling cuda jit compiler engine ... /usr/bin/ld: cannot find -lnvrtc
collect2: error: ld returned 1 exit status
Hi @kayween ,
I see. Since the linker (ld) seems to be the issue, could you try to also add your CUDA folder to the LD_LIBRARY_PATH environment variable? Assuming that
ls /usr/local/cuda/lib | grep nvrtc
returns a non-empty output that contains something like libnvrtc.so, the following command should work:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib
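Depending on the installation, the libraries may sit in lib64 rather than lib, so it can be worth checking both. A minimal sketch, assuming the CUDA root /usr/local/cuda mentioned in this thread; find_nvrtc_libs is a hypothetical helper name, not a KeOps function.

```python
from pathlib import Path


def find_nvrtc_libs(cuda_root, subdirs=("lib", "lib64")):
    """List libnvrtc shared objects found in the given subfolders of cuda_root."""
    found = []
    for sub in subdirs:
        folder = Path(cuda_root) / sub
        if folder.is_dir():
            # Matches libnvrtc.so and versioned names such as libnvrtc.so.11.2,
            # but not libnvrtc-builtins.so or libnvrtc_static.a.
            found.extend(sorted(folder.glob("libnvrtc.so*")))
    return found


if __name__ == "__main__":
    # Assumed install location; adjust to your own CUDA folder.
    for path in find_nvrtc_libs("/usr/local/cuda"):
        print(path)
```

The folder that actually contains libnvrtc.so is the one to add to the path.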
Hi @jeanfeydy ,
Yeah I have tried that.
The libnvrtc.so file is in the folder /usr/local/cuda/lib64:
$ ls /usr/local/cuda/lib64/ | grep nvrtc
libnvrtc-builtins.so
libnvrtc-builtins.so.11.6
libnvrtc-builtins.so.11.6.112
libnvrtc-builtins_static.a
libnvrtc.so
libnvrtc.so.11.2
libnvrtc.so.11.6.112
libnvrtc_static.a
The path has already been added
$ echo $LD_LIBRARY_PATH
/usr/local/cuda/lib64:/usr/local/cuda:/usr/local/cuda/lib64:/usr/local/cuda:/usr/local/cuda/lib64:/usr/local/cuda:
But the linker still cannot find the .so file.
The weird thing is that I can compile and run the example from the NVRTC documentation. So I guess the CUDA path has been configured correctly, but somehow keops cannot find it.
Hi @jeanfeydy ,
The issue seems to be the python version.
I realized that the dockerfile uses python 3.8, which seems to be crucial.
I have been using python 3.9. Downgrading to python 3.8 resolves the issue.
Hi @kayween
In a conda env, installing only cudatoolkit does not provide cuda.h, nvrtc.h, etc. One option is to additionally install cudatoolkit-dev from the conda-forge channel, and then run export CUDA_PATH=/home/kaiwen/anaconda3/envs/altproj in your conda env.
After installing cudatoolkit-dev, you will notice that the required .h files are present under /home/kaiwen/anaconda3/envs/altproj/include
Hope this helps!
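For a conda env, the same idea can be sketched in Python. This assumes the env is activated (conda then sets CONDA_PREFIX) and that setting CUDA_PATH in the process environment before importing pykeops is enough, which I have not verified. The helper name cuda_path_from_conda is hypothetical.

```python
import os
from pathlib import Path


def cuda_path_from_conda(environ):
    """Return the conda env prefix if it holds cuda.h and nvrtc.h, else None."""
    prefix = environ.get("CONDA_PREFIX")
    if not prefix:
        return None
    include_dir = Path(prefix) / "include"
    if all((include_dir / h).is_file() for h in ("cuda.h", "nvrtc.h")):
        return prefix
    return None


if __name__ == "__main__":
    candidate = cuda_path_from_conda(os.environ)
    if candidate is not None:
        # Only sets CUDA_PATH for this process, and only if it is not set already.
        os.environ.setdefault("CUDA_PATH", candidate)
    print("CUDA_PATH =", os.environ.get("CUDA_PATH"))
```

If the function returns None, the headers are not in the env's include folder, so pointing CUDA_PATH at the env would not help.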
To add, for anyone still having this issue: I installed cuda-toolkit from https://anaconda.org/nvidia/cuda-toolkit, so I knew that "cuda.h" and "nvrtc.h" were present in my mamba environment's include folder at "/home/zach/mambaforge/envs/myproject/include", but using
mamba env config vars set CUDA_PATH="/home/zach/mambaforge/envs/myproject/"
did not set the path correctly (for an unknown reason). I had to invoke it using conda:
conda env config vars set CUDA_PATH="/home/zach/mambaforge/envs/myproject/"
I hope this helps anyone that runs into this minor issue. Make sure you reactivate the env. As a side note, the reason for calling conda / mamba for the environment variable is to keep all of my changes (even setting environment variables) isolated from the OS.
I had the same issue (“cannot find -lcuda, -lnvrtc”). Adding path variables didn’t help, because it turns out I have two CUDA installation directories: “cuda” and “cuda-12.1”. The shared object files were called “libcuda.so.1” (under /lib/x86_64-linux-gnu) and “libnvrtc.so.12” (under python-venv/…/nvidia/cuda_nvrtc/lib) instead of having just the “.so” suffix.
I added symbolic links with the plain “.so” suffix in the corresponding directories and it worked well.
I found the actual paths by tracing the build command generated by the pykeops import.
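The symlink step can be sketched roughly as follows. This is an illustration under assumptions, not the exact commands used above: link_unversioned is a hypothetical helper, and the folders you pass in are your own. It looks for a versioned file such as libnvrtc.so.12 and creates a plain libnvrtc.so link to it.

```python
from pathlib import Path


def link_unversioned(lib_dir, link_dir, stem="libnvrtc.so"):
    """Create link_dir/<stem> pointing at a versioned <stem>.* file found in lib_dir.

    Returns the path of the link, or None if no versioned file was found.
    """
    candidates = sorted(Path(lib_dir).glob(stem + ".*"))
    if not candidates:
        return None
    link_dir = Path(link_dir)
    link_dir.mkdir(parents=True, exist_ok=True)
    link = link_dir / stem
    # Leave any existing file or link (even a dangling one) untouched.
    if not (link.exists() or link.is_symlink()):
        # Point at the last name in sorted order, e.g. libnvrtc.so.12.
        link.symlink_to(candidates[-1].resolve())
    return link
```

If link_dir differs from lib_dir, that folder still has to be one the linker searches. Writing into system folders would need the appropriate permissions.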
I went down a terrible rabbit hole because of this, but I think I found a generally-applicable solution, thanks in part to this loosely-related Stack Overflow post.
First you have to find out what happens when ld tries to use -lnvrtc, so you run:
ld -lnvrtc --verbose
which gave me:
attempt to open /usr/local/lib/x86_64-linux-gnu/libnvrtc.so failed
attempt to open /usr/local/lib/x86_64-linux-gnu/libnvrtc.a failed
...
attempt to open /usr/lib/libnvrtc.so failed
attempt to open /usr/lib/libnvrtc.a failed
Which is a list of places ld looked for libnvrtc.
Now, you can use the where command (or whatever command you prefer) to find where libnvrtc.so happens to be on your computer, and then create a symbolic link at one of the locations where ld is looking, pointing to the location where you found the library:
sudo ln -s <folder-where-you-found-it>/libnvrtc.so /usr/lib/libnvrtc.so
After that, ld -lnvrtc --verbose gives the same list of places where it couldn't find libnvrtc, but then in the middle I find two lines that say:
found libc.so.6 at /lib/x86_64-linux-gnu/libc.so.6
ld-linux-x86-64.so.2 needed by /lib/libnvrtc.so
I just wonder if there's any way ld's -lnvrtc option could also look at the folder I needed, as perhaps that's a broader issue for people using WSL2.
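The "attempt to open" lines printed by ld --verbose can be turned into a short list of searched folders, which makes it easier to compare against the folder that actually holds libnvrtc.so. A minimal sketch; searched_dirs is a hypothetical helper name and the sample text is copied from the output quoted above.

```python
import re
from pathlib import Path

# Lines look like: "attempt to open /usr/lib/libnvrtc.so failed"
ATTEMPT_RE = re.compile(r"^attempt to open (\S+) (failed|succeeded)$")


def searched_dirs(ld_verbose_output):
    """Return the folders ld tried, in order and without duplicates."""
    dirs = []
    for line in ld_verbose_output.splitlines():
        match = ATTEMPT_RE.match(line.strip())
        if match:
            folder = str(Path(match.group(1)).parent)
            if folder not in dirs:
                dirs.append(folder)
    return dirs


sample = """\
attempt to open /usr/local/lib/x86_64-linux-gnu/libnvrtc.so failed
attempt to open /usr/local/lib/x86_64-linux-gnu/libnvrtc.a failed
attempt to open /usr/lib/libnvrtc.so failed
"""
print(searched_dirs(sample))  # ['/usr/local/lib/x86_64-linux-gnu', '/usr/lib']
```

As for making the linker look elsewhere without symlinks: GCC documents a LIBRARY_PATH environment variable for extra link-time search folders. Whether the build command generated by KeOps picks it up is an assumption I have not tested.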
Hi,
I am trying to install the latest keops but got an error. Specifically, pykeops cannot find nvrtc during compilation. Any advice on resolving this issue?
Ubuntu 20.04.5 LTS
Cuda 11.6
Pykeops 2.1