Closed jeanfeydy closed 1 year ago
Hi @jeanfeydy ,
My guess is that your configuration has a proper CUDA installation but no GPU. I'm not sure we have tested this case.
Hi @bcharlier ,
Thanks for your fast answer! This is indeed what is happening, both on my machine and on Colab. Basically,
from ctypes.util import find_library
find_library("cuda")
works fine and returns "libcuda.so.1", because the CUDA files are present.
But:
from ctypes import CDLL
CDLL(find_library("cuda"))
fails with:
OSError Traceback (most recent call last)
<ipython-input-2-2af85f5067e4> in <cell line: 2>()
1 from ctypes import CDLL
----> 2 CDLL(find_library("cuda"))
/usr/lib/python3.9/ctypes/__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error, winmode)
372
373 if handle is None:
--> 374 self._handle = _dlopen(self._name, mode)
375 else:
376 self._handle = handle
OSError: libcuda.so.1: cannot open shared object file: No such file or directory
I'm very surprised that I have not encountered the problem before... Probably, this is because no one ever uses the KeOps Docker image (which includes a full CUDA environment) on a GPU-less machine.
There are several ways to fix this in the imports, but I don't know which one you prefer. Do you want to fix it yourself, or should I push something?
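For reference, one possible fix could look like the sketch below. This is only an illustration, not the actual KeOps code: the helper name try_load_cuda and the flag cuda_available are hypothetical. The idea is to treat a library that find_library can locate but CDLL cannot open as "no CUDA available", instead of letting the OSError propagate at import time.

```python
from ctypes import CDLL
from ctypes.util import find_library


def try_load_cuda():
    """Return a handle to the CUDA driver library, or None if it cannot be loaded.

    Hypothetical helper for illustration; not part of the KeOps API.
    """
    name = find_library("cuda")
    if name is None:
        # No CUDA driver library is registered on this system.
        return None
    try:
        return CDLL(name)
    except OSError:
        # The library name was found but the file cannot be opened,
        # which is the situation described above on GPU-less machines.
        return None


# Hypothetical flag that the rest of the import logic could branch on.
cuda_available = try_load_cuda() is not None
```

With something like this, the import would fall back to CPU mode whenever cuda_available is False.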
See you soon, Jean
Hello @jeanfeydy, @bcharlier,
It should be OK now: I have made the correction and merged it into main. At least on Colab it works. Could you check on your other system?
It seems that the CUDA libraries could be detected via the function find_library from ctypes.util, but then could not be loaded because they were not on the system path.
I think I have checked CPU versions of pykeops on Colab for several releases, but maybe not in the past few months... Maybe there was a change in the way Google Colab sets up the paths for the different hardware configurations.
Thanks a lot @joanglaunes , this works great! See you soon, Jean
Hi @joanglaunes , @bcharlier ,
I hope that you are doing well! Suddenly, my KeOps install and Docker container have stopped working on machines that do not have a GPU.
For instance, on Google Colab, if you make sure to use an instance without GPU acceleration, running:
Will fail with:
I assume that fixing the issue shouldn't be too difficult, for example by putting a try-except structure in gpu_utils.py. However, since this part of the code is >12 months old, I don't understand why this error is only showing up today. I'm very confused: do you have any insight?
Best regards, Jean