getkeops / keops

KErnel OPerationS, on CPUs and GPUs, with autodiff and without memory overflows
https://www.kernel-operations.io
MIT License

Problem finding libraries with ctypes (python_engine branch) #202

Closed adam-coogan closed 1 year ago

adam-coogan commented 2 years ago

I have been testing the python_engine branch (see https://github.com/getkeops/keops/issues/58#issuecomment-975991025). I encountered a bug installing it on one of the clusters I work on: ctypes is not able to find the library libnvrtc.so.11.2. In more detail, after I install with

 pip install git+https://github.com/getkeops/keops.git@python_engine

importing keops gives the following error message:

>>> import keops
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/amcoogan/.virtualenvs/lensing-3.9.6/lib/python3.9/site-packages/keops/__init__.py", line 3, in <module>
    from keops.config.config import get_build_folder, set_build_folder
  File "/home/amcoogan/.virtualenvs/lensing-3.9.6/lib/python3.9/site-packages/keops/config/config.py", line 76, in <module>
    from keops.utils.gpu_utils import get_gpu_props
  File "/home/amcoogan/.virtualenvs/lensing-3.9.6/lib/python3.9/site-packages/keops/utils/gpu_utils.py", line 14, in <module>
    libnvrtc_folder = os.path.dirname(find_library_abspath("nvrtc"))
  File "/home/amcoogan/.virtualenvs/lensing-3.9.6/lib/python3.9/site-packages/keops/utils/misc_utils.py", line 45, in find_library_abspath
    lib = CDLL(res)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.9.6/lib/python3.9/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libnvrtc.so.11.2: cannot open shared object file: No such file or directory

One of the sysadmins on the cluster I use suggested the following change: use the environment variable CUDA_HOME to build the absolute path to the library. In more detail, loading the library by its absolute path works:

>>> from ctypes import CDLL
>>> import os
>>> CDLL(os.path.join(os.environ['CUDA_HOME'],'lib', 'libnvrtc.so'))
<CDLL '/some/path/here/libnvrtc.so', handle 253ca60 at 0x2ac7f9f02460>

In contrast, loading the name returned by find_library (which keops uses in misc_utils.py) gives an error:

>>> from ctypes import CDLL
>>> import os
>>> CDLL(os.path.join(os.environ['CUDA_HOME'],'lib', 'libnvrtc.so'))
<CDLL '/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/cudacore/11.4.2/lib/libnvrtc.so', handle 1d2a9d0 at 0x2ba3a0b32d00>
>>> from ctypes.util import find_library
>>>
(lensing-3.9.6) amcoogan@beluga5:~$ python
Python 3.9.6 (default, Jul 12 2021, 18:24:27)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from ctypes import CDLL
>>> from ctypes.util import find_library
>>> find_library("nvrtc")
'libnvrtc.so.11.2'
>>> CDLL(find_library('nvrtc'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.9.6/lib/python3.9/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libnvrtc.so.11.2: cannot open shared object file: No such file or directory
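
As a rough illustration of the suggested change, here is a minimal sketch of a lookup that tries CUDA_HOME first and only falls back to find_library if nothing is found there. The helper names find_in_cuda_home and load_library are hypothetical and are not part of keops. Checking a lib64 subfolder in addition to lib is also an assumption; the session above only uses lib.

```python
import os
from ctypes import CDLL
from ctypes.util import find_library


def find_in_cuda_home(name, env=None):
    """Return the absolute path of lib<name>.so under CUDA_HOME, or None.

    Hypothetical helper: `env` defaults to os.environ and is only a
    parameter so the lookup can be exercised without a CUDA install.
    """
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        return None
    # "lib" is the folder used in the session above; "lib64" is an assumption.
    for subfolder in ("lib", "lib64"):
        path = os.path.join(cuda_home, subfolder, f"lib{name}.so")
        if os.path.isfile(path):
            return path
    return None


def load_library(name, env=None):
    """Load lib<name>.so, preferring an absolute path built from CUDA_HOME."""
    path = find_in_cuda_home(name, env)
    if path is None:
        # Fallback: find_library may return a bare file name such as
        # 'libnvrtc.so.11.2', which the loader then has to resolve itself.
        path = find_library(name)
    if path is None:
        raise OSError(f"lib{name}.so not found via CUDA_HOME or find_library")
    return CDLL(path)
```

With this sketch, load_library("nvrtc") would take the CUDA_HOME route on a cluster like the one above, and the bare name from find_library would only be handed to CDLL when CUDA_HOME is unset or does not contain the library.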

So it would be great if you could make this change. Thanks!

joanglaunes commented 1 year ago

Closing as no longer relevant.