Open HuuDatDo opened 1 year ago
Hi @HuuDatDo,
Thanks for your interest in our library!
Could you tell us more about how CUDA is "hidden" from Docker on your setup? We maintain a reference Docker image at https://hub.docker.com/r/getkeops/keops-full, which is documented here: https://github.com/getkeops/keops/blob/main/Dockerfile
(I will update it soon to catch up with the latest version of PyTorch.)
Install instructions and a typical use case (= rendering the www.kernel-operations.io website) are described here: http://kernel-operations.io/keops/python/installation.html#using-docker-or-singularity.
Best regards, Jean
Hi @jeanfeydy
Thank you so much for your reply!
According to my lab manager, I'm not allowed to run anything related to CUDA because it would affect other users. The CUDA folder is not visible to any Docker container, so there is no cuda folder under '/usr/local/'. I can still run nvidia-smi, so CUDA should be installed somewhere. I will ask again whether I can create a new Docker container based on the documentation.
Best, Huu Dat
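For readers in a similar situation, here is a minimal sketch of how one might check which CUDA components are actually visible inside a container. It relies only on the Python standard library. The helper name `cuda_visibility_report` and the candidate directories are illustrative assumptions, not part of pykeops. A working nvidia-smi only shows that the driver is reachable, since nvidia-smi ships with the driver; the toolkit (headers, libnvrtc, libcudart) is installed separately and may well be absent.

```python
import ctypes.util
import os
import shutil


def cuda_visibility_report(candidate_dirs=("/usr/local/cuda", "/opt/cuda")):
    """Collect hints about which CUDA components are visible here.

    `candidate_dirs` lists common toolkit locations (an assumption; adjust
    it to your system).
    """
    return {
        # nvidia-smi comes with the driver, not with the CUDA toolkit.
        "nvidia_smi": shutil.which("nvidia-smi"),
        # libcuda is the driver library; find_library returns None if absent.
        "libcuda": ctypes.util.find_library("cuda"),
        # libnvrtc and libcudart belong to the toolkit.
        "libnvrtc": ctypes.util.find_library("nvrtc"),
        "libcudart": ctypes.util.find_library("cudart"),
        # Environment variables that commonly point at a toolkit install.
        "CUDA_PATH": os.environ.get("CUDA_PATH"),
        "CUDA_HOME": os.environ.get("CUDA_HOME"),
        # Candidate toolkit directories that exist in this environment.
        "toolkit_dirs": [d for d in candidate_dirs if os.path.isdir(d)],
    }
```

Printing the report with `for key, value in cuda_visibility_report().items(): print(key, value)` shows at a glance whether only the driver is exposed (nvidia_smi and libcuda set, everything else empty) or whether a toolkit is reachable as well.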
Hi @jeanfeydy
Thanks to your instructions, I could build a new container and define the CUDA path. However, when I run pykeops.test_numpy_bindings(), I encountered this error:
Traceback (most recent call last):
  File "test.py", line 10, in <module>
    pykeops.test_numpy_bindings()
  File "/opt/conda/envs/stsum/lib/python3.8/site-packages/pykeops/numpy/test_install.py", line 20, in test_numpy_bindings
    if np.allclose(my_conv(x, y).flatten(), expected_res):
  File "/opt/conda/envs/stsum/lib/python3.8/site-packages/pykeops/numpy/generic/generic_red.py", line 303, in __call__
    self.myconv = keops_binder["nvrtc" if tagCPUGPU else "cpp"](
  File "/opt/conda/envs/stsum/lib/python3.8/site-packages/keopscore/utils/Cache.py", line 68, in __call__
    obj = self.cls(*args)
  File "/opt/conda/envs/stsum/lib/python3.8/site-packages/pykeops/common/keops_io/LoadKeOps_nvrtc.py", line 15, in __init__
    super().__init__(*args, fast_init=fast_init)
  File "/opt/conda/envs/stsum/lib/python3.8/site-packages/pykeops/common/keops_io/LoadKeOps.py", line 31, in __init__
    self.init_phase2()
  File "/opt/conda/envs/stsum/lib/python3.8/site-packages/pykeops/common/keops_io/LoadKeOps_nvrtc.py", line 20, in init_phase2
    pykeops_nvrtc = importlib.import_module("pykeops_nvrtc")
  File "/opt/conda/envs/stsum/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 657, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 556, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1166, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: /root/.cache/keops2.1/build/pykeops_nvrtc.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNSt15__exception_ptr13exception_ptr10_M_releaseEv
Do you know of any way to fix this? Some similar earlier issues were fixed with pykeops version 1.4.2.
Best, Huu Dat
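A note for readers who hit the same ImportError: the mangled name demangles to `std::__exception_ptr::exception_ptr::_M_release()`, which is a libstdc++ symbol. One plausible cause (an assumption, not confirmed in this thread) is that the extension was compiled by a newer system g++ while Python loads an older libstdc++ from the conda environment. The sketch below is a crude check under that assumption: it searches the raw bytes of each libstdc++ file for the symbol name. The function names and the directories in the usage note are illustrative, not part of pykeops.

```python
import glob
import os

# Symbol name copied verbatim from the ImportError message.
MISSING_SYMBOL = b"_ZNSt15__exception_ptr13exception_ptr10_M_releaseEv"


def library_mentions_symbol(path, symbol=MISSING_SYMBOL):
    """Return True if the raw bytes of the file at `path` contain `symbol`.

    This is a byte search, not a real ELF symbol-table lookup, so treat
    the answer as a hint only.
    """
    with open(path, "rb") as f:
        return symbol in f.read()


def scan_libstdcxx(dirs, symbol=MISSING_SYMBOL):
    """Map each libstdc++ file found in `dirs` to whether it mentions `symbol`."""
    results = {}
    for d in dirs:
        for path in sorted(glob.glob(os.path.join(d, "libstdc++.so*"))):
            if path.endswith(".py"):
                continue  # skip gdb helper scripts that sit next to the library
            results[os.path.realpath(path)] = library_mentions_symbol(path, symbol)
    return results
```

For example, `scan_libstdcxx([os.path.join(sys.prefix, "lib"), "/usr/lib/x86_64-linux-gnu"])` (after `import sys`) compares the conda copy with a typical system location. If the system copy mentions the symbol and the conda copy does not, the version mismatch described above becomes a likely explanation.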
I'm trying to run pykeops in a Docker container connected to a GPU server that hides the CUDA folder. I tried installing the conda environment from #85 as well as the .conda folder, but neither worked. Are there any other ways to run pykeops on this kind of system?