NVIDIA / cuda-python

CUDA Python Low-level Bindings
https://nvidia.github.io/cuda-python/

`cuda.cudart.getLocalRuntimeVersion()` raises `RuntimeError: Failed to dlopen libcudart.so.12` #89

Open Matt711 opened 1 month ago

Matt711 commented 1 month ago

Is this a bug? getLocalRuntimeVersion() fails for me in a CUDA 11.8 environment. I'm asking because I see that the API call is listed in the cuda-python 11.8 release notes.

In the source code, we're hard-coding libcudart.so.12. Is that right?

Repro

In [1]: from cuda import cudart

In [2]: cudart.getLocalRuntimeVersion()
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[2], line 1
----> 1 cudart.getLocalRuntimeVersion()

File ~/.conda/envs/rapids/lib/python3.11/site-packages/cuda/cudart.pyx:24961, in cuda.cudart.getLocalRuntimeVersion()

File ~/.conda/envs/rapids/lib/python3.11/site-packages/cuda/ccudart.pyx:2365, in cuda.ccudart.getLocalRuntimeVersion()

File ~/.conda/envs/rapids/lib/python3.11/site-packages/cuda/_lib/ccudart/ccudart.pyx:2121, in cuda._lib.ccudart.ccudart._getLocalRuntimeVersion()

RuntimeError: Failed to dlopen libcudart.so.12

Matt711 commented 1 month ago

xref rmm/1675

leofang commented 1 month ago

It seems to be a backport mistake that we should fix: https://github.com/NVIDIA/cuda-python/blob/64cc9ae081ba405972b0472cd1fd35b919c455fc/cuda/_lib/ccudart/ccudart.pyx.in#L2451-L2457 @Matt711 how urgent is this?

Matt711 commented 1 month ago

> @Matt711 how urgent is this?

Not urgent. We already have a workaround using numba.cuda. I also don't mind working on this @leofang, if you could point me in the right direction.
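
For reference, a workaround along these lines might look like the sketch below. It assumes Numba is installed with CUDA support and relies on `numba.cuda.runtime.get_version()`; the helper name `runtime_version_via_numba` is made up for illustration, and the code actually used downstream may differ.

```python
def runtime_version_via_numba():
    """Return the CUDA runtime version as (major, minor), or None if unavailable."""
    try:
        from numba import cuda

        # Numba loads libcudart itself and reports the version as (major, minor).
        return tuple(cuda.runtime.get_version())
    except Exception:
        # Numba is not installed, or it could not find a usable CUDA runtime.
        return None


print(runtime_version_via_numba())
```

Returning `None` instead of raising keeps the caller free to fall back to another detection method.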

leofang commented 1 month ago

Thanks, @Matt711. The offending code that I linked to above is from the 11.8.x branch, so ideally we can just change the lines referencing libcudart.so.12 to reference .11 instead. But we're transitioning to a new development/release process, so let me check with @vzhurba01 later today first and then get back to you.
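
The idea behind the fix, loading the runtime library that matches the CUDA major version rather than a fixed `libcudart.so.12`, can be sketched in plain Python with `ctypes`. This is only an illustration, not the Cython code in `ccudart.pyx.in`: the helper names are hypothetical, and the soname table (`libcudart.so.11.0` for CUDA 11, `libcudart.so.12` for CUDA 12) is an assumption about a typical Linux install.

```python
import ctypes

# Assumed Linux sonames per CUDA major version (illustrative, not taken from cuda-python).
CUDART_SONAMES = {11: "libcudart.so.11.0", 12: "libcudart.so.12"}


def decode_cuda_version(value):
    """Split CUDA's integer encoding (1000 * major + 10 * minor) into (major, minor)."""
    return value // 1000, (value % 1000) // 10


def local_runtime_version(major):
    """Return (major, minor) from the libcudart matching `major`, or None if it cannot be loaded."""
    soname = CUDART_SONAMES[major]
    try:
        lib = ctypes.CDLL(soname)
    except OSError:
        # The library for this major version is not on the loader path.
        return None
    version = ctypes.c_int(0)
    # cudaRuntimeGetVersion(int*) returns 0 (cudaSuccess) on success.
    status = lib.cudaRuntimeGetVersion(ctypes.byref(version))
    if status != 0:
        return None
    return decode_cuda_version(version.value)


print(decode_cuda_version(11080))  # (11, 8)
print(local_runtime_version(11))
```

In a CUDA 11.8 environment, `local_runtime_version(12)` would be expected to return `None`, which mirrors the `Failed to dlopen libcudart.so.12` error reported in this issue.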

leofang commented 1 month ago

@Matt711 we discussed this and will try to get a new 11.8.x release out next week, with this bug fixed and perhaps also #75 backported.

Matt711 commented 1 month ago

Thanks @leofang

wence- commented 4 weeks ago

@leofang, @vzhurba01 did this backport/release occur?

vzhurba01 commented 4 weeks ago

Not yet. The wheels and conda packages are currently going through pre-release validation. I'll update this issue once posting is complete.

vzhurba01 commented 3 weeks ago

FYI I've updated the repo with the fix under the patch release 11.8.4 (tag v11.8.4).

I created a new issue, #139, to track the wheels/conda uploads for this patch release. I'm thinking of keeping this issue open until they are uploaded, and then giving notice before closing.