Open jeanfeydy opened 2 years ago
Hi @ismedina,
Thanks again for your report. Here are some hypotheses about your problem:
KeOps keeps its compiled binaries in a cache folder (~/.cache/pykeops2.1/...). What may happen is that your cache folder currently contains binaries that were compiled for the GTX980/1080 GPUs but are not suited to the RTX500 and the V100, which confuses the system. I don't know if we currently handle heterogeneous configurations as cleanly as we should (@bcharlier, @joanglaunes?). To check whether this is indeed the root cause of your issue, you may log on to a node with the RTX500 or V100 GPU and run:
import pykeops
# Clear ~/.cache/pykeops2.1/...
pykeops.clean_pykeops()
# Rebuild from scratch the required binaries
pykeops.test_torch_bindings()
Depending on your answers, we will try to investigate further :-) Best regards, Jean
Hi Jean,
it was the first thing :) cleaning the compiled binaries when changing GPU solved the issue. Thanks a lot!
Best, Ismael
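For anyone hitting the same problem on a cluster with mixed GPUs, the workaround confirmed above (clean the compiled binaries whenever the GPU changes) could be automated along the following lines. This is only a rough sketch, not something KeOps provides: the helper name `clean_cache_if_gpu_changed`, the marker file and its location are all hypothetical, and the cleaning function is passed in as an argument so that the snippet itself does not depend on pykeops.

```python
from pathlib import Path
from typing import Callable


def clean_cache_if_gpu_changed(
    current_gpu: str,
    marker_file: Path,
    clean_cache: Callable[[], None],
) -> bool:
    """Call `clean_cache` if `current_gpu` differs from the GPU recorded in `marker_file`.

    `marker_file` is a hypothetical text file that remembers which GPU the
    cached binaries were last built for. Returns True if the cache was cleaned.
    """
    # No marker yet means we do not know which GPU the binaries were built for,
    # so we treat this case as a change and clean as well.
    previous_gpu = marker_file.read_text().strip() if marker_file.exists() else None
    if previous_gpu == current_gpu:
        return False

    clean_cache()

    # Record the GPU that the next build will target.
    marker_file.parent.mkdir(parents=True, exist_ok=True)
    marker_file.write_text(current_gpu)
    return True
```

At the top of a job script, before the first KeOps call, one could then pass `torch.cuda.get_device_name(0)` as `current_gpu` and `pykeops.clean_pykeops` as `clean_cache`. Note that this does not help if jobs on different GPU types share the same cache folder at the same time.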
Hi @ismedina,
Great, thank you! I assume that @joanglaunes or @bcharlier will know how to fix this cleanly after the summer holidays :-) Best regards, Jean
(This issue is transferred from https://github.com/jeanfeydy/geomloss/issues/66, opened by @ismedina)
I am trying to use geomloss on the computer cluster at my institution. I can choose between several computing nodes with different GPUs. geomloss seems to work seamlessly on some GPUs (GTX980, GTX1080), but on others (RTX500, V100) I get the following error when running the sample code from the geomloss webpage:
I am running the code on a Linux machine with Python 3.8, the latest version of geomloss and CUDA 11.5. Do you have any tips?
Thanks a lot in advance :)