Open jggc opened 3 months ago
I've fixed this by depending on tch version 0.15 and adding tch::maybe_init_cuda() to the start of main(). This seems to stop the linker from removing the libtorch_cuda dependency, which is what causes that error message (at least in my case).
This problem could definitely be documented better; it took me a couple hours to figure this out.
I agree that when running into issues with tch, the actual error is never really clear.
What happens most often is that you try to use the CUDA version when the environment variable was set in another shell (not persistent), so when you run your program you get an error similar to the one you posted. Cargo gets all sorts of confused, and re-resolving tch-rs after changing the environment variable never seemed to work for me, so I end up cleaning the cache and rebuilding the package.
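A quick way to rule out the "variable was set in another shell" case is to print what the current shell actually exports before building. The sketch below is only an illustration: the thread does not name the variable, so the list is an assumption taken from the variables the tch-rs README mentions, and report is a hypothetical helper, not part of tch-rs.

```rust
use std::env;

/// Variables the tch-rs README mentions for locating libtorch.
/// Assumption: the thread does not say which one is involved; adjust as needed.
const TORCH_VARS: [&str; 4] = [
    "LIBTORCH",
    "LIBTORCH_USE_PYTORCH",
    "TORCH_CUDA_VERSION",
    "LD_LIBRARY_PATH",
];

/// Builds one status line per variable. The lookup is passed in as a closure
/// so the function can be exercised without touching the real environment.
fn report<F>(lookup: F) -> Vec<String>
where
    F: Fn(&str) -> Option<String>,
{
    TORCH_VARS
        .iter()
        .map(|&name| match lookup(name) {
            // Treat an empty value the same as an unset variable.
            Some(value) if !value.is_empty() => format!("{name} = {value}"),
            _ => format!("{name} is NOT set in this shell"),
        })
        .collect()
}

fn main() {
    // Read from the environment of the shell that launched this program.
    for line in report(|name| env::var(name).ok()) {
        println!("{line}");
    }
}
```

If a variable you expected shows up as not set, export it in the current shell, then clean the cache (for example with cargo clean) and rebuild, as described above.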
We tried to improve the setup, but the environment variables are required by tch-rs, so it is not as straightforward to circumvent (I tried). We could definitely add some documentation for common issues at the very least. The best we can do about the error message from the torch side is probably to match the generic error message and give some tips/cues.
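To make the "match the generic error message and give some tips" idea concrete, here is a minimal sketch. It is not an existing API: tips_for is a hypothetical helper, and the patterns in main are placeholders, since the real error text is not reproduced here. The advice strings only restate the workarounds mentioned in this thread.

```rust
/// Returns the advice of every (pattern, advice) pair whose pattern occurs
/// in `error_message`, ignoring case. Purely illustrative.
fn tips_for<'a>(error_message: &str, hints: &[(&'a str, &'a str)]) -> Vec<&'a str> {
    let lowered = error_message.to_lowercase();
    hints
        .iter()
        .filter(|(pattern, _)| lowered.contains(&pattern.to_lowercase()))
        .map(|(_, advice)| *advice)
        .collect()
}

fn main() {
    // Placeholder patterns: replace them with text from the error you actually see.
    let hints = [
        (
            "placeholder error text",
            "Check that the environment variable is set in the current shell, then clean the cache and rebuild.",
        ),
        (
            "another placeholder",
            "Try calling tch::maybe_init_cuda() at the start of main().",
        ),
    ];

    let message = "thread panicked: placeholder error text";
    for tip in tips_for(message, &hints) {
        println!("tip: {tip}");
    }
}
```

Keeping the patterns as plain data means new tips can be added without touching the matching logic, at the cost of being only as reliable as the substrings chosen.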
We're open to suggestions!
Describe the bug
It is not actually clear whether this is a bug or a feature/documentation request, but here it goes:
Running rust nightly 2024-05-30, no matter how I set up libtorch I will end up with
The reason I am reporting this is that it is at least the third time I have encountered this same issue, for different reasons such as:
What is my point
I think this error is totally unhelpful and there is a lot of room for improvement regarding the tch-gpu setup.
What do you think?
Should we: