Closed by ityonemo 8 months ago
Can you send the logs around that internal error?
And please include your relevant XLA_TARGETs :)
Thanks! Which logs should I send?
XLA_TARGET=cuda120
Thank you, and what is the CUDNN version?
You can also try building XLA from source and see if you have better luck.
Unfortunately, building XLA from source stopped with "Inconsistent CUDA toolkit path: /usr vs /usr/lib".
possibly because I switched from 11.8 to 12.x?
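For what it's worth, that "Inconsistent CUDA toolkit path" error usually means the build's CUDA autodetection found pieces of the toolkit in two places (e.g. an apt install under /usr and a separate install under /usr/local/cuda). A sketch of a workaround, assuming the XLA build honors the TensorFlow-style CUDA autodetection variables (TF_CUDA_PATHS / CUDA_TOOLKIT_PATH; verify against your build's configure step):

```shell
# Assumption: the from-source XLA build uses TensorFlow-style CUDA
# autodetection, which reads these variables. Point both at ONE toolkit
# root so the configure step doesn't mix /usr with /usr/local/cuda.
export CUDA_TOOLKIT_PATH=/usr/local/cuda
export TF_CUDA_PATHS=/usr/local/cuda
```

If you installed CUDA 12.x via the runfile or NVIDIA's repo, /usr/local/cuda is the usual root; leftover 11.8 packages under /usr are a common source of the mismatch after an upgrade.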
I actually can't figure out how to find out what cuDNN version I have directly. Some instructions on how to determine these in the README might be helpful; I'll make a PR. Also, a lot of people don't know this, but nvidia-smi will lie about the CUDA version: it reports the maximum version the driver supports, not the installed toolkit. The only way to know for sure is nvcc -V.
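To spell that out, here's a sketch of both checks from the command line. The header paths are common defaults for apt and runfile installs and are assumptions; adjust for your setup:

```shell
# CUDA toolkit version: nvcc reports the installed toolkit, whereas
# nvidia-smi only reports the driver's maximum *supported* CUDA version.
cuda_version() {
  if command -v nvcc >/dev/null 2>&1; then
    nvcc -V | grep -o 'release [0-9.]*'
  else
    echo "unknown (nvcc not on PATH)"
  fi
}

# cuDNN version: it's recorded in cudnn_version.h; the location varies
# by install method (paths below are assumed common defaults).
cudnn_version() {
  for f in /usr/include/cudnn_version.h \
           /usr/include/x86_64-linux-gnu/cudnn_version.h \
           /usr/local/cuda/include/cudnn_version.h; do
    if [ -f "$f" ]; then
      # Joins CUDNN_MAJOR.CUDNN_MINOR.CUDNN_PATCHLEVEL, e.g. "8.9.2"
      awk '/^#define CUDNN_(MAJOR|MINOR|PATCHLEVEL) / {printf "%s.", $3}' "$f" \
        | sed 's/\.$//'
      echo
      return
    fi
  done
  echo "unknown (cudnn_version.h not found)"
}

cuda_version
cudnn_version
```

Both functions always print something, so they also double as a quick "is this installed at all" probe.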
@ityonemo what OS do you use? On Debian/Ubuntu you can usually find the cuDNN package version with apt-cache policy libcudnn8
Unable to locate package libcudnn8
I guess I don't have cuDNN installed. Or I might have accidentally wiped it when I purged 11.8 =(
Ok, thanks. I think we can close this; I'll reopen if I install cuDNN and still can't get it working.
I had some serious struggles with CUDA 11.8 (EXLA 0.6 fails on this platform), so I upgraded to CUDA 12 but wound up with 12.2.
This seems to cause CUDNN_STATUS_INTERNAL_ERROR.
Downgrading to 11.8 and EXLA 0.5 works, but other libraries (e.g. Bumblebee) fail on EXLA 0.5.
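For anyone landing here later, a minimal sketch of the environment this thread converges on for a CUDA 12.x setup. XLA_TARGET=cuda120 is the value used earlier in the thread; the /usr/local/cuda paths are assumed defaults, and you still need a cuDNN build matching your CUDA 12.x toolkit installed:

```shell
# XLA_TARGET selects which precompiled XLA binary EXLA uses
# (cuda120 is the value from this thread).
export XLA_TARGET=cuda120

# Assumed default install root; make sure the compiler and the
# runtime loader can both find the 12.x toolkit.
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
```

If EXLA was compiled before these were set, force a rebuild of the deps so the new target is picked up.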