Closed pauldmccarthy closed 1 year ago
I can't answer on behalf of conda-forge, but I think this being a user responsibility makes sense. Each CUDA release has a range of architectures it supports, and in turn cuDNN targets a range of CUDA versions. cuDNN 8.8.0 (support matrix) was the first release to add CUDA 12, and it dropped support for CUDA 10.2, CUDA 11.0-11.6, and Kepler hardware.
Independently of the CUDA version, you can still run into compatibility issues, because package developers who compile CUDA kernels also need to specify which platforms they support through the cubins embedded in their packages. These cubins vary between packages and don't always match 1:1 with what the underlying CUDA version supports. At a minimum, using a cuda-version/cudatoolkit that matches what your hardware supports gives you the best chance of identifying compatible builds. Kepler is getting old enough that its support lifecycle is coming to an end for many packages as they move to support newer CUDA releases.
Apologies for the delayed response. You are correct: there is at present no way for conda to handle this, and it is the user's responsibility to choose a version that is suitable for the architecture. The available CUDA driver may be checked using the virtual __cuda package, but not the hardware arch.
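To illustrate the distinction between the two checks, here is a minimal Python sketch. It rests on two assumptions that are not stated in this thread: that `conda info --json` lists virtual packages under a `virtual_pkgs` key as `[name, version, build]` entries, and that the installed `nvidia-smi` accepts the `compute_cap` query field (older drivers may reject it, in which case the query simply returns nothing). The function names are made up for illustration.

```python
import shutil
import subprocess


def parse_compute_caps(nvidia_smi_output):
    """Turn lines such as '3.7' into (major, minor) tuples, one per GPU.

    Lines that do not look like 'major.minor' (for example an error
    message from an older nvidia-smi) are skipped.
    """
    caps = []
    for line in nvidia_smi_output.splitlines():
        try:
            major, minor = line.strip().split(".")
            caps.append((int(major), int(minor)))
        except ValueError:
            continue
    return caps


def cuda_virtual_package(conda_info):
    """Return the __cuda version from parsed `conda info --json` output.

    Assumes a 'virtual_pkgs' list of [name, version, build] entries;
    returns None if no __cuda entry is present.
    """
    for entry in conda_info.get("virtual_pkgs", []):
        if entry[0] == "__cuda":
            return entry[1]
    return None


def query_compute_caps():
    """Ask nvidia-smi for compute capabilities; empty list if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        # 'compute_cap' is assumed to be a supported query field here.
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return parse_compute_caps(result.stdout)
```

The parsing is kept separate from the subprocess call so that it can be exercised on a machine without a GPU; only `query_compute_caps()` needs the driver tools.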
@vyasr @scdub thanks for your replies! At the moment, it's really just cuDNN 8.8.0 that is the issue, as it requires compute >= 5.0, whereas CUDA 11.* still supports compute >= 3.7. But this is something that I can easily handle when setting up my environments. Thanks!
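For anyone handling this in their own environment setup, one possible approach is sketched below in Python. The only data point taken from this thread is that cuDNN 8.8.0 requires compute capability >= 5.0; the table name, the function name, and the choice of `cudnn<8.8` as the fallback constraint are assumptions for illustration, and the table would need extending from NVIDIA's support matrix for other releases.

```python
# Minimum compute capability per cuDNN release.
# Only the 8.8.0 entry comes from this thread; anything else would have to
# be filled in from NVIDIA's cuDNN support matrix.
CUDNN_MIN_COMPUTE_CAP = {"8.8.0": (5, 0)}


def cudnn_pin(compute_cap):
    """Return a conda match spec for cudnn given a (major, minor) tuple.

    Hardware below the 8.8.0 minimum gets an upper bound that excludes
    8.8 and later; anything else is left unconstrained.
    """
    if compute_cap < CUDNN_MIN_COMPUTE_CAP["8.8.0"]:
        return "cudnn<8.8"
    return "cudnn"


if __name__ == "__main__":
    # Tesla K80, compute capability 3.7, as in the original report.
    print(cudnn_pin((3, 7)))
```

The resulting string could then be passed to `conda install` or `mamba install` alongside the other package specs.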
Comment:
Howdy,
(Apologies if, as I suspect, this question is better suited over at conda/conda and/or mamba-org/mamba)
As it stands, it appears to be possible to install a version of cuDNN which is compatible with the installed CUDA version, but which is incompatible with the compute capability of the available hardware. For example, if I run the following on a system with a Tesla K80 (compute capability 3.7):
I end up with cuDNN=8.8.0 (latest available on conda-forge at the time of writing), which requires hardware which supports, at minimum, compute capability 5.0:
Subsequently, when I try to run some code utilising cuDNN with this environment, I encounter CUDNN_STATUS_ARCH_MISMATCH errors.
So my question is: is it the responsibility of the user to choose a suitable version of cuDNN which is compatible with their hardware?
Thanks!
*As an aside, my installed GPU driver supports CUDA 11.4, whereas conda/mamba both install cuda-version/cudatoolkit 11.8. I initially thought that this might be due to a previously reported bug, but then remembered that NVIDIA have started guaranteeing limited forward-compatibility within major CUDA releases from 11 onwards, so this behaviour appears to be valid.