The build passes the specified compute capability to `nvcc` via `-arch sm_${CUDA_ARCH}` in `CMakeLists.txt`. With that flag, the compiled CUDA kernels contain native binary code for that specific real architecture as well as PTX for the same compute capability, which the driver can JIT-compile at load time on newer architectures (see https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#options-for-steering-gpu-code-generation). The compute capability check should therefore reject only devices older than the specified compute capability, rather than accepting only exact matches.
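As a rough illustration of the relaxed check, here is a minimal sketch. It is not the project's actual code: `BUILT_CUDA_ARCH`, `encode_cc`, and `is_device_supported` are hypothetical names, and the sketch assumes `CUDA_ARCH` is a plain number such as `70` or `86` that is also exposed to the sources as a compile definition.

```cpp
// Hypothetical stand-in for the architecture the kernels were built for,
// e.g. supplied as -DBUILT_CUDA_ARCH=${CUDA_ARCH} (assumed, not from the project).
#ifndef BUILT_CUDA_ARCH
#define BUILT_CUDA_ARCH 70
#endif

// Encode a compute capability the same way sm_XY names do: 7.5 -> 75.
constexpr int encode_cc(int major, int minor) {
    return major * 10 + minor;
}

// Accept the build architecture and anything newer; reject only older devices.
// An exact-match check would use == here instead of >=.
constexpr bool is_device_supported(int major, int minor,
                                   int built_arch = BUILT_CUDA_ARCH) {
    return encode_cc(major, minor) >= built_arch;
}

// Built for sm_70: a 7.0 or 8.6 device passes, a 6.1 device does not.
static_assert(is_device_supported(7, 0, 70), "exact match is accepted");
static_assert(is_device_supported(8, 6, 70), "newer device is accepted");
static_assert(!is_device_supported(6, 1, 70), "older device is rejected");
```

At runtime, `major` and `minor` would typically come from the `cudaDeviceProp` structure filled in by `cudaGetDeviceProperties`.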