Allow compute capabilities newer than the compiled version.

berendo commented 4 years ago

The build passes the specified compute capability to nvcc as -arch sm_${CUDA_ARCH} in CMakeLists.txt. With that shorthand, nvcc embeds both a real binary (SASS) for that specific architecture and PTX code for the same compute capability, and the driver can JIT-compile the PTX at load time on newer architectures (see https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#options-for-steering-gpu-code-generation). Therefore the compute capability check should reject only devices older than the specified compute capability, rather than accepting only exact matches.
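To illustrate the change in the check, here is a minimal host-side sketch. It is not the actual kmcuda code: the function names and the two-digit encoding of the compiled architecture (for example 61 for sm_61) are assumptions made for this example. In a real build the major and minor values would come from the device properties reported by the CUDA runtime.

```cpp
#include <cassert>

// Hypothetical helper: fold (major, minor) into the two-digit form used by
// -arch sm_XY, e.g. (6, 1) -> 61. Assumes minor is a single digit.
constexpr int capability(int major, int minor) {
  return major * 10 + minor;
}

// Old behavior (assumed): accept only devices that exactly match the
// architecture the kernels were compiled for.
constexpr bool is_exact_match(int compiled_arch, int major, int minor) {
  return capability(major, minor) == compiled_arch;
}

// Proposed behavior: accept the compiled architecture and anything newer,
// since the embedded PTX can be JIT-compiled for newer devices.
constexpr bool is_supported(int compiled_arch, int major, int minor) {
  return capability(major, minor) >= compiled_arch;
}

// Compile-time checks for a build targeting sm_61.
static_assert(is_supported(61, 6, 1), "same architecture is accepted");
static_assert(is_supported(61, 7, 5), "newer architecture is accepted");
static_assert(!is_supported(61, 5, 2), "older architecture is rejected");
static_assert(!is_exact_match(61, 7, 5), "exact match rejects newer devices");
```

With a build compiled for sm_61, the relaxed check accepts a 7.5 device that the exact-match check would have refused, while a 5.2 device is still rejected because neither the SASS nor the PTX can run on it.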

vmarkovtsev commented 4 years ago

Good to know this! Thanks for the contribution.

berendo commented 4 years ago

No problem. Thank you for developing this project!

src-d / kmcuda

Allow compute capabilities newer than the compiled version. #97