Closed jcfr closed 4 months ago
@NicerNewerCar After you confirm that configuring and building the extension successfully discovers the CUDA toolkit, I will integrate the corresponding Autoscoper changes^1 and update the repository URL and SHA referenced in this pull request. If you installed CUDA along with the "Visual Studio Integration" module, the build should find the latest installed version by default.
Note that when using the Visual Studio generator, specifying CUDACXX will not work; instead, you are expected to specify -T cuda=X.Y or -T cuda=$env:CUDA_PATH_vXX_X, as you initially indicated^2.
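To double-check which toolkit a Visual Studio configure actually picked up, something like the following minimal sketch may help. The project name ToolsetCheck is hypothetical and the X.Y in the comment is the same placeholder as above; the variables printed are standard CMake ones, not anything specific to Autoscoper.

```cmake
# Hypothetical standalone project used only to inspect CUDA discovery.
# Example configure (X.Y is a placeholder for the installed CUDA version):
#   cmake -G "<Visual Studio generator>" -T cuda=X.Y <source-dir>
cmake_minimum_required(VERSION 3.18)
project(ToolsetCheck LANGUAGES CXX CUDA)

if(CMAKE_GENERATOR MATCHES "Visual Studio")
  # Value passed with -T (empty if the option was not given).
  message(STATUS "Generator toolset: ${CMAKE_GENERATOR_TOOLSET}")
  # CUDA toolset selected by the Visual Studio integration.
  message(STATUS "CUDA toolset: ${CMAKE_VS_PLATFORM_TOOLSET_CUDA}")
endif()

# Reported for every generator.
message(STATUS "CUDA compiler: ${CMAKE_CUDA_COMPILER}")
message(STATUS "CUDA compiler version: ${CMAKE_CUDA_COMPILER_VERSION}")
```

If the reported version is not the one expected, the -T value is the first thing to revisit.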
@jcfr The extension was able to find the specified version of CUDA
@amymmorton With this recent update, we are explicitly compiling the CUDA kernels for all architectures from 5.0 to 8.0. This means that on systems whose GPUs have one of these compute capabilities, no JIT compilation takes place and the explicitly built kernels are used.
As indicated in the "Building for Maximum Compatibility" section^1 of the Best Practices Guide:
By compiling for the native compute capability for these GPU(s), we will ensure that application kernels achieve the best possible performance and are able to use the features that are available on a given generation of the GPU
Prior to these changes, the architecture used to build the kernels defaulted to the one associated with the CUDA compiler (nvcc).
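For illustration, one way to request an explicit set of architectures is sketched below. This assumes CMake 3.18 or later; the project name ArchDemo is hypothetical, and the list of values is an illustrative selection within the 5.0 to 8.0 range, not necessarily the exact list used by Autoscoper. The installed toolkit must still support every architecture listed.

```cmake
cmake_minimum_required(VERSION 3.18)

# Set the default before the CUDA language is enabled, while still letting
# users override it with -DCMAKE_CUDA_ARCHITECTURES=... on the command line.
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
  # Illustrative values (an assumption, not the project's actual list).
  set(CMAKE_CUDA_ARCHITECTURES 50 52 60 61 70 75 80)
endif()

project(ArchDemo LANGUAGES CXX CUDA)

message(STATUS "Building CUDA kernels for: ${CMAKE_CUDA_ARCHITECTURES}")
```

Leaving the variable unset would fall back to the compiler's default architecture, which is the previous behavior described above.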
References:
This commit updates the Autoscoper project by replacing the deprecated^1 FindCUDA CMake module with CMake's more robust, first-class support for the CUDA language. The changes leverage the following key features introduced in various CMake versions:
3.8: Support by the Makefile Generators and the Ninja generator on Linux, macOS, and Windows. See https://cmake.org/cmake/help/latest/release/3.8.html#cuda
3.9: Support by the Visual Studio Generators for VS 2010 and above. See https://cmake.org/cmake/help/latest/release/3.9.html#languages
3.17: Introduction of FindCUDAToolkit See https://cmake.org/cmake/help/latest/release/3.17.html#modules and https://cmake.org/cmake/help/latest/module/FindCUDAToolkit.html#module:FindCUDAToolkit
3.18: A CMAKE_CUDA_ARCHITECTURES variable was added to specify CUDA output architectures. See https://cmake.org/cmake/help/latest/release/3.18.html#variables and https://cmake.org/cmake/help/latest/policy/CMP0104.html#policy:CMP0104
3.19: If CUDA compiler detection fails with user-specified CMAKE_CUDA_ARCHITECTURES or CMAKE_CUDA_HOST_COMPILER, an error is raised. See https://cmake.org/cmake/help/latest/release/3.19.html#other-changes
3.20: The CUDAARCHS environment variable was added for initializing CMAKE_CUDA_ARCHITECTURES. Useful in cases where the compiler default is unsuitable for the machine's GPU. See https://cmake.org/cmake/help/latest/release/3.20.html#languages
3.23: The CMAKE_CUDA_ARCHITECTURES variable and associated CUDA_ARCHITECTURES target property now support the all and all-major values for CUDA toolkit 7.0+. See https://cmake.org/cmake/help/latest/release/3.23.html#variables
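As a rough sketch of what the move away from FindCUDA can look like in practice, the fragment below enables CUDA as a language and links the runtime through FindCUDAToolkit. The project, target, and file names (KernelDemo, demo_kernels, kernels.cu) are hypothetical and are not taken from the Autoscoper sources.

```cmake
cmake_minimum_required(VERSION 3.18)

# CUDA is enabled like any other language; no find_package(CUDA) is needed.
project(KernelDemo LANGUAGES CXX CUDA)

# Locates the toolkit and provides imported targets such as CUDA::cudart.
find_package(CUDAToolkit REQUIRED)
message(STATUS "Found CUDA toolkit ${CUDAToolkit_VERSION}")

# .cu sources go through add_library/add_executable directly,
# replacing the cuda_add_library() macro from the FindCUDA module.
add_library(demo_kernels STATIC kernels.cu)

# Link the CUDA runtime through the imported target.
target_link_libraries(demo_kernels PRIVATE CUDA::cudart)
```

With this layout, the target architectures come from CMAKE_CUDA_ARCHITECTURES (or the CUDAARCHS environment variable on CMake 3.20+) rather than from hand-written nvcc flags.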
List of Autoscoper changes: