BrownBiomechanics / SlicerAutoscoperM

This 3D Slicer extension enables users to perform image registration.
https://autoscoperm.slicer.org
MIT License

ENH: Update Autoscoper modernizing CUDA integration #81

Closed jcfr closed 4 months ago

jcfr commented 4 months ago

This commit updates the Autoscoper project by replacing the deprecated^1 FindCUDA CMake module with CMake's more robust, first-class support for the CUDA language. The changes leverage the following key features introduced in various CMake versions:

List of Autoscoper changes:

$ git shortlog 7df4365..22b1a41e0 --no-merges
Jean-Christophe Fillion-Robin (4):
      COMP: Update minimum CMake version from 3.8 to 3.17.5
      COMP: Modernize CUDA integration using FindCUDAToolkit CMake module
      COMP: Update minimum CMake version from 3.17.5 to 3.20.6
      COMP: Initialize list of CUDA architectures for maximum compatibility
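
As a rough illustration of the migration described in this comment, a minimal sketch might look like the following. It is not taken from the Autoscoper sources: the project name, target name, and source file are hypothetical, and only the CMake commands and the `CUDA::cudart` imported target are standard.

```cmake
# Minimal sketch; project, target, and file names are hypothetical.
cmake_minimum_required(VERSION 3.20.6)

# Enabling CUDA as a first-class language replaces find_package(CUDA)
# from the deprecated FindCUDA module.
project(AutoscoperSketch LANGUAGES CXX CUDA)

# FindCUDAToolkit (available since CMake 3.17) locates the toolkit and
# defines imported targets such as CUDA::cudart.
find_package(CUDAToolkit REQUIRED)

# A regular add_library() takes the place of cuda_add_library();
# .cu sources are handed to the CUDA compiler automatically.
add_library(gpu_kernels STATIC kernels.cu)
target_link_libraries(gpu_kernels PRIVATE CUDA::cudart)
```

With this layout, CUDA sources are built like any other language, so the usual target properties and `target_*` commands apply to them.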
jcfr commented 4 months ago

@NicerNewerCar After you confirm that configuring and building the extension successfully discovers the CUDA toolkit, I will integrate the corresponding Autoscoper changes^1 and update the repository URL and SHA referenced in this pull request. If you installed CUDA along with the "Visual Studio Integration" module, CMake should find the latest installed version by default.

Note that when using the Visual Studio generator, setting CUDACXX will not work. Instead, the CUDA version is expected to be specified with -T cuda=X.Y or -T cuda=$env:CUDA_PATH_vXX_X, as you initially indicated^2.
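
To check which toolkit the Visual Studio generator actually selected, a small diagnostic along these lines could be added after the `project()` call. This is only a sketch: the generator name and the version 12.1 in the comment are placeholders, and the snippet is not part of the Autoscoper build system.

```cmake
# Example configure command (generator and CUDA version are placeholders):
#   cmake -G "Visual Studio 17 2022" -A x64 -T cuda=12.1 -S . -B build
#
# Once the CUDA language is enabled, report what was picked up.
if(CMAKE_GENERATOR MATCHES "Visual Studio")
  # Set by CMake from the cuda=... field of the -T toolset argument,
  # or from the default CUDA Visual Studio integration when omitted.
  message(STATUS "CUDA toolset:  ${CMAKE_VS_PLATFORM_TOOLSET_CUDA}")
  message(STATUS "CUDA compiler: ${CMAKE_CUDA_COMPILER}")
endif()
```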

NicerNewerCar commented 4 months ago

@jcfr The extension was able to find the specified version of CUDA.

jcfr commented 4 months ago

@amymmorton With this recent update, we are explicitly compiling the CUDA kernels for all architectures from compute capability 5.0 to 8.0. This means that if you run Autoscoper on a system with one of these GPUs, there will be no JIT compilation and the explicitly built kernels will be used.

As indicated in the "Building for Maximum Compatibility" section^1 of the Best Practices Guide:

By compiling for the native compute capability for these GPU(s), we will ensure that application kernels achieve the best possible performance and are able to use the features that are available on a given generation of the GPU

Prior to these changes, the architecture used to build the kernels defaulted to the one associated with the CUDA compiler (nvcc).
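
For readers unfamiliar with how such a list is initialized, a hedged sketch using the standard `CMAKE_CUDA_ARCHITECTURES` variable (introduced in CMake 3.18) is shown below. The exact values are an assumption chosen to span 5.0 to 8.0; the list actually used by Autoscoper may differ.

```cmake
# Illustrative list only; not copied from the Autoscoper sources.
# Set before project() / enable_language(CUDA) so that it takes
# precedence over the compiler default and initializes the
# CUDA_ARCHITECTURES property of every target created afterwards.
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
  set(CMAKE_CUDA_ARCHITECTURES 50 52 60 61 70 75 80)
endif()
```

Guarding with `if(NOT DEFINED ...)` leaves room to override the list at configure time, for example with `-DCMAKE_CUDA_ARCHITECTURES=75`. The installed toolkit must support every architecture in the list.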

References: