CrayLabs / SmartSim

SmartSim Infrastructure Library.
BSD 2-Clause "Simplified" License

Allow users to choose between CUDA-11 and CUDA-12 ML Packages #616

Open ashao opened 2 weeks ago

Description

Many of the most recent versions of ML packages now support CUDA 12, and some (like TensorFlow) require CUDA 12 exclusively. We should allow users to build the backends against CUDA 12 as well, so that the GPU stack is consistent between the installed Python package versions and the backends themselves. This is complicated, however, by the fact that not all packages are retaining CUDA 11 support. The supported package sets may therefore bifurcate, and users will need a way to express whether they want CUDA 11 or CUDA 12.
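
To make the bifurcation concrete, here is a minimal sketch of how a user's CUDA choice could be checked against per-backend support. Everything in it is an assumption for illustration: `SUPPORTED_CUDA`, `select_backends`, and the `example_backend_*` entries are hypothetical names, not part of SmartSim, and only the TensorFlow entry reflects the statement above that recent TensorFlow requires CUDA 12.

```python
from typing import Dict, List, Set

# Hypothetical support table: CUDA major versions each backend can be built against.
# Only the "tensorflow" entry mirrors the issue text; the others are placeholders.
SUPPORTED_CUDA: Dict[str, Set[int]] = {
    "tensorflow": {12},
    "example_backend_both": {11, 12},
    "example_backend_legacy": {11},
}


def select_backends(cuda_major: int, requested: List[str]) -> List[str]:
    """Return the requested backends if all can be built against cuda_major.

    Raises ValueError if the CUDA version is not 11 or 12, or if any
    requested backend does not list that version as supported.
    """
    if cuda_major not in (11, 12):
        raise ValueError(f"Unsupported CUDA major version: {cuda_major}")

    # Unknown backends are treated as unsupported for every CUDA version.
    unsupported = [
        name
        for name in requested
        if cuda_major not in SUPPORTED_CUDA.get(name, set())
    ]
    if unsupported:
        raise ValueError(
            f"Backends not available for CUDA {cuda_major}: {', '.join(unsupported)}"
        )
    return list(requested)
```

Failing early with an explicit error, rather than silently dropping a backend, is one possible design choice here: it keeps a mismatch between the requested CUDA version and the available packages visible to the user.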

Justification

This supports both users who want to upgrade to CUDA 12 (especially for new hardware) and users who need to maintain legacy support for CUDA 11.

Implementation Strategy