NVIDIA / deepops

Tools for building GPU clusters
BSD 3-Clause "New" or "Revised" License

Bump GPU Operator to v22.9.1 #1250

Closed supertetelman closed 1 year ago

supertetelman commented 1 year ago

Bumped to the latest GPU Operator and supporting tools.

Even though the supported path forward is to only use the GPU Operator, I am continuing to maintain and test the paths using Device Plugin without the operator.

- GPU Operator -> v22.9.2 (NGC hosted)
- Device Plugin -> v0.13.0
- GFD -> v0.7.0

Also pushed some changes here to address the SSL issues introduced by recent updates to the fedoraproject website.

Testing: The existing tests already cover GFD + Device Plugin and GPU Operator installs. This release appears to add a few features and upgrade paths that could eventually be included in our testing harness, but I have not done so for this PR.

Upgrade steps: For existing clusters, an upgrade can be done by: