Bump GPU Operator to v22.9.1

Bumped to the latest GPU Operator and supporting tools.

Even though the supported path forward is to only use the GPU Operator, I am continuing to maintain and test the paths using Device Plugin without the operator.

- GPU Operator -> v22.9.2 (NGC hosted)
- Device Plugin -> v0.13.0
- GFD -> v0.7.0

Also pushed some changes here to address the SSL issues introduced by recent updates to the fedoraproject website.

Testing: The existing tests already cover GFD+device plugin and GPU Operator installs. This release appears to add a few features and upgrade paths that could eventually be included in our testing harness, but I have not done so for this PR.

Upgrade steps: For existing clusters an upgrade can be done by:

1. Tainting all GPU nodes as NoSchedule and evacuating all running GPU workloads or waiting for them to complete
2. helm delete -n gpu-operator-resources nvidia-gpu-operator
3. kubectl delete crd clusterpolicies.nvidia.com
4. ansible-playbook playbooks/k8s-cluster/nvidia-gpu-operator.yml
5. Untaint the nodes

See the official docs for a long list of upgrade steps: https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/getting-started.html#upgrade
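
The upgrade steps above could be scripted roughly as follows. This is a minimal sketch, not part of the PR: the node label selector (`nvidia.com/gpu.present=true`) and the taint key (`gpu-operator-upgrade`) are assumptions chosen for illustration, and the script assumes it is run from the root of the deepops checkout. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch of the upgrade steps; prints commands unless DRY_RUN=0.
set -eu

# Assumptions (not from the PR): selector and taint key are illustrative.
GPU_SELECTOR="${GPU_SELECTOR:-nvidia.com/gpu.present=true}"
TAINT_KEY="${TAINT_KEY:-gpu-operator-upgrade}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# A NoSchedule taint does not evict running pods, so this step stays manual.
wait_for_evacuation() {
  echo "# evacuate running GPU workloads, or wait for them to complete"
  if [ "$DRY_RUN" != "1" ]; then
    printf 'Press Enter once the GPU nodes are idle: '
    read -r reply
  fi
}

upgrade_gpu_operator() {
  # 1. Keep new GPU workloads off the nodes.
  run kubectl taint nodes -l "$GPU_SELECTOR" "${TAINT_KEY}=true:NoSchedule"
  wait_for_evacuation
  # 2-3. Remove the old release and its CRD.
  run helm delete -n gpu-operator-resources nvidia-gpu-operator
  run kubectl delete crd clusterpolicies.nvidia.com
  # 4. Reinstall via the deepops playbook.
  run ansible-playbook playbooks/k8s-cluster/nvidia-gpu-operator.yml
  # 5. Remove the taint again.
  run kubectl taint nodes -l "$GPU_SELECTOR" "${TAINT_KEY}:NoSchedule-"
}

upgrade_gpu_operator
```

Reviewing the dry-run output first makes it easy to confirm the selector matches the intended nodes before anything is deleted.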

NVIDIA / deepops

Bump GPU Operator to v22.9.1 #1250