Noticed the 525 version of the driver container was pushed yesterday and tried it out -- same issue. I suspect the package cache might be getting hit with an SSL warning, but I'm not sure, as the logs don't indicate it.
root@rke2-server:~/gpu-operator# kubectl logs -f nvidia-driver-daemonset-4zzgx --all-containers=true -n gpu-operator
Getting current value of the 'nvidia.com/gpu.deploy.operator-validator' node label
Current value of 'nvidia.com/gpu.deploy.operator-validator=true'
Getting current value of the 'nvidia.com/gpu.deploy.container-toolkit' node label
Current value of 'nvidia.com/gpu.deploy.container-toolkit=true'
Getting current value of the 'nvidia.com/gpu.deploy.device-plugin' node label
Current value of 'nvidia.com/gpu.deploy.device-plugin=true'
Getting current value of the 'nvidia.com/gpu.deploy.gpu-feature-discovery' node label
Current value of 'nvidia.com/gpu.deploy.gpu-feature-discovery=true'
Getting current value of the 'nvidia.com/gpu.deploy.dcgm-exporter' node label
Current value of 'nvidia.com/gpu.deploy.dcgm-exporter=true'
Getting current value of the 'nvidia.com/gpu.deploy.dcgm' node label
Current value of 'nvidia.com/gpu.deploy.dcgm=true'
Getting current value of the 'nvidia.com/gpu.deploy.mig-manager' node label
Current value of 'nvidia.com/gpu.deploy.mig-manager='
Getting current value of the 'nvidia.com/gpu.deploy.nvsm' node label
Current value of 'nvidia.com/gpu.deploy.nvsm='
Getting current value of the 'nvidia.com/gpu.deploy.sandbox-validator' node label
Current value of 'nvidia.com/gpu.deploy.sandbox-validator='
Getting current value of the 'nvidia.com/gpu.deploy.sandbox-device-plugin' node label
Current value of 'nvidia.com/gpu.deploy.sandbox-device-plugin='
Getting current value of the 'nvidia.com/gpu.deploy.vgpu-device-manager' node label
Current value of 'nvidia.com/gpu.deploy.vgpu-device-manager='
Getting current value of the 'nodeType' node label(used by NVIDIA Fleet Command)
Current value of 'nodeType='
Shutting GPU Operator components that must be restarted on driver restarts by disabling their component-specific nodeSelector labels
node/rke2-agent.[redactedurl].com labeled
Waiting for the operator-validator to shutdown
pod/nvidia-operator-validator-ns5t8 condition met
unbinding device 0000:03:00.0
unbinding device 0000:05:00.0
unbinding device 0000:0d:00.0
unbinding device 0000:16:00.0
Uncordoning node rke2-agent.[redactedurl].com...
node/rke2-agent.[redactedurl].com already uncordoned
Rescheduling all GPU clients on the current node by enabling their component-specific nodeSelector labels
node/rke2-agent.[redactedurl].com labeled
DRIVER_ARCH is x86_64
Creating directory NVIDIA-Linux-x86_64-525.60.13
Verifying archive integrity... OK
Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 525.60.13...
WARNING: Unable to determine the default X library path. The path /tmp/null/lib will be used, but this path was not detected in the ldconfig(8) cache, and no directory exists at this path, so it is likely that libraries installed there will not be found by the loader.
WARNING: You specified the '--no-kernel-modules' command line option, nvidia-installer will not install any kernel modules as part of this driver installation, and it will not remove existing NVIDIA kernel modules not part of an earlier NVIDIA driver installation. Please ensure that NVIDIA kernel modules matching this driver version are installed separately.
========== NVIDIA Software Installer ==========
Starting installation of NVIDIA driver version 525.60.13 for Linux kernel version 5.4.0-125-generic
Stopping NVIDIA persistence daemon...
Unloading NVIDIA driver kernel modules...
Unmounting NVIDIA driver rootfs...
Checking NVIDIA driver packages...
Updating the package cache...
E: Release file for http://archive.ubuntu.com/ubuntu/dists/focal-updates/InRelease is not valid yet (invalid for another 4h 32min 40s). Updates for this repository will not be applied.
E: Release file for http://archive.ubuntu.com/ubuntu/dists/focal-security/InRelease is not valid yet (invalid for another 4h 3min 56s). Updates for this repository will not be applied.
Stopping NVIDIA persistence daemon...
Unloading NVIDIA driver kernel modules...
Unmounting NVIDIA driver rootfs...
@jeremy-london Can you double check if the timezone is correctly configured on your node?
@shivamerla I think this is where I am leaning as well -- it appears there is some sort of apt-get update going on in there, and apt-get update on the host shows the same conditions.
I updated tzdata and that seemed to fix the host -- I'll report back if it fixes the containerd runtime here.
Possibly setting TZ or /etc/timezone in the container would solve it, but ultimately, if it respects the node it's running on, then I'll get each node configured.
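For what it's worth, a minimal way to check and fix the clock/timezone on the node (assuming systemd; the UTC choice is just an example):
timedatectl status                     # shows the timezone and whether NTP sync is active
sudo timedatectl set-timezone Etc/UTC  # example only: pin an explicit timezone
sudo timedatectl set-ntp true          # keep the clock synced so InRelease timestamps validate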
Changed some configs around -- the product required a FIPS-compliant kernel, so I had to rebuild today.
DISA STIG + FIPS-updates enabled
Got things back to the same state and re-ran the nvidia-gpu-operator
root@rke2-server:~# kubectl logs -f nvidia-driver-daemonset-8l6jj --all-containers=true -n gpu-operator
Getting current value of the 'nvidia.com/gpu.deploy.operator-validator' node label
Current value of 'nvidia.com/gpu.deploy.operator-validator=true'
Getting current value of the 'nvidia.com/gpu.deploy.container-toolkit' node label
Current value of 'nvidia.com/gpu.deploy.container-toolkit=true'
DRIVER_ARCH is x86_64
Getting current value of the 'nvidia.com/gpu.deploy.device-plugin' node label
Creating directory NVIDIA-Linux-x86_64-525.60.13
Current value of 'nvidia.com/gpu.deploy.device-plugin=true'
Verifying archive integrity... OK
Getting current value of the 'nvidia.com/gpu.deploy.gpu-feature-discovery' node label
Current value of 'nvidia.com/gpu.deploy.gpu-feature-discovery=true'
Getting current value of the 'nvidia.com/gpu.deploy.dcgm-exporter' node label
Current value of 'nvidia.com/gpu.deploy.dcgm-exporter=true'
Getting current value of the 'nvidia.com/gpu.deploy.dcgm' node label
Current value of 'nvidia.com/gpu.deploy.dcgm=true'
Getting current value of the 'nvidia.com/gpu.deploy.mig-manager' node label
Current value of 'nvidia.com/gpu.deploy.mig-manager='
Getting current value of the 'nvidia.com/gpu.deploy.nvsm' node label
Current value of 'nvidia.com/gpu.deploy.nvsm='
Getting current value of the 'nvidia.com/gpu.deploy.sandbox-validator' node label
Current value of 'nvidia.com/gpu.deploy.sandbox-validator='
Getting current value of the 'nvidia.com/gpu.deploy.sandbox-device-plugin' node label
Current value of 'nvidia.com/gpu.deploy.sandbox-device-plugin='
Getting current value of the 'nvidia.com/gpu.deploy.vgpu-device-manager' node label
Current value of 'nvidia.com/gpu.deploy.vgpu-device-manager='
Getting current value of the 'nodeType' node label(used by NVIDIA Fleet Command)
Current value of 'nodeType='
Shutting GPU Operator components that must be restarted on driver restarts by disabling their component-specific nodeSelector labels
node/rke2-agent.[redactedurl].com labeled
Waiting for the operator-validator to shutdown
pod/nvidia-operator-validator-v595f condition met
unbinding device 0000:03:00.0
unbinding device 0000:0c:00.0
unbinding device 0000:15:00.0
unbinding device 0000:1e:00.0
Uncordoning node rke2-agent.[redactedurl].com...
node/rke2-agent.[redactedurl].com already uncordoned
Rescheduling all GPU clients on the current node by enabling their component-specific nodeSelector labels
node/rke2-agent.[redactedurl].com labeled
Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 525.60.13...
WARNING: Unable to determine the default X library path. The path /tmp/null/lib will be used, but this path was not detected in the ldconfig(8) cache, and no directory exists at this path, so it is likely that libraries installed there will not be found by the loader.
WARNING: You specified the '--no-kernel-modules' command line option, nvidia-installer will not install any kernel modules as part of this driver installation, and it will not remove existing NVIDIA kernel modules not part of an earlier NVIDIA driver installation. Please ensure that NVIDIA kernel modules matching this driver version are installed separately.
========== NVIDIA Software Installer ==========
Starting installation of NVIDIA driver version 525.60.13 for Linux kernel version 5.4.0-1068-fips
Stopping NVIDIA persistence daemon...
Unloading NVIDIA driver kernel modules...
Unmounting NVIDIA driver rootfs...
Checking NVIDIA driver packages...
Updating the package cache...
Resolving Linux kernel version...
Could not resolve Linux kernel version
Stopping NVIDIA persistence daemon...
Unloading NVIDIA driver kernel modules...
Unmounting NVIDIA driver rootfs...
root@rke2-server:~# uname -r
5.4.0-1068-fips
Seeing the error 'Could not resolve Linux kernel version' -- is this kernel not supported?
@jeremy-london the error is from here. Can you run the command below on the node to make sure kernel headers are available for this kernel?
KERNEL_VERSION=5.4.0-1068-fips && \
apt-cache show "linux-headers-${KERNEL_VERSION}" 2> /dev/null | \
sed -nE 's/^Version:\s+(([0-9]+\.){2}[0-9]+)[-.]([0-9]+).*/\1-\3/p' | head -1
root@rke2-server:~# KERNEL_VERSION=5.4.0-1068-fips && \
> apt-cache show "linux-headers-${KERNEL_VERSION}" 2> /dev/null | \
> sed -nE 's/^Version:\s+(([0-9]+\.){2}[0-9]+)[-.]([0-9]+).*/\1-\3/p' | head -1
5.4.0-1068
@shivamerla looks like we need to make the Ubuntu Advantage repositories configured on the host accessible to the driver container. Please follow the instructions here to create a ConfigMap with these repositories and inject them into the driver container.
Moreover, I just tested again with 515-signed, hoping that might support FIPS kernels, but no dice.
root@rke2-server:~# kubectl logs -f nvidia-driver-daemonset-2g8hk --all-containers=true -n gpu-operator
========== NVIDIA Software Installer ==========
Starting installation of NVIDIA driver branch 515 for Linux kernel version 5.4.0-1068-fips
Stopping NVIDIA persistence daemon...
Unloading NVIDIA driver kernel modules...
Unmounting NVIDIA driver rootfs...
Updating the package cache...
Getting current value of the 'nvidia.com/gpu.deploy.operator-validator' node label
Current value of 'nvidia.com/gpu.deploy.operator-validator=true'
Getting current value of the 'nvidia.com/gpu.deploy.container-toolkit' node label
Current value of 'nvidia.com/gpu.deploy.container-toolkit=true'
Getting current value of the 'nvidia.com/gpu.deploy.device-plugin' node label
Current value of 'nvidia.com/gpu.deploy.device-plugin=true'
Getting current value of the 'nvidia.com/gpu.deploy.gpu-feature-discovery' node label
Current value of 'nvidia.com/gpu.deploy.gpu-feature-discovery=true'
Getting current value of the 'nvidia.com/gpu.deploy.dcgm-exporter' node label
Current value of 'nvidia.com/gpu.deploy.dcgm-exporter=true'
Getting current value of the 'nvidia.com/gpu.deploy.dcgm' node label
Current value of 'nvidia.com/gpu.deploy.dcgm=true'
Getting current value of the 'nvidia.com/gpu.deploy.mig-manager' node label
Current value of 'nvidia.com/gpu.deploy.mig-manager='
Getting current value of the 'nvidia.com/gpu.deploy.nvsm' node label
Current value of 'nvidia.com/gpu.deploy.nvsm='
Getting current value of the 'nvidia.com/gpu.deploy.sandbox-validator' node label
Current value of 'nvidia.com/gpu.deploy.sandbox-validator='
Getting current value of the 'nvidia.com/gpu.deploy.sandbox-device-plugin' node label
Current value of 'nvidia.com/gpu.deploy.sandbox-device-plugin='
Getting current value of the 'nvidia.com/gpu.deploy.vgpu-device-manager' node label
Current value of 'nvidia.com/gpu.deploy.vgpu-device-manager='
Getting current value of the 'nodeType' node label(used by NVIDIA Fleet Command)
Current value of 'nodeType='
Shutting GPU Operator components that must be restarted on driver restarts by disabling their component-specific nodeSelector labels
node/rke2-agent.[redactedurl].com labeled
Waiting for the operator-validator to shutdown
pod/nvidia-operator-validator-j799c condition met
unbinding device 0000:03:00.0
unbinding device 0000:0c:00.0
unbinding device 0000:15:00.0
unbinding device 0000:1e:00.0
Uncordoning node rke2-agent.[redactedurl].com...
node/rke2-agent.[redactedurl].com already uncordoned
Rescheduling all GPU clients on the current node by enabling their component-specific nodeSelector labels
node/rke2-agent.[redactedurl].com labeled
Installing NVIDIA driver kernel modules...
Hit:1 http://us.archive.ubuntu.com/ubuntu focal-updates InRelease
Hit:2 http://archive.ubuntu.com/ubuntu focal InRelease
Hit:3 http://us.archive.ubuntu.com/ubuntu focal-security InRelease
Hit:4 http://archive.ubuntu.com/ubuntu focal-updates InRelease
Hit:5 http://archive.ubuntu.com/ubuntu focal-security InRelease
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package linux-objects-nvidia-515-server-5.4.0-1068-fips
E: Couldn't find any package by glob 'linux-objects-nvidia-515-server-5.4.0-1068-fips'
E: Couldn't find any package by regex 'linux-objects-nvidia-515-server-5.4.0-1068-fips'
E: Unable to locate package linux-signatures-nvidia-5.4.0-1068-fips
E: Couldn't find any package by glob 'linux-signatures-nvidia-5.4.0-1068-fips'
E: Couldn't find any package by regex 'linux-signatures-nvidia-5.4.0-1068-fips'
E: Unable to locate package linux-modules-nvidia-515-server-5.4.0-1068-fips
E: Couldn't find any package by glob 'linux-modules-nvidia-515-server-5.4.0-1068-fips'
E: Couldn't find any package by regex 'linux-modules-nvidia-515-server-5.4.0-1068-fips'
Stopping NVIDIA persistence daemon...
Unloading NVIDIA driver kernel modules...
Unmounting NVIDIA driver rootfs...
If this kernel is not supported, what options exist? Install the driver on the host directly, and maybe the toolkit as well if similar issues creep up?
@jeremy-london can you follow the instructions from comment https://github.com/NVIDIA/gpu-operator/issues/457#issuecomment-1343449476? Yes, another option is to pre-install drivers on the host in this case. The Container Toolkit doesn't need to be pre-installed, as it doesn't have kernel-specific runtime dependencies like the driver container does.
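A rough sketch of that fallback (the nvidia-driver-525 package name and the Helm release/chart names are assumptions, not taken from this thread):
# on the GPU node: install a matching driver from the distro repositories
sudo apt-get update && sudo apt-get install -y nvidia-driver-525
# then tell the operator not to deploy the driver container
helm upgrade gpu-operator nvidia/gpu-operator -n gpu-operator --reuse-values --set driver.enabled=false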
@shivamerla Seems we are getting closer.
I added the following to a file, then created the ConfigMap and ran the helm update with the new settings (rough commands sketched after the list):
deb https://esm.ubuntu.com/cis/ubuntu focal main
# deb-src https://esm.ubuntu.com/cis/ubuntu focal main
deb https://esm.ubuntu.com/infra/ubuntu focal-infra-security main
# deb-src https://esm.ubuntu.com/infra/ubuntu focal-infra-security main
deb https://esm.ubuntu.com/infra/ubuntu focal-infra-updates main
# deb-src https://esm.ubuntu.com/infra/ubuntu focal-infra-updates main
deb https://esm.ubuntu.com/fips-updates/ubuntu focal-updates main
# deb-src https://esm.ubuntu.com/fips-updates/ubuntu focal-updates main
(That's all the extra ones I have on the host.)
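For reference, the ConfigMap/helm steps amount to roughly the following (the custom-repo.list file name and the release/chart names are assumptions; driver.repoConfig.configMapName is the chart value used to inject it, as seen in the values file further down):
# save the extra repository lines above as custom-repo.list, then:
kubectl create configmap repo-config -n gpu-operator --from-file=custom-repo.list
helm upgrade gpu-operator nvidia/gpu-operator -n gpu-operator --reuse-values --set driver.repoConfig.configMapName=repo-config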
Now I'm dealing with a few other packages not coming through, but I am seeing an NVIDIA-specific one and wondering whether this kernel version is in the support pool for those packages.
========== NVIDIA Software Installer ==========
Starting installation of NVIDIA driver branch 515 for Linux kernel version 5.4.0-1068-fips
Stopping NVIDIA persistence daemon...
Unloading NVIDIA driver kernel modules...
Unmounting NVIDIA driver rootfs...
Updating the package cache...
Installing NVIDIA driver kernel modules...
Hit:1 http://us.archive.ubuntu.com/ubuntu focal-updates InRelease
Hit:2 http://us.archive.ubuntu.com/ubuntu focal-security InRelease
Hit:3 http://archive.ubuntu.com/ubuntu focal InRelease
Hit:4 http://archive.ubuntu.com/ubuntu focal-updates InRelease
Hit:5 http://archive.ubuntu.com/ubuntu focal-security InRelease
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package linux-objects-nvidia-515-server-5.4.0-1068-fips
E: Couldn't find any package by glob 'linux-objects-nvidia-515-server-5.4.0-1068-fips'
E: Couldn't find any package by regex 'linux-objects-nvidia-515-server-5.4.0-1068-fips'
E: Unable to locate package linux-signatures-nvidia-5.4.0-1068-fips
E: Couldn't find any package by glob 'linux-signatures-nvidia-5.4.0-1068-fips'
E: Couldn't find any package by regex 'linux-signatures-nvidia-5.4.0-1068-fips'
E: Unable to locate package linux-modules-nvidia-515-server-5.4.0-1068-fips
E: Couldn't find any package by glob 'linux-modules-nvidia-515-server-5.4.0-1068-fips'
E: Couldn't find any package by regex 'linux-modules-nvidia-515-server-5.4.0-1068-fips'
Stopping NVIDIA persistence daemon...
Unloading NVIDIA driver kernel modules...
Unmounting NVIDIA driver rootfs...
@jeremy-london yes, that is correct, precompiled packages are not available for this kernel. Please use driver.version as 525.60.13 instead of 515-signed.
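In Helm terms, that change is roughly the following (the release and chart repo names are assumptions, not from this thread):
helm upgrade gpu-operator nvidia/gpu-operator -n gpu-operator --reuse-values --set-string driver.version=525.60.13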
1. Quick Debug Checklist
Are i2c_core and ipmi_msghandler loaded on the nodes?
root@rke2-server:~# lsmod | grep -i ipmi_msghandler
ipmi_msghandler 106496 1 ipmi_devintf
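If either module were missing, it can usually be checked for and loaded by hand (a sketch; the module names are simply the two from the checklist):
lsmod | grep -E 'i2c_core|ipmi_msghandler'   # check both at once
sudo modprobe -a i2c_core ipmi_msghandler    # load them if not already present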
# Default values for gpu-operator.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

platform:
  openshift: false

nfd:
  enabled: true

psp:
  enabled: true

sandboxWorkloads:
  enabled: true
  defaultWorkload: "container"

daemonsets:
  priorityClassName: system-node-critical
  tolerations:

validator:
  repository: [redactedurl].com/ext.nvcr.io/nvidia/cloud-native
  image: gpu-operator-validator
  version: "v22.9.0"
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  env: []
  args: []
  resources: {}
  plugin:
    env:

operator:
  repository: [redactedurl].com/ext.nvcr.io/nvidia
  image: gpu-operator
  version: "v22.9.0"
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  priorityClassName: system-node-critical
  defaultRuntime: containerd
  runtimeClass: nvidia
  use_ocp_driver_toolkit: false
  # cleanup CRD on chart un-install
  cleanupCRD: true
  # upgrade CRD on chart upgrade, requires --disable-openapi-validation flag
  # to be passed during helm upgrade.
  upgradeCRD: true
  initContainer:
    image: cuda
    repository: [redactedurl].com/ext.nvcr.io/nvidia
    version: 11.7.1-base-ubuntu20.04
    imagePullPolicy: IfNotPresent
  tolerations:

mig:
  strategy: single

driver:
  enabled: true
  repository: [redactedurl].com/ext.nvcr.io/nvidia
  image: driver
  version: "515-signed"
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  rdma:
    enabled: false
    useHostMofed: false
  manager:
    image: k8s-driver-manager
    repository: [redactedurl].com/ext.nvcr.io/nvidia/cloud-native
    version: v0.4.2
    imagePullPolicy: IfNotPresent
    env:
  # Private mirror repository configuration
  repoConfig:
    configMapName: ""
  # custom ssl key/certificate configuration
  certConfig:
    name: "cert-config"
  # vGPU licensing configuration
  licensingConfig:
    configMapName: ""
    nlsEnabled: false
  # vGPU topology daemon configuration
  virtualTopology:
    config: ""
  # kernel module configuration for NVIDIA driver
  kernelModuleConfig:
    name: ""
  # configuration for controlling rolling update of NVIDIA driver DaemonSet pods
  rollingUpdate:
    # maximum number of nodes to simultaneously apply pod updates on.
    # can be specified either as number or percentage of nodes. Default 1.
    maxUnavailable: "1"

toolkit:
  enabled: true
  repository: [redactedurl].com/ext.nvcr.io/nvidia/k8s
  image: container-toolkit
  version: v1.11.0-ubuntu20.04
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  env:

devicePlugin:
  enabled: true
  repository: [redactedurl].com/ext.nvcr.io/nvidia
  image: k8s-device-plugin
  version: v0.12.3-ubuntu20.04
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  args: []
  env:
  # Plugin configuration
  config:
    default: |-
      version: v1
      flags:
        migStrategy: "none"
        failOnInitError: true
        nvidiaDriverRoot: "/run/nvidia/driver"
        plugin:
          passDeviceSpecs: false
          deviceListStrategy: envvar
          deviceIDStrategy: uuid

# standalone dcgm hostengine
dcgm:
  # disabled by default to use embedded nv-hostengine by exporter
  enabled: false
  repository: [redactedurl].com/ext.nvcr.io/nvidia/cloud-native
  image: dcgm
  version: 3.0.4-1-ubuntu20.04
  imagePullPolicy: IfNotPresent
  hostPort: 5555
  args: []
  env: []
  resources: {}

dcgmExporter:
  enabled: true
  repository: [redactedurl].com/ext.nvcr.io/nvidia/k8s
  image: dcgm-exporter
  version: 3.0.4-3.0.0-ubuntu20.04
  imagePullPolicy: IfNotPresent
  env:

gfd:
  enabled: true
  repository: [redactedurl].com/ext.nvcr.io/nvidia
  image: gpu-feature-discovery
  version: v0.7.0-ubuntu20.04
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  env:

migManager:
  enabled: false
  repository: [redactedurl].com/ext.nvcr.io/nvidia/cloud-native
  image: k8s-mig-manager
  version: v0.5.0-ubuntu20.04
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  env:

nodeStatusExporter:
  enabled: false
  repository: [redactedurl].com/ext.nvcr.io/nvidia/cloud-native
  image: gpu-operator-validator
  version: "v22.9.0"
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  resources: {}

# Experimental and only deploys nvidia-fs driver on Ubuntu
gds:
  enabled: false
  repository: [redactedurl].com/ext.nvcr.io/nvidia/cloud-native
  image: nvidia-fs
  version: "515.43.04"
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  env: []
  args: []

vgpuManager:
  enabled: false
  repository: ""
  image: vgpu-manager
  version: ""
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  env: []
  resources: {}
  driverManager:
    image: k8s-driver-manager
    repository: [redactedurl].com/ext.nvcr.io/nvidia/cloud-native
    version: v0.4.2
    imagePullPolicy: IfNotPresent
    env:

vgpuDeviceManager:
  enabled: false
  repository: [redactedurl].com/ext.nvcr.io/nvidia/cloud-native
  image: vgpu-device-manager
  version: "v0.2.0"
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  env: []
  config:
    name: ""
    default: "default"

vfioManager:
  enabled: true
  repository: [redactedurl].com/ext.nvcr.io/nvidia
  image: cuda
  version: 11.7.1-base-ubuntu20.04
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  env: []
  resources: {}
  driverManager:
    image: k8s-driver-manager
    repository: [redactedurl].com/ext.nvcr.io/nvidia/cloud-native
    version: v0.4.2
    imagePullPolicy: IfNotPresent
    env:

sandboxDevicePlugin:
  enabled: true
  repository: [redactedurl].com/ext.nvcr.io/nvidia
  image: kubevirt-gpu-device-plugin
  version: v1.2.1
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  args: []
  env: []
  resources: {}

node-feature-discovery:
  worker:
    tolerations:
    - key: "nvidia.com/gpu"
      operator: "Equal"
      value: "present"
      effect: "NoSchedule"
    config:
      sources:
        pci:
          deviceClassWhitelist:
  master:
    extraLabelNs:
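For reference, values like these would typically be applied with a standard Helm install/upgrade, e.g. (release and chart repo names are assumptions):
helm upgrade --install gpu-operator nvidia/gpu-operator -n gpu-operator --create-namespace -f values.yaml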
[X] kubernetes pods status:
kubectl get pods --all-namespaces
[X] kubernetes daemonset status:
kubectl get ds --all-namespaces
[X] If a pod/ds is in an error state or pending state
kubectl describe pod -n NAMESPACE POD_NAME
[X] If a pod/ds is in an error state or pending state
kubectl logs -n NAMESPACE POD_NAME
========== NVIDIA Software Installer ==========
Starting installation of NVIDIA driver branch 515 for Linux kernel version 5.4.0-125-generic
Stopping NVIDIA persistence daemon...
Unloading NVIDIA driver kernel modules...
Unmounting NVIDIA driver rootfs...
Updating the package cache...
E: Release file for http://us.archive.ubuntu.com/ubuntu/dists/focal-updates/InRelease is not valid yet (invalid for another 3h 58min 53s). Updates for this repository will not be applied.
E: Release file for http://us.archive.ubuntu.com/ubuntu/dists/focal-security/InRelease is not valid yet (invalid for another 3h 57min 43s). Updates for this repository will not be applied.
E: Release file for http://archive.ubuntu.com/ubuntu/dists/focal-updates/InRelease is not valid yet (invalid for another 3h 58min 52s). Updates for this repository will not be applied.
E: Release file for http://archive.ubuntu.com/ubuntu/dists/focal-security/InRelease is not valid yet (invalid for another 3h 57min 42s). Updates for this repository will not be applied.
Stopping NVIDIA persistence daemon...
Unloading NVIDIA driver kernel modules...
Unmounting NVIDIA driver rootfs...
Typical pre-driver/pre-toolkit config errors complaining about the runtime class; nothing out of the ordinary in this log stack.