This addresses two known issues with snap refresh when NVIDIA support is in use.
The previous CDI config is never valid for a newer revision, as some paths contain the snap revision.
This PR ensures the previous config is always purged.
If the nvidia-container-toolkit service fails to start properly for any reason, it prevents the snap refresh from completing, even though dockerd would remain functional for use cases that do not involve NVIDIA images.
This PR allows nvidia-container-toolkit to fail while setting up the CDI and dockerd config, printing a warning instead of blocking the refresh.
It also ensures that the dockerd config step is only attempted if the CDI config was generated successfully.
CDI config generation can fail on refresh with ERROR_LIB_RM_VERSION_MISMATCH (the NVIDIA kernel module and user space libraries do not match).
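
The control flow described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function names, the CDI_SPEC path, and the two stand-in helpers are all hypothetical, and the generation helper is hard-wired to fail so that the warning path is visible.

```shell
#!/bin/sh
# Sketch only: every name and path below is hypothetical, not taken from the snap.

CDI_SPEC="${CDI_SPEC:-/tmp/cdi-sketch/nvidia.yaml}"

# Stand-in for the real CDI generation step. It fails here to mimic a
# refresh where the kernel module and user space libraries do not match.
generate_cdi_config() {
    echo "ERROR_LIB_RM_VERSION_MISMATCH" >&2
    return 1
}

# Stand-in for the step that registers the NVIDIA runtime with dockerd.
configure_dockerd() {
    echo "dockerd configured for NVIDIA"
}

setup_nvidia_support() {
    # 1. Always purge the spec left behind by the previous revision.
    rm -f "$CDI_SPEC"

    # 2. Try to generate a new spec; on failure, warn and return success
    #    so the service (and therefore the refresh) is not blocked.
    if ! generate_cdi_config; then
        echo "WARNING: CDI setup failed, continuing without NVIDIA support" >&2
        return 0
    fi

    # 3. Only reached when CDI generation succeeded.
    configure_dockerd
}

setup_nvidia_support
echo "exit status: $?"
```

With the failing stand-in, the function prints the warning on stderr, skips the dockerd step, and still returns 0, which is the behaviour that lets the refresh complete.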