kubernetes-sigs / azurelustre-csi-driver

Apache License 2.0

Uninstall existing driver if upgrade fails #149

Closed dabradley closed 10 months ago

dabradley commented 11 months ago

We currently don't have a supported in-place upgrade path for users because of the way our packages are configured. There are a number of ways that the installed packages can get into bad states that apt (and even dpkg directly) refuses to work around without manual intervention.

To work around these bad states, this commit adds a limited number of retries to the package installation step. If a client install fails, it attempts to unload any running Lustre modules, removes the installed packages, and retries the installation.
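The retry flow described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the function name `install_with_retries`, its callable parameters, and the default of three attempts are all hypothetical, and the actual install, unload, and removal steps are passed in rather than spelled out.

```python
from typing import Callable


def install_with_retries(
    install: Callable[[], bool],
    unload_modules: Callable[[], bool],
    remove_packages: Callable[[], bool],
    max_attempts: int = 3,  # hypothetical default; the PR only says "limited retries"
) -> bool:
    """Try to install; after each failed attempt, clean up before retrying.

    Each callable returns True on success and False on failure.
    """
    for attempt in range(1, max_attempts + 1):
        if install():
            return True
        if attempt == max_attempts:
            break  # out of retries, so skip the cleanup
        # Cleanup is best effort in this sketch: a failed unload or removal
        # does not abort the retry loop.
        unload_modules()
        remove_packages()
    return False
```

The caller decides what to do with a `False` result; in this sketch the function itself never raises on a failed install.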

In addition, when the pod is terminated, it will attempt to unload the Lustre modules, but will not fail if it cannot unload them (such as when there are existing pods running which have active Lustre mounts).

If it cannot unload the existing modules, it logs a warning that the new version will likely not be running, but does not fail. Likewise, if it cannot install the desired packages, it logs a warning that the installation did not complete correctly, but still attempts to start the driver.
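The warn-and-continue behavior in the last two paragraphs might look something like the sketch below. The names `unload_on_terminate`, `install_then_start`, and the logger name are hypothetical and chosen for illustration; the real unload, install, and start steps are represented by callables supplied by the caller.

```python
import logging
from typing import Callable

logger = logging.getLogger("lustre-installer")  # hypothetical logger name


def unload_on_terminate(unload_modules: Callable[[], bool]) -> bool:
    """On pod termination, try to unload the modules but never fail."""
    unloaded = unload_modules()
    if not unloaded:
        # For example, other pods may still have active Lustre mounts.
        logger.warning(
            "could not unload Lustre modules; "
            "the new version will likely not be running"
        )
    return unloaded


def install_then_start(
    install_packages: Callable[[], bool],
    start_driver: Callable[[], None],
) -> bool:
    """Warn if the install failed, then start the driver regardless."""
    installed = install_packages()
    if not installed:
        logger.warning("could not install the desired packages correctly")
    start_driver()  # attempted whether or not the install succeeded
    return installed
```

Both functions report the outcome through their return value and a log line only, so a failure in either step does not stop the pod from shutting down or the driver from starting.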

What type of PR is this? /kind bug

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Requirements:

Special notes for your reviewer: Tested this with:

Release note:

none

dabradley commented 10 months ago

@joe-atzinger I'd appreciate it if you could take a look at this one. It would be a nice quality-of-life improvement, since customers seem to get stuck in bad upgrade states a lot.

k8s-ci-robot commented 10 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dabradley, joe-atzinger

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/kubernetes-sigs/azurelustre-csi-driver/blob/main/OWNERS)~~ [joe-atzinger]

Approvers can indicate their approval by writing `/approve` in a comment

Approvers can cancel approval by writing `/approve cancel` in a comment