dabradley closed 10 months ago
@joe-atzinger I'd appreciate it if you could take a look at this one. It would be a nice quality-of-life improvement, since customers often seem to get stuck in bad upgrade states.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: dabradley, joe-atzinger
We currently don't have a supported in-place upgrade path for users because of the way our packages are configured. There are a number of ways that the installed packages can get into bad states that apt (and even dpkg directly) refuses to work around without manual intervention.
To work around this, this commit adds limited retries to the package installation step. If a client install fails, it attempts to unload any running Lustre modules, remove the installed packages, and retry the installation.
In addition, when the pod is terminated, it attempts to unload the Lustre modules, but does not fail if it cannot unload them (for example, when running pods still have active Lustre mounts).
If it cannot unload the existing modules, it will log a warning that the new version will likely not be running, but will not fail. Additionally, if it cannot install the desired packages, it will log a warning that it could not install correctly, but will continue to attempt to start the driver.
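For reviewers, the retry and warn-but-continue behavior described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the function names, the logger name, and the default of three attempts are all hypothetical, and the real install, unload, and remove steps are passed in as callables rather than shown. Treating a failed package removal as non-fatal is also an assumption.

```python
import logging
from typing import Callable

log = logging.getLogger("lustre-installer")  # hypothetical logger name


def try_step(step: Callable[[], None], warning: str) -> bool:
    """Run a cleanup step; log a warning instead of raising if it fails."""
    try:
        step()
        return True
    except Exception as exc:
        log.warning("%s: %s", warning, exc)
        return False


def install_with_retries(
    install: Callable[[], None],
    unload_modules: Callable[[], None],
    remove_packages: Callable[[], None],
    max_attempts: int = 3,  # assumed limit; the description only says "limited retries"
) -> bool:
    """Return True if the install succeeded, False if every attempt failed."""
    for attempt in range(1, max_attempts + 1):
        try:
            install()
            return True
        except Exception as exc:
            log.warning("install attempt %d/%d failed: %s", attempt, max_attempts, exc)
        if attempt < max_attempts:
            # Clean up before retrying; a failed cleanup step only logs a warning.
            try_step(
                unload_modules,
                "could not unload Lustre modules; the new version will likely not be running",
            )
            try_step(remove_packages, "could not remove installed packages")
    # Never raise: the caller still attempts to start the driver.
    log.warning("could not install the desired packages; continuing to start the driver")
    return False
```

The same `try_step` helper could wrap the module unload at pod termination, so that a busy module (active mounts from other pods) produces a warning rather than a failure.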
What type of PR is this? /kind bug
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Requirements:
Special notes for your reviewer: Tested this with:
Release note: