Closed tserong closed 4 years ago
I've labelled this DNM, as I haven't actually tested ceph.maintenance.upgrade yet on Nautilus, and I suspect nobody else has either. I mostly just want to get feedback on two things:
1) Is this check sensible?
2) What's up with this? https://github.com/SUSE/DeepSea/blob/079ad63d6593a9db6cf55a60987a2472aa18351d/srv/modules/runners/upgrade.py#L125-L126 It looks like the colocated_services check was never enabled, but the comment at https://github.com/SUSE/DeepSea/blob/079ad63d6593a9db6cf55a60987a2472aa18351d/srv/modules/runners/upgrade.py#L17-L26 seems to suggest it should be...
I think the check is fine, but should we have more specific error messages for multiple OS or ceph versions? A single minion that is down would also give the "upgrade is not supported".
Good point. Also I just realised that multiple Ceph versions are fine, provided they're within the same major release (imagine someone somehow gets half way through applying minor updates, then something breaks, then they need to run the update again). I'll rework this a bit.
While poking around further I discovered that f2e85228 changed the logic in /srv/salt/ceph/maintenance/upgrade/master/default.sls. Previously, the upgrade would only run if minions.ready, upgrade.check and validate.setup all returned True. Now, since that commit, if minions.ready returns False, the upgrade proceeds anyway, and does not call upgrade.check or validate.setup. @jschmid1 (or anyone else), can you recall the reason for this change?
OK, that's a better check and error messages.
Still need to figure out what's up with f2e8522. I don't think the behaviour introduced there is correct. Possibly the intention was to wait a while if some nodes are down in case they come back, but then to proceed with the upgrade anyway? If that's the case, surely it should still run upgrade.check and validate.setup?
Closing.. Reopen if needed
This re-allows ceph.maintenance.upgrade, provided all nodes are running Nautilus, so that function can be used to apply minor updates again.
Signed-off-by: Tim Serong tserong@suse.com
Checklist: