rancher / elemental-operator

The Elemental operator is responsible for managing the OS versions and maintaining a machine inventory to assist with edge or baremetal installations.
Apache License 2.0

Provide some sort of `cloud-config` for upgrades #849

Open davidcassany opened 1 month ago

davidcassany commented 1 month ago

The idea is to have a mechanism to define some cloud-config for the upgrade. This is relevant in cases where the cloud-config initially provided in the MachineRegistration is no longer valid, or is incomplete, after the upgrade. In such cases it would be handy to be able to apply some additional cloud-config that is honored on the next reboot after the upgrade, or even applied during the upgrade via upgrade hooks.

Consider something like setting a cloud-config in the ManagedOSImage resource that is converted into a YAML file and dropped into the node's /oem partition before the elemental upgrade call starts; this way even upgrade hooks could be an option.

anmazzotti commented 1 month ago

I wondered about this use case too, so I'll drop some input.

My understanding was that the new config should not override the existing ones (for example, the one from the registration). I find this normal, as the registration config probably has "static" things configured that you most likely do not want to change, for example user definitions. If anyone wants to replace that, I would recommend a machine reset. It's the safest option.

We already have the SeedImage.spec.cloud-config to be included in the system after upgrade too, in order to add new configs.

What I think the request wanted was to execute scripts or apply OS configuration changes during the upgrade process. My idea was to add something like SeedImage.spec.evaluateCloudConfig, so that a config can be passed to be evaluated during upgrade. If using yip syntax the user can specify the stage, but if using cloud-init syntax then we should convert to either after-upgrade or before-upgrade by default, I'd say; there is no "during". I would opt for the latter, or we could make it toggleable with an additional field.
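A rough sketch of that stage defaulting, in Go. It assumes cloud-init syntax can be recognized by a leading `#cloud-config` line and that anything else is yip syntax carrying its own stages; the function names and the `runAfter` toggle are hypothetical, standing in for the "additional field" mentioned above.

```go
package main

import (
	"fmt"
	"strings"
)

// Stage names taken from the discussion above.
const (
	stageBeforeUpgrade = "before-upgrade"
	stageAfterUpgrade  = "after-upgrade"
)

// isCloudInitSyntax reports whether the first non-empty line of config is
// the "#cloud-config" header. This detection rule is an assumption.
func isCloudInitSyntax(config string) bool {
	for _, line := range strings.Split(config, "\n") {
		trimmed := strings.TrimSpace(line)
		if trimmed == "" {
			continue
		}
		return trimmed == "#cloud-config"
	}
	return false
}

// upgradeStage returns the stage a cloud-init style config should be mapped
// to. The second result is false when the config is not cloud-init syntax,
// meaning the user is expected to have declared stages themselves.
// runAfter is the hypothetical toggle; the default (false) is before-upgrade.
func upgradeStage(config string, runAfter bool) (string, bool) {
	if !isCloudInitSyntax(config) {
		return "", false
	}
	if runAfter {
		return stageAfterUpgrade, true
	}
	return stageBeforeUpgrade, true
}

func main() {
	stage, ok := upgradeStage("#cloud-config\nusers: []\n", false)
	fmt.Println(stage, ok)
}
```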

Still, this enters the configuration-changes scenario, so my argument, as usual, is to use the reset flow instead.

Edit: Never mind this entire comment. The SeedImage.spec.cloud-config is already run at image boot, so it already does what I described.

davidcassany commented 1 month ago

> We already have the SeedImage.spec.cloud-config to be included in the system after upgrade too, in order to add new configs.

Indeed, SeedImage.spec.cloud-config is already executed at boot. However, SeedImage has nothing to do with upgrades; it is only meaningful for installations and new deployments. The problem is that we are missing a way to modify configuration as part of an upgrade, which could also be read as: apply an upgrade process if you need to modify a configuration for whatever reason.

Also note that this feature leads us to the need to snapshot /oem (and probably other persistent areas) somehow; otherwise rolling back is not really meaningful. This is also something to consider in order to better align with Micro, whose approach is to use persistent overlays for that matter.