coreos / fedora-coreos-docs

Documentation for Fedora CoreOS
https://docs.fedoraproject.org/en-US/fedora-coreos/

Document automated reprovisioning for bare metal #299

Open bgilbert opened 3 years ago

bgilbert commented 3 years ago

We want to encourage users to reprovision their bare-metal nodes when their config changes, but we don't document how to do it. In the long run, https://github.com/coreos/fedora-coreos-tracker/issues/399 might be the right approach, but in the short term, other mechanisms are required. This is arguably part of https://github.com/coreos/fedora-coreos-docs/issues/117, but I'm filing it separately to distinguish the higher-level workflow issue from the nuts-and-bolts topics in #117.

There are a few possible reprovisioning flows:

Whichever flows we decide to recommend, we should provide step-by-step instructions.

bgilbert commented 3 years ago

We should also document how to preserve data volumes when reprovisioning. The traditional model is to do this with Ignition, but there are some quirks and they're not documented in one place:

This procedure assumes the partition layout in the new Ignition config matches the old one. If not, Ignition may clobber the data volumes. It also assumes that the original layout left at least the documented 8 GiB for the root filesystem; if not, reprovisioning with a newer OS version may clobber the first data volume on the boot disk due to coreos/fedora-coreos-tracker#586.
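As a rough pre-flight check for the 8 GiB assumption above, something like the following could be run on the node before reprovisioning. This is only a sketch: the function name is made up for illustration, and reading the size via the `root` partition label with `lsblk` is an assumption about how the existing system is laid out.

```shell
#!/bin/sh
# Minimum root partition size from the paragraph above: 8 GiB, in bytes.
MIN_ROOT_BYTES=$((8 * 1024 * 1024 * 1024))

# Hypothetical helper: succeeds if the given size in bytes is at least 8 GiB.
root_large_enough() {
    [ "$1" -ge "$MIN_ROOT_BYTES" ]
}

# On a real node the size could be read with something like (assumption:
# the root partition carries the partition label "root"):
#   size=$(lsblk -bdno SIZE /dev/disk/by-partlabel/root)
size=10737418240   # example value only: 10 GiB

if root_large_enough "$size"; then
    echo "root partition is at least 8 GiB"
else
    echo "root partition is smaller than 8 GiB; data volumes may be at risk"
fi
```

A passing check says nothing about whether the partition layout in the new Ignition config matches the old one; that still has to be compared by hand.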

For additional safety, coreos-installer provides a partition-saving mechanism (--save-part{label,index} / coreos.inst.save_part{label,index}) which explicitly preserves specified partitions during a reinstall. This is safe against storage.disks desynchronization (since Ignition won't clobber the saved partitions unless told to) and against coreos/fedora-coreos-tracker#586 (since coreos-installer won't overwrite a saved partition). Partition saving uses the same Ignition configs as in the traditional model; for fresh installs, coreos-installer will ignore requests to save partitions that don't exist. Downsides of this approach are that the installer needs to be explicitly told about data partitions that are expected to exist on the boot disk, and that fresh installs may need to explicitly wipe the boot disk to avoid preserving old data.
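A reinstall using the partition-saving mechanism described above might look roughly like this. It is a sketch under stated assumptions: the device path, the config file name, and the `data*` label glob are placeholders, and the `run`/`reprovision` helpers are made up for illustration. The script only prints the command unless `DRY_RUN=0` is set, since the real command overwrites the disk.

```shell
#!/bin/sh
# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around coreos-installer.
#   $1: destination disk (placeholder, e.g. /dev/sda)
#   $2: Ignition config, the same one used in the traditional model
#   $3: partition label (or glob) of the data partitions to preserve
reprovision() {
    run coreos-installer install "$1" \
        --ignition-file "$2" \
        --save-partlabel "$3"
}

reprovision /dev/sda config.ign 'data*'
```

`--save-partindex` would be used the same way when the data partitions are identified by number rather than by label. For installs driven by kernel arguments, the equivalents named above are `coreos.inst.save_partlabel` and `coreos.inst.save_partindex`.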

stereobutter commented 2 years ago

We are currently working on a POC of using FCOS for IIoT in industrial automation. Device provisioning is currently done via ISO images with an embedded Ignition config, and we are quite happy with this, as it is much easier to set up than a working PXE environment.

This is doubly true for reprovisioning devices in the field, where we don't control the network at all. There we would rather (a) reprovision a machine interactively by executing some flavor of coreos-installer install ... over SSH, or (b) prepare a magic USB stick that does the same.

From what I gather, this is already possible with coreos-installer as is (using the partition-saving mechanism etc.)? In any case, an explicit API (maybe as part of coreos-installer) that resets/reprovisions a machine (with some options to control what to keep and what to reset/override/append) would be a welcome addition.

bgilbert commented 2 years ago

@stereobutter There's some POC work happening in https://github.com/coreos/coreos-installer/pull/712. For the "magic USB stick" case, you may also be interested in the new coreos-installer iso customize command, which should make such images easier to create.
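For the "magic USB stick" case, building such an image with `coreos-installer iso customize` might look roughly like the sketch below. The file names and the destination device are placeholders, and the `run`/`make_usb_image` helpers are made up for illustration. The script prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around "coreos-installer iso customize".
#   $1: input live ISO
#   $2: Ignition config for the installed system
#   $3: disk on the target machine to install to (placeholder)
#   $4: output ISO to write to the USB stick
make_usb_image() {
    run coreos-installer iso customize \
        --dest-device "$3" \
        --dest-ignition "$2" \
        -o "$4" "$1"
}

make_usb_image fedora-coreos-live.iso config.ign /dev/sda reprovision.iso
```

As far as I understand, setting `--dest-device` makes the customized image run the installer automatically on boot, so a stick made this way should only be booted on machines that are meant to be wiped. Partition saving is not shown here; how to pass it through a customized image should be checked against the coreos-installer documentation.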

runiq commented 10 months ago

storage.raid: probably safest to avoid, as Ignition always recreates the RAID. Recreating a RAID with the same parameters might preserve data, but I don't know if this has been tested.

It doesn't preserve data if a disk has been replaced; I tested this.