coreos / fedora-coreos-tracker

Issue tracker for Fedora CoreOS
https://fedoraproject.org/coreos/

Add a data persistence test #1763

Open jlebon opened 1 month ago

jlebon commented 1 month ago

We claim to support data persistence across reprovisions:

We should have a test for it. This would've caught https://github.com/coreos/fedora-coreos-tracker/issues/1745.
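As a rough illustration of the kind of check such a test could perform, here is a minimal shell sketch. It is not an actual kola test: the function name, marker file name, and marker content are all hypothetical, and how the machine gets reprovisioned between the two calls is left out entirely.

```shell
#!/bin/bash
# Hypothetical persistence check; not part of any existing test suite.

# check_persistence DIR
# First provision: creates a marker file under DIR and prints "created".
# Later provisions: prints "persisted" if the marker survived intact,
# or prints "corrupted" and returns non-zero if its content changed.
check_persistence() {
    local dir=$1
    local marker="$dir/persistence-marker"   # hypothetical file name
    local expected="fcos-persistence-test"   # hypothetical content

    if [ ! -e "$marker" ]; then
        printf '%s\n' "$expected" > "$marker"
        echo "created"
        return 0
    fi

    if [ "$(cat "$marker")" = "$expected" ]; then
        echo "persisted"
    else
        echo "corrupted"
        return 1
    fi
}
```

The idea would be to call it once on the first provision and once after reprovisioning, pointing it at a directory on the filesystem that is supposed to be reused, and to fail the test unless the second call reports "persisted".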

Nemric commented 1 month ago

Hi Jonathan, I carefully read this link https://docs.fedoraproject.org/en-US/fedora-coreos/live-booting/#_using_persistent_state and it says:

Avoid writing persistent data to RAID volumes, since Ignition cannot reuse those.

I use RAID like this:

variant: fcos
version: 1.5.0

storage:
  raid:
    - name: Raid
      level: mirror
      devices:
        - /dev/disk/by-id/ata-WDC_WD10SPZX-80Z10T2_WD-WX41A49H9FT4
        - /dev/disk/by-id/ata-WDC_WD10SPZX-80Z10T2_WD-WXL1A49KPYFD
      options:
        - --metadata=1.2
        - --assume-clean
        - --uuid=7ec8d4df:823fae52:c55d5e56:e773b281

  filesystems:
    - path: /var
      device: /dev/md/Raid
      format: xfs
      label: Var
      wipe_filesystem: false
      with_mount_unit: true

I now use only PXE-booted nodes (5 nodes). All of them have persistent storage without encryption, and one of them has mirrored RAID drives. All nodes are re-provisioned at least every 2 weeks (CoreOS updates), and it has worked well since ... let's say ... close to the first stable CoreOS release.

I didn't use these arguments the first time I used this Butane file, only for reboots:

The --assume-clean setting avoids a rebuild of the RAID array on reboot. Fortunately, a check is still done weekly, on Sunday morning, thanks to raid-check.service and raid-check.timer.

After the first boot, I ran mdadm --detail --scan to get the array UUID, then set it in the Butane file to avoid conflicts (can't say if I'm right :D): --uuid=7ec8d4df:823fae52:c55d5e56:e773b281
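For anyone wanting to do the same, here is a small sketch of pulling the UUID out of `mdadm --detail --scan` style output. The `array_uuid` helper is hypothetical, and the sample line is illustrative only: it reuses the UUID and device path from this comment, but the `name=` value is made up.

```shell
#!/bin/bash
# array_uuid DEVICE
# Reads `mdadm --detail --scan` style lines on stdin and prints the UUID
# of the array whose device path equals DEVICE. Hypothetical helper.
array_uuid() {
    local device=$1
    awk -v dev="$device" '
        $1 == "ARRAY" && $2 == dev {
            for (i = 3; i <= NF; i++)
                if ($i ~ /^UUID=/) { sub(/^UUID=/, "", $i); print $i }
        }'
}

# Illustrative input only; the name= field is a made-up example value.
sample='ARRAY /dev/md/Raid metadata=1.2 name=any:Raid UUID=7ec8d4df:823fae52:c55d5e56:e773b281'
printf '%s\n' "$sample" | array_uuid /dev/md/Raid
```

On a real node the input would come from something like `sudo mdadm --detail --scan | array_uuid /dev/md/Raid`, and the printed value is what goes after `--uuid=` in the Butane `options` list.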

I can't remember if I really used `--metadata=1.2` at first boot :/

Of course, I have no spare drives, and I don't know what will happen when the current ones die! At the same time?! Same provider, same purchase date, same part number, ...

So I think we can use a RAID device with the live environment, but at our own risk ;)

jlebon commented 1 month ago

I think that doc item is talking about https://github.com/coreos/ignition/issues/579. The workaround above probably belongs there instead.

So I think we can use a RAID device with the live environment, but at our own risk ;)

Yeah, agreed.