Closed: runiq closed this issue 11 months ago.
Moving to the tracker as this is about Fedora CoreOS.
Hi,
I just skimmed your Butane config, but at first sight I noticed that you don't use the `with_mount_unit: true` option in the `filesystems` section.
Here is my config:
```yaml
filesystems:
  - path: /var
    device: /dev/md/Raid
    format: xfs
    label: Var
    wipe_filesystem: false
    with_mount_unit: true
```
I use `wipe_filesystem: false` because the system runs from live boot / PXE.
> Hi, I just skimmed your Butane config, but at first sight I noticed that you don't use the `with_mount_unit: true` option in the `filesystems` section.
I'm using a dedicated mount unit (in the `systemd.units` section) instead. If I use `filesystems:` and `raid:` entries like you suggest, the data on `/var` is lost when I replace a disk in the RAID and then reprovision the system. For me this is an important consideration, because I'd like to run FCOS on a NAS. Since this will be a long-running system, reprovisioning will probably happen at some point, so I'm trying to account for it.
I imagine part of the problem here is that systemd-tmpfiles runs before your `/var` is mounted, so necessary files for some services don't get created. I wonder if some of the problems go away if you reboot the system so that systemd-tmpfiles runs again.
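One way to test that theory without a full reboot, assuming `/var` is already mounted, is to re-run the tmpfiles pass by hand (a sketch, not a command from this thread):

```bash
# Sketch: re-apply the tmpfiles.d rules once /var is mounted.
sudo systemd-tmpfiles --create
# Check whether the expected directories now exist:
ls -ld /var/lib /var/lib/systemd /var/home
```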
Do you have a boot log for the system that you could share?
I really think you should use `with_mount_unit: true`, and if you want to reprovision your system you should use `wipe_filesystem: false`.
As far as I can remember, when I set up my PXE-booted server for the first time, I wiped/formatted/created everything to get a clean RAID, and after that I changed my Butane/Ignition config to deal with a persistent RAID `/var`.
Here is my current RAID config:
```yaml
storage:
  raid:
    - name: Raid
      level: mirror
      devices:
        - /dev/disk/by-id/ata-WDC_WD10SPZX-80Z10T2_WD-WX41A49H9FT4
        - /dev/disk/by-id/ata-WDC_WD10SPZX-80Z10T2_WD-WXL1A49KPYFD
      options:
        - --metadata=1.2
        - --assume-clean
        - --uuid=7ec8d4df:823fae52:c55d5e56:e773b281
```
`--metadata=1.2` and `--uuid=7ec8d4df:823fae52:c55d5e56:e773b281` are values I got from some mdadm.conf or similar ^^ (as far as I can remember...).
`--assume-clean` lets my server boot without a full RAID check/build/sync.
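A sketch of how one might recover those values from a running system (the array and device names are the ones from the config above):

```bash
# Sketch: read the metadata version and array UUID off a running system.
mdadm --detail --scan
# => ARRAY /dev/md/Raid metadata=1.2 UUID=7ec8d4df:823fae52:c55d5e56:e773b281 ...

# The same values are stored in the superblock of each member device:
mdadm --examine /dev/disk/by-id/ata-WDC_WD10SPZX-80Z10T2_WD-WX41A49H9FT4
```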
@Nemric That was actually what I tried originally, and I would absolutely love it if it worked!
Unfortunately, once I replace a disk in the RAID, none of the options you suggest help. Upon reprovisioning, the RAID always gets recreated; Ignition (or rather, libblkid) no longer recognizes the filesystem on top and recreates it, and all the data is gone.
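For reference, a sketch of how to inspect what libblkid reports (device names are the ones from the configs in this thread):

```bash
# Sketch: member partitions should identify as RAID members,
# and the assembled array should carry the filesystem signature.
blkid /dev/disk/by-partlabel/var-1   # expect TYPE="linux_raid_member"
blkid /dev/md/md-var                 # expect TYPE="xfs" once assembled
```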
@dustymabe That appears to have helped with a lot of things, thanks! Unfortunately the `core` user's home directory is still not there after a reboot. I haven't had much time to dig deeper; I'll try to get a boot log and more info tomorrow.
> Unfortunately the `core` user's home directory is still not there after a reboot.
That's probably because Ignition is what would create that directory (under `/var/home`), and Ignition runs in the initramfs.
Okay, I think I solved this. I'm not getting errors if I do the following dance right after Ignition runs:

1. Set up a temporary mount point for my own `var` partition (e.g. `/mnt/var`)
2. Punt the systemd journal to `/run` with `journalctl --relinquish-var`
3. Move everything from `/sysroot/ostree/deploy/fedora-coreos/var` to `/mnt/var`
4. _Actually_ mount `/var` (using `var.mount` from the OP)
5. Move the journal back to `/var` with `journalctl --flush`
This also works after a reboot, since the `ExecCondition` check only creates the RAID if it cannot assemble it. The journal punt is required to keep the logs of the first boot; otherwise they'd be 'lost' (or rather, written to someplace in `/sysroot`, probably).
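Roughly, as a shell sketch (`/dev/md/md-var` and `var.mount` are the names used elsewhere in this thread and may differ on your system):

```bash
# Sketch of the post-Ignition 'dance' described above.
mkdir -p /mnt/var
mount /dev/md/md-var /mnt/var                # 1. temporary mount point
journalctl --relinquish-var                  # 2. journal writes go to /run
cp -a /sysroot/ostree/deploy/fedora-coreos/var/. /mnt/var/   # 3. (the OP moves rather than copies)
umount /mnt/var
systemctl start var.mount                    # 4. actually mount /var
journalctl --flush                           # 5. journal moves back to /var
```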
Here is an updated Butane config that incorporates all this: `repro.bu`
In the end, all this amounts to is a check to see if we can assemble the RAID instead of unconditionally (re)creating it.
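In unit terms, that check might look something like this (a sketch, not the exact unit from `repro.bu`; device names are the ones from the repro config):

```bash
# Sketch: assemble the existing array if its members are present;
# only create it (and the filesystem) when assembly fails.
if ! mdadm --assemble /dev/md/md-var \
    /dev/disk/by-partlabel/var-1 /dev/disk/by-partlabel/var-2; then
  echo yes | mdadm --create md-var --homehost=any --level=raid1 \
    --raid-devices=2 \
    /dev/disk/by-partlabel/var-1 /dev/disk/by-partlabel/var-2
  mkfs.xfs -L var /dev/md/md-var
fi
```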
> Okay, I think I solved this. I'm not getting errors if I do the following dance right after Ignition runs:
>
> 1. Set up a temporary mount point for my own `var` partition (e.g. `/mnt/var`)
> 2. Punt the systemd journal to `/run` with `journalctl --relinquish-var`
> 3. Move everything from `/sysroot/ostree/deploy/fedora-coreos/var` to `/mnt/var`
> 4. _Actually_ mount `/var` (using `var.mount` from the OP)
> 5. Move the journal back to `/var` with `journalctl --flush`
That's a somewhat complicated solution, isn't it?
I think this should probably be filed as an issue against Ignition: it shouldn't treat a degraded RAID device the same as no RAID device. E.g. it would have to check whether at least one of the devices in the list is a member of a RAID array with the wanted properties.
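A sketch of what such a membership check could be based on (using one of the partition labels from the repro config):

```bash
# Sketch: mdadm --examine reads the RAID superblock from a member device
# and exits non-zero when none is present, so it can act as an
# 'is this already an array member?' test.
if mdadm --examine /dev/disk/by-partlabel/var-1 >/dev/null 2>&1; then
  echo "var-1 is already a RAID member"
fi
```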
The Ignition issue is linked in the OP: https://github.com/coreos/ignition/issues/579
Bug
I want to use mirrored boot disks, but I want to put `/var` on a separate partition persistently. Because persistent RAID partitions don't seem to be supported out of the box, I'm following the advice in this comment to create the RAID + filesystem for `/var` in my own systemd unit. While this works in principle, it seems Ignition requires that my `/var` filesystem be available while it is running. Is this somehow possible to achieve?

Operating System Version
Fedora CoreOS 38.20231002.3.1
Ignition Version
3.4.0
Environment
Libvirt on Fedora Kinoite 38.20231103.0
Expected Behavior

- `/var` is populated correctly
- `core` user has a home directory

Actual Behavior
It looks like my `/var` filesystem must be available during the Ignition run for the system to be functional. After Ignition runs its course (without apparent error), my `/var` only seems to be 'half-populated,' so to speak. These are the issues I see:

- Nothing gets created under `/var/lib` (presumably because the `lib` directory does not exist), e.g. `/var/lib/systemd`
- The `core` user ends up in `/` because its home directory does not exist

Reproduction Steps
Use the following Butane file (`repro.bu`):

```yaml
variant: "fcos"
version: "1.5.0"
boot_device:
  mirror:
    devices:
      - "/dev/disk/by-id/virtio-root-1"
      - "/dev/disk/by-id/virtio-root-2"
systemd:
  units:
    - name: "serial-getty@ttyS0.service"
      dropins:
        - name: "autologin-core.conf"
          contents: |
            [Service]
            # Override Execstart in main unit
            ExecStart=
            # Add new Execstart with `-` prefix to ignore failure
            ExecStart=-/usr/sbin/agetty --autologin core --noclear %I $TERM
            TTYVTDisallocate=no
    - name: "create-var.service"
      enabled: true
      contents: |
        [Unit]
        Description=Create md-var RAID and var filesystem
        DefaultDependencies=no

        # We 'slot' this in between the component devices of the RAID volume
        # and the /var mount:
        After=local-fs-pre.target
        After=dev-disk-by\x2dpartlabel-var\x2d1.device
        After=dev-disk-by\x2dpartlabel-var\x2d2.device
        Before=systemd-fsck@dev-md-md\x2dvar.service
        Before=var.mount

        # The RAID itself and the filesystem on it should NOT yet exist for this
        # unit to run
        ConditionPathExists=!/dev/md/md-var
        ConditionPathExists=!/dev/disk/by-label/var

        [Service]
        Type=oneshot
        RemainAfterExit=yes
        ExecStart=/usr/bin/bash -c 'echo yes | /usr/sbin/mdadm --create md-var \
          --verbose \
          --homehost=any \
          --level=raid1 \
          --raid-devices=2 \
          /dev/disk/by-partlabel/var-1 \
          /dev/disk/by-partlabel/var-2'
        ExecStart=/usr/bin/bash -c 'ls -l /var 1>&2'
        # mkfs.xfs fails if there is already a filesystem present on the device
        ExecStart=/usr/sbin/mkfs.xfs -Lvar /dev/md/md-var

        [Install]
        WantedBy=dev-md-md\x2dvar.device
    - name: "var.mount"
      enabled: false
      contents: |
        [Unit]
        Requires=systemd-fsck@dev-md-md\x2dvar.service
        After=systemd-fsck@dev-md-md\x2dvar.service

        [Mount]
        What=/dev/md/md-var
        Where=/var
        Type=xfs
        Options=defaults,strictatime,lazytime,prjquota

        [Install]
        RequiredBy=local-fs.target
storage:
  disks:
    - device: "/dev/disk/by-id/virtio-root-1"
      partitions:
        - label: "esp-1"
          wipe_partition_entry: true
        - label: "boot-1"
          wipe_partition_entry: true
        - label: "root-1"
          size_mib: 8400
          wipe_partition_entry: true
        - label: "var-1"
          wipe_partition_entry: false
          type_guid: "A19D880F-05FC-4D3B-A006-743F0F84911E"
    - device: "/dev/disk/by-id/virtio-root-2"
      partitions:
        - label: "esp-2"
          wipe_partition_entry: true
        - label: "boot-2"
          wipe_partition_entry: true
        - label: "root-2"
          size_mib: 8400
          wipe_partition_entry: true
        - label: "var-2"
          wipe_partition_entry: false
          type_guid: "A19D880F-05FC-4D3B-A006-743F0F84911E"
  filesystems:
    - device: "/dev/md/md-boot"
      wipe_filesystem: true
    - device: "/dev/md/md-root"
      wipe_filesystem: true
      format: "xfs"
```
Compile it to Ignition:
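A sketch of the compile step, assuming the Butane CLI is installed locally (the containerized invocation from the Butane docs works the same way):

```bash
# Sketch: compile the Butane config to an Ignition config.
# --strict makes warnings fatal; --pretty pretty-prints the JSON.
butane --pretty --strict repro.bu --output repro.ign
```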
The resulting `repro.ign`:

```json
{
  "ignition": {
    "version": "3.4.0"
  },
  "storage": {
    "disks": [
      {
        "device": "/dev/disk/by-id/virtio-root-1",
        "partitions": [
          {
            "label": "bios-1",
            "sizeMiB": 1,
            "typeGuid": "21686148-6449-6E6F-744E-656564454649"
          },
          {
            "label": "esp-1",
            "sizeMiB": 127,
            "typeGuid": "C12A7328-F81F-11D2-BA4B-00A0C93EC93B",
            "wipePartitionEntry": true
          },
          {
            "label": "boot-1",
            "sizeMiB": 384,
            "wipePartitionEntry": true
          },
          {
            "label": "root-1",
            "sizeMiB": 8400,
            "wipePartitionEntry": true
          },
          {
            "label": "var-1",
            "typeGuid": "A19D880F-05FC-4D3B-A006-743F0F84911E",
            "wipePartitionEntry": false
          }
        ],
        "wipeTable": true
      },
      {
        "device": "/dev/disk/by-id/virtio-root-2",
        "partitions": [
          {
            "label": "bios-2",
            "sizeMiB": 1,
            "typeGuid": "21686148-6449-6E6F-744E-656564454649"
          },
          {
            "label": "esp-2",
            "sizeMiB": 127,
            "typeGuid": "C12A7328-F81F-11D2-BA4B-00A0C93EC93B",
            "wipePartitionEntry": true
          },
          {
            "label": "boot-2",
            "sizeMiB": 384,
            "wipePartitionEntry": true
          },
          {
            "label": "root-2",
            "sizeMiB": 8400,
            "wipePartitionEntry": true
          },
          {
            "label": "var-2",
            "typeGuid": "A19D880F-05FC-4D3B-A006-743F0F84911E",
            "wipePartitionEntry": false
          }
        ],
        "wipeTable": true
      }
    ],
    "filesystems": [
      {
        "device": "/dev/disk/by-partlabel/esp-1",
        "format": "vfat",
        "label": "esp-1",
        "wipeFilesystem": true
      },
      {
        "device": "/dev/disk/by-partlabel/esp-2",
        "format": "vfat",
        "label": "esp-2",
        "wipeFilesystem": true
      },
      {
        "device": "/dev/md/md-boot",
        "format": "ext4",
        "label": "boot",
        "wipeFilesystem": true
      },
      {
        "device": "/dev/md/md-root",
        "format": "xfs",
        "label": "root",
        "wipeFilesystem": true
      }
    ],
    "raid": [
      {
        "devices": [
          "/dev/disk/by-partlabel/boot-1",
          "/dev/disk/by-partlabel/boot-2"
        ],
        "level": "raid1",
        "name": "md-boot",
        "options": [
          "--metadata=1.0"
        ]
      },
      {
        "devices": [
          "/dev/disk/by-partlabel/root-1",
          "/dev/disk/by-partlabel/root-2"
        ],
        "level": "raid1",
        "name": "md-root"
      }
    ]
  },
  "systemd": {
    "units": [
      {
        "dropins": [
          {
            "contents": "[Service]\n# Override Execstart in main unit\nExecStart=\n# Add new Execstart with `-` prefix to ignore failure\nExecStart=-/usr/sbin/agetty --autologin core --noclear %I $TERM\nTTYVTDisallocate=no\n",
            "name": "autologin-core.conf"
          }
        ],
        "name": "serial-getty@ttyS0.service"
      },
      {
        "contents": "[Unit]\nDescription=Create md-var RAID and var filesystem\nDefaultDependencies=no\n\n# We 'slot' this in between the component devices of the RAID volume\n# and the /var mount:\nAfter=local-fs-pre.target\nAfter=dev-disk-by\\x2dpartlabel-var\\x2d1.device\nAfter=dev-disk-by\\x2dpartlabel-var\\x2d2.device\nBefore=systemd-fsck@dev-md-md\\x2dvar.service\nBefore=var.mount\n\n# The RAID itself and the filesystem on it should NOT yet exist for this\n# unit to run\nConditionPathExists=!/dev/md/md-var\nConditionPathExists=!/dev/disk/by-label/var\n\n[Service]\nType=oneshot\nRemainAfterExit=yes\nExecStart=/usr/bin/bash -c 'echo yes | /usr/sbin/mdadm --create md-var \\\n\t--verbose \\\n\t--homehost=any \\\n\t--level=raid1 \\\n\t--raid-devices=2 \\\n\t/dev/disk/by-partlabel/var-1 \\\n\t/dev/disk/by-partlabel/var-2'\nExecStart=/usr/bin/bash -c 'ls -l /var 1\u003e\u00262'\n# mkfs.xfs fails if there is already a filesystem present on the device\nExecStart=/usr/sbin/mkfs.xfs -Lvar /dev/md/md-var\n\n[Install]\nWantedBy=dev-md-md\\x2dvar.device\n",
        "enabled": true,
        "name": "create-var.service"
      },
      {
        "contents": "[Unit]\nRequires=systemd-fsck@dev-md-md\\x2dvar.service\nAfter=systemd-fsck@dev-md-md\\x2dvar.service\n\n[Mount]\nWhat=/dev/md/md-var\nWhere=/var\nType=xfs\nOptions=defaults,strictatime,lazytime,prjquota\n\n[Install]\nRequiredBy=local-fs.target\n",
        "enabled": false,
        "name": "var.mount"
      }
    ]
  }
}
```
Create the VM using libvirt (adapted from the documentation):
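A sketch of that step, adapted from the FCOS provisioning docs; `IMAGE` and `IGNITION_CONFIG` are placeholder paths, and the disk serials are chosen so the `/dev/disk/by-id/virtio-root-*` names from the config resolve:

```bash
# Sketch: create the test VM with two mirrored disks (placeholder paths).
IMAGE="/var/lib/libvirt/images/fedora-coreos-qemu.x86_64.qcow2"
IGNITION_CONFIG="/var/lib/libvirt/images/repro.ign"
virt-install --name=fcos-raid-repro --vcpus=2 --memory=2048 \
  --os-variant=fedora-coreos-stable --import --graphics=none \
  --disk="size=10,backing_store=${IMAGE},serial=root-1" \
  --disk="size=10,serial=root-2" \
  --qemu-commandline="-fw_cfg name=opt/com.coreos/config,file=${IGNITION_CONFIG}"
```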