Closed dustymabe closed 1 month ago
We think implementing this will make issues like https://github.com/openshift/os/issues/1504 go away.
This will fix https://github.com/coreos/coreos-assembler/issues/3801
So coreos.unique.boot.ignition.failure is failing here in ci/prow/rhcos. I can also reproduce this locally.
It looks like the Ensure Unique 'boot' Filesystem Label step in the console is happening before Ignition even runs, but the coreos.unique.boot.ignition.failure test adds a boot-labeled filesystem using Ignition.
Is it really required for this unit to run after Ignition is complete? I guess so since that's a test case we want to cover, but we'll probably have to strengthen the unit dependencies.
Ahh. Interestingly enough, we have two units that check if boot is unique, but they both have very similar descriptions, so the log messages are hard to distinguish to the untrained eye.
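One way to tell the two units apart is to group journal entries by unit name instead of by description. Here is a minimal sketch, assuming the journal was exported with `journalctl -o json` (one JSON object per line) and that the entries carry the usual `UNIT` or `_SYSTEMD_UNIT` fields. The helper name `messages_by_unit` is made up for illustration.

```python
import json
from collections import defaultdict


def messages_by_unit(journal_json_lines):
    """Group journal messages by unit name rather than by description.

    Assumes one JSON object per line, as produced by `journalctl -o json`.
    systemd's own "Starting/Finished ..." messages usually carry the unit in
    UNIT; messages logged by the service itself carry it in _SYSTEMD_UNIT.
    """
    grouped = defaultdict(list)
    for raw in journal_json_lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines in the export
        entry = json.loads(raw)
        unit = entry.get("UNIT") or entry.get("_SYSTEMD_UNIT") or "<unknown>"
        grouped[unit].append(entry.get("MESSAGE", ""))
    return dict(grouped)
```

With that, the messages from coreos-ignition-unique-boot.service end up under their own key, no matter how similar the human-readable descriptions are.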
So it looks like maybe the coreos-ignition-unique-boot.service somehow isn't doing its job here.
ok I think I found the real error earlier up in the log:
[^[[0;32m OK ^[[0m] Finished ^[[0;1;39mGenerate New UUID For Boot Disk GPT^[[0m.^M
[ 4.126289] systemd[1]: Finished Generate New UUID For Boot Disk GPT.^M
[ 4.137089] ignition-ostree-transposefs[908]: Moving bootfs to RAM...^M
Starting ^[[0;1;39mIgnition OSTree: Save Partitions^[[0m...[ 4.137822] systemd[1]: Starting Ignition OSTree: Save Partitions...^M
^M
[ 4.140833] ignition-ostree-transposefs[908]: Mounting /dev/disk/by-label/boot ro (/dev/vdb3) to /var/tmp/mnt^M
Starting ^[[0;1;39mIgnition OSTree: …rate Filesystem UUID (boot)^[[0m...[ 4.144472] systemd[1]: Starting Ignition OSTree: Regenerate Filesystem UUID (boot)...^M
^M
[ 4.155819] ignition-ostree-firstboot-uuid[918]: e2fsck 1.46.5 (30-Dec-2021)^M
[ 4.155947] ignition-ostree-firstboot-uuid[918]: /dev/disk/by-label/boot is in use.^M
[ 4.155973] ignition-ostree-firstboot-uuid[918]: e2fsck: Cannot continue, aborting.^M
[ 4.158304] EXT4-fs (vdb3): mounted filesystem 96d15588-3596-4b3c-adca-a2ff7279ea63 ro with ordered data mode. Quota mode: none.^M
[^[[0;1;31mFAILED^[[0m] Failed to start ^[[0;1;39mIgnition O…nerate Filesystem UUID (boot)^[[0m.^M
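For digging through console logs like the one above, here is a rough sketch of a filter that strips the caret-notation color codes (`^[[...m`) and trailing `^M`, then pulls out the units systemd reported as FAILED. The function names are hypothetical, and the regex only covers the escape forms that show up in this paste.

```python
import re

# Caret-notation escape sequences as they appear in the pasted console log,
# e.g. "^[[0;32m" (ESC [ ... m), plus the "^M" carriage-return marker.
CARET_ANSI = re.compile(r"\^\[\[[0-9;]*m")
CARET_CR = re.compile(r"\^M")

# Matches the status line systemd prints on the console for a failed unit.
FAILED_LINE = re.compile(r"\[FAILED\]\s+Failed to start (.+?)\.?$")


def clean(line):
    """Strip caret-notation color codes and carriage returns from one line."""
    return CARET_CR.sub("", CARET_ANSI.sub("", line)).strip()


def failed_units(lines):
    """Return the (possibly ellipsized) descriptions of units marked FAILED."""
    found = []
    for raw in lines:
        match = FAILED_LINE.match(clean(raw))
        if match:
            found.append(match.group(1))
    return found
```

Note that the console truncates long descriptions, so the result is only as precise as the log line itself.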
I think the CI issue is roughly:
- metadata_csum_seed is still off by default in el9
- without the metadata_csum_seed feature we'd have to serialize things more to ensure there are no mounts happening at the same time
But also, this would make us enter a completely different path in ignition-ostree-firstboot-uuid.
I think it'd be safer to still enable the feature? If we just hard enable it, then we can simplify ignition-ostree-firstboot-uuid.
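To make the two paths concrete, here is a small sketch of the check involved. It is not the actual logic in ignition-ostree-firstboot-uuid; it just parses the "Filesystem features:" line that `tune2fs -l` prints. The assumption, based on my reading of the e2fsprogs docs, is that a filesystem with metadata_csum but without metadata_csum_seed needs the slower offline path (e2fsck first, nothing mounted) to change its UUID. The function names are hypothetical.

```python
def ext4_features(tune2fs_output):
    """Return the set of feature names from `tune2fs -l` style output."""
    for line in tune2fs_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Filesystem features":
            return set(value.split())
    return set()  # no features line found


def needs_offline_uuid_change(tune2fs_output):
    """True if changing the UUID would need the unmounted e2fsck path.

    Assumption: with metadata_csum the checksums are seeded from the UUID,
    unless metadata_csum_seed stores the seed separately in the superblock.
    """
    feats = ext4_features(tune2fs_output)
    return "metadata_csum" in feats and "metadata_csum_seed" not in feats
```

If the feature is always enabled at image build time, the `True` branch never happens on first boot, which is what would let the script drop the e2fsck path.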
CI is fixed and all comments should be addressed now.
With this we now use a buildroot that is derived from the OCI container that was built by the pipeline. This allows us to use the exact same versions of software from the payload we built when we construct the images that we will ship, which will be better for us over time.
The benefits of this are immediately apparent in this commit as we are able to drop configuration that tries to set feature flags for our ext4 filesystems based on what we think are the current defaults in RHEL.
For now we aren't able to do this with FCOS because FCOS doesn't have python in it. This should be OK for now because COSA is almost always based on the latest version of Fedora. Though one benefit we would have if we did switch to doing this for FCOS is that we would test newer versions of "build tools" from rawhide alongside the rawhide pipeline builds that we do.