Open allisonkarlitskaya opened 1 day ago
The obvious advantage of this: we could stop thinking about all of these artifact/hidden-composefs-meta-layer hacks. The boot resources would just be in /boot in BLS type 1 or type 2.
In current bootc standards we don't have anything in /boot: it's the job of the thing deploying the image to copy any necessary data into /boot on the target (a mix of bootc (really libostree) for the kernel, and bootupd for other stuff), and I think this is right. This stuff may be on separate partitions, and most importantly, if we want to have more than one bootable container installed (and we clearly do), then what's in /boot in the container can't be a source of truth. systemd-boot, for example, already does this right: the binaries live in /usr, and bootctl install copies them, upgrades them, etc.
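To make the "deployer copies from /usr into the target /boot" pattern concrete, here is a minimal sketch. It is not how bootc, libostree, or bootupd actually do it. The function name `install_kernel`, the per-deployment directory layout under /boot, and the `deployment_id` parameter are all hypothetical. It also assumes the kernel and initramfs sit at `usr/lib/modules/<kver>/vmlinuz` and `initramfs.img` inside the image root.

```python
import shutil
from pathlib import Path


def install_kernel(image_root: Path, boot_dir: Path, deployment_id: str) -> list[Path]:
    """Copy kernel artifacts from an image's /usr into the target /boot.

    Each deployment gets its own subdirectory (a hypothetical layout), so
    several bootable images can be installed side by side without any of
    them treating its own /boot as the source of truth.
    """
    copied: list[Path] = []
    modules = image_root / "usr" / "lib" / "modules"
    for kver_dir in sorted(p for p in modules.iterdir() if p.is_dir()):
        dest = boot_dir / deployment_id / kver_dir.name
        # Assumed file names; an image without an initramfs is tolerated.
        for name in ("vmlinuz", "initramfs.img"):
            src = kver_dir / name
            if src.is_file():
                dest.mkdir(parents=True, exist_ok=True)
                target = dest / name
                shutil.copy2(src, target)
                copied.append(target)
    return copied
```

Calling this once per installed image with a distinct `deployment_id` leaves each kernel under its own directory in the shared /boot, which is the property the multi-image case needs.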
If we're doing that anyway, then to this non-normative boot-only composefs we could make another tweak: remove /boot from the image before we create it, replacing it with an empty directory. That works around our UKI hash recursion issue without getting into weird layer hiding tricks.
I am not quite following; we need to "physically" ship the kernel in the container image so it's downloaded to the client system, right?
Isn't this issue just overall a duplicate of https://github.com/containers/composefs-rs/issues/21 ?
This issue is definitely highly related to #21 but sort of a different direction. I mostly filed it because I wanted to write it down as I was flying out the door so I didn't forget :)
@cgwalters mentioned something in a meeting today that I wasn't properly thinking about before.
Long story short: by doing the SELinux label rewriting, which we need to do when installing an unsealed/modified container image, we're effectively already creating a non-normative form of the container image that is only used for booting the system, and not from inside of containers running on the system.
If we're doing that anyway, then to this non-normative boot-only composefs we could make another tweak: remove /boot from the image before we create it, replacing it with an empty directory. That works around our UKI hash recursion issue without getting into weird layer hiding tricks.
Of course, if we're talking about sealed images, then we've probably already precomputed our SELinux labels, written them into the tar stream, and then written the "one true" composefs fsverity digest into a label on the image. In that case our hands are substantially more tied. But maybe this idea of splitting out /boot makes sense anyway. We already need to handle /boot resources specially on install, and in the UKI case we have a UEFI signature on the kernel binary (which, in turn, is also signed as part of the overall container image) which becomes the real trust chain for the booted system anyway (i.e. we don't check the OCI label on boot). So maybe having a composefs with /boot stripped from it still makes sense even for fully-sealed images.
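For illustration, the "remove /boot, leave an empty directory" tweak could look roughly like the tar-stream filter below. This is only a sketch using Python's standard `tarfile` module, not composefs-rs code. The names `strip_boot` and `_normalise` are made up here, and it ignores corner cases such as hardlinks from outside /boot pointing into it. The point is that the filesystem that gets hashed no longer contains the UKI, so the digest embedded in the UKI does not depend on the UKI itself.

```python
import tarfile


def _normalise(name: str) -> str:
    # Treat "./boot/x", "/boot/x" and "boot/x" as the same path.
    return "/".join(p for p in name.split("/") if p not in ("", "."))


def strip_boot(src: tarfile.TarFile, dst: tarfile.TarFile) -> None:
    """Copy src to dst, dropping everything below /boot.

    The /boot entry itself is kept (or synthesised if the source only had
    its contents), so the output always has an empty /boot directory.
    """
    saw_boot_dir = False
    for member in src:
        path = _normalise(member.name)
        if path.startswith("boot/"):
            continue  # contents of /boot are not part of the boot-only image
        if path == "boot":
            saw_boot_dir = True
        data = src.extractfile(member) if member.isreg() else None
        dst.addfile(member, data)
    if not saw_boot_dir:
        placeholder = tarfile.TarInfo("boot")
        placeholder.type = tarfile.DIRTYPE
        placeholder.mode = 0o755  # assumed mode for the synthesised directory
        dst.addfile(placeholder)
```

The filtered stream would then be what goes into the boot-only composefs, while the unmodified image still ships the kernel so the deployer can copy it out.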