ostreedev / ostree

Operating system and container binary deployment and upgrades
https://ostreedev.github.io/ostree/

Request a kernel commandline for `sysroot.readonly` #3315

Closed ruihe774 closed 1 month ago

ruihe774 commented 2 months ago

I'd like to request a kernel command-line option to override sysroot.readonly, similar to the existing ostree.prepare-root.composefs.

My Fedora Silverblue 40 system became unbootable after upgrading to Fedora Silverblue 41. Because my disk is not writable during boot (I'm using TCG OPAL, FWIW), I have to use ro on the kernel command line. sysroot.readonly=true requires rw, so in f40 I simply set it to false in /ostree/repo/config. However, f41 bakes sysroot.readonly=true into /usr/lib/ostree/prepare-root.conf in the initramfs, which takes precedence over /ostree/repo/config and makes my system unbootable. Because the initramfs in Silverblue is not generated on my local machine but provided by the ostree repo, I have no way to modify the prepare-root.conf in it.

Because I cannot change the initramfs, I think the only way is to provide a kernel command-line option to override it. I'd appreciate it if this could be considered and fixed before f41 is released.
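As a rough illustration of the kind of override being requested, here is a minimal sketch of reading a boolean from the kernel command line and letting it win over the config-file value. The key name `ostree.prepare-root.sysroot-readonly` and both helper functions are hypothetical; nothing in this thread settles on a name, and this is not how ostree parses its arguments. Quoted arguments are not handled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key name, chosen only for this sketch. */
#define KARG_SYSROOT_RO "ostree.prepare-root.sysroot-readonly"

/* Scan a whitespace-separated kernel command line for "key=value".
 * Returns 1 for "true"/"1", 0 for "false"/"0", and -1 if the key is
 * absent or its value is not recognized. The last recognized
 * occurrence wins. */
static int
karg_get_bool (const char *cmdline, const char *key)
{
  size_t keylen = strlen (key);
  const char *p = cmdline;
  int result = -1;

  while (*p)
    {
      p += strspn (p, " \t\n");            /* skip separators */
      size_t toklen = strcspn (p, " \t\n"); /* length of this token */
      if (toklen > keylen && strncmp (p, key, keylen) == 0 && p[keylen] == '=')
        {
          const char *val = p + keylen + 1;
          size_t vallen = toklen - keylen - 1;
          if ((vallen == 4 && strncmp (val, "true", 4) == 0)
              || (vallen == 1 && val[0] == '1'))
            result = 1;
          else if ((vallen == 5 && strncmp (val, "false", 5) == 0)
                   || (vallen == 1 && val[0] == '0'))
            result = 0;
        }
      p += toklen;
    }
  return result;
}

/* A value on the command line, when present, overrides the config value. */
static bool
effective_sysroot_readonly (const char *cmdline, bool config_value)
{
  int karg = karg_get_bool (cmdline, KARG_SYSROOT_RO);
  return karg >= 0 ? (karg == 1) : config_value;
}
```

With something like this, booting with `ro ostree.prepare-root.sysroot-readonly=false` would disable the behavior even when the baked-in prepare-root.conf says true.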

ruihe774 commented 2 months ago

Another solution is to allow sysroot.readonly=true with ro and in this case postpone the bind mount of /etc and /var to ostree-remount.

cgwalters commented 2 months ago

> Because my disk is not writable during boot (I'm using TCG OPAL, FWIW),

Can you explain a bit more? I only skimmed the spec and such but in your use case, how is the key provided? Do you enter a passphrase during boot, like LUKS?

cgwalters commented 2 months ago

> Another solution is to allow sysroot.readonly=true with ro and in this case postpone the bind mount of /etc and /var to ostree-remount.

No, we learned that's just full of race conditions. The rootfs needs to be fully set up in the initramfs before we switch to it.

In your case is the drive really only set to writable mode during the real root boot, it isn't done in the initramfs? If not, can it be?

ruihe774 commented 2 months ago

> Can you explain a bit more? I only skimmed the spec and such but in your use case, how is the key provided? Do you enter a passphrase during boot, like LUKS?

> In your case is the drive really only set to writable mode during the real root boot, it isn't done in the initramfs? If not, can it be?

On my disk, I have set up two ranges. One is write-locked (i.e. read-only during boot) for /, and one is both read-locked and write-locked (i.e. not accessible during boot) for /var. In TCG OPAL, the encryption is done by the disk controller and is transparent to the host. The ability to set up a range that is only write-locked is a unique feature. (LUKS etc. cannot do this.) I put / in the write-locked range and do not unlock it in the initramfs.

I use sedutil to unlock my disk (sedutil-cli --setLockingRange n rw passwd dev). (I do not use its preboot image because I also need Secure Boot.) sedutil is not included in the initramfs. I wrote a custom service (e.g. unlock-sed.service) that asks for the password and calls sedutil before local-fs-pre.target. The service runs before ostree-remount.service and before the fstab mounts, so mounting and remounting work fine with the unlocked disk. (I also wrote a custom service to bind mount /etc, which is supposed to be done by ostree-prepare-root in the initramfs when sysroot.readonly=true. Despite being "full of race conditions", it at least works.)

I know there are things like LUKS and cryptsetup. But these solutions have a problem: the initramfs itself is not encrypted and can be modified at rest. That is unacceptable, IMO.

I don't think my scenario is rare. People use various disk setup schemes, and many of them are not supported in the initramfs. And in Fedora Atomic Desktops and similar distros, the initramfs is not customizable. So we have to set up the disk after initramfs.

jlebon commented 2 months ago

> And in Fedora Atomic Desktops and similar distros, the initramfs is not customizable.

Not exactly. Since you're using Silverblue, I'll mention two tools to customize the initramfs: (1) rpm-ostree initramfs for rebuilding the initramfs and (2) rpm-ostree initramfs-etc for only layering some files. The latter could be particularly useful here, but it would require ostree-prepare-root to support an /etc location as well to override the /usr one, which doesn't seem unreasonable to have anyway.
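To make the "/etc overrides /usr" idea concrete, here is a minimal sketch of picking the first readable config file from an ordered list of candidates. The /etc path in the usage note is an assumption made for illustration, and `first_readable` is a made-up helper, not an ostree function.

```c
#include <stddef.h>
#include <unistd.h>

/* Return the first candidate path that is readable, or NULL if none is.
 * Candidates are listed from highest to lowest precedence. */
static const char *
first_readable (const char *const *candidates, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      if (access (candidates[i], R_OK) == 0)
        return candidates[i];
    }
  return NULL;
}
```

A caller could pass `{ "/etc/ostree/prepare-root.conf", "/usr/lib/ostree/prepare-root.conf" }` (the first path being the assumed new location), so that a file layered in with rpm-ostree initramfs-etc would take precedence over the one baked into /usr.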

> So we have to set up the disk after initramfs.

Hmm, but the whole point of the initramfs is to set up the rootfs, so this seems to be working against the design. Are you concerned with e.g. initramfs processes writing to the rootfs?

ruihe774 commented 2 months ago

> rpm-ostree initramfs-etc for only layering some files. The latter could be particularly useful here, but it would require ostree-prepare-root to support an /etc location as well to override the /usr one, which doesn't seem unreasonable to have anyway.

Sounds doable.

Nevertheless, given that there is already a cmdline option, ostree.prepare-root.composefs, I don't think it is unacceptable to add a similar one.

> Hmm, but the whole point of the initramfs is to set up the rootfs, so this seems to be working against the design. Are you concerned with e.g. initramfs processes writing to the rootfs?

I don't think so. The purpose of the initramfs is to set up the rootfs to be readable. It is not uncommon to have ro in kernel cmdline and remount the rootfs to be rw after initramfs, and it is the reason why things like systemd-remount-fs.service exist. Initramfs processes are not allowed to write to the rootfs by design, IMO.

ruihe774 commented 2 months ago

Another solution is to just ignore sysroot.readonly and do nothing if the rootfs is already read-only; in this case, it's the user's duty to make /etc and /var rw again: the purpose of ro on the cmdline is to make the entire rootfs read-only, including /etc and /var if they reside on it.

ruihe774 commented 1 month ago

I wonder if there are any updates.

> Another solution is to just ignore sysroot.readonly and do nothing if the rootfs is already readonly

FWIW now I think this is a better solution.

cgwalters commented 1 month ago

> Another solution is to just ignore sysroot.readonly and do nothing if the rootfs is already readonly; in this case, it's users' duty to make /etc and /var rw again: the purpose of cmdline ro is to make entire rootfs readonly, including /etc and /var if they reside on it.

I agree that if ro is present ostree should not attempt to override it in theory. In practice...a whole lot of things start to fail IME when trying to boot with /etc and /var actually readonly, and most of these cases actually want something like "boot with a transient overlayfs" for those directories.

So roughly then are you suggesting

diff --git a/src/switchroot/ostree-prepare-root.c b/src/switchroot/ostree-prepare-root.c
index 7754673e..a002ad6e 100644
--- a/src/switchroot/ostree-prepare-root.c
+++ b/src/switchroot/ostree-prepare-root.c
@@ -501,13 +501,6 @@ main (int argc, char *argv[])
   g_variant_builder_add (&metadata_builder, "{sv}", OTCORE_RUN_BOOTED_KEY_ROOT_TRANSIENT,
                          g_variant_new_boolean (root_transient));

-  /* This will result in a system with /sysroot read-only. Thus, two additional
-   * writable bind-mounts (for /etc and /var) are required later on. */
-  if (sysroot_readonly)
-    {
-      if (!sysroot_currently_writable)
-        errx (EXIT_FAILURE, OTCORE_SYSROOT_NOT_WRITEABLE, root_arg);
-    }
   /* Pass on the state for use by ostree-prepare-root */
   g_variant_builder_add (&metadata_builder, "{sv}", OTCORE_RUN_BOOTED_KEY_SYSROOT_RO,
                          g_variant_new_boolean (sysroot_readonly));

? I think this would just bring us back to a world where ostree-remount would try to remount /etc and /var which was racy. But...it may be possible to rework the unit ordering to fix that. It may want a new systemd target defined for this.
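For readers following along, the behavior change being discussed can be sketched as a small decision function. This is only an illustration of the proposal, not the ostree code: the enum and `plan_sysroot` are made-up names, and only the two boolean inputs correspond to variables visible in the diff above.

```c
#include <stdbool.h>

/* Possible outcomes for sysroot.readonly handling in the initramfs
 * (hypothetical names, for illustration only). */
typedef enum
{
  SYSROOT_PLAN_NOTHING,         /* sysroot.readonly is off: leave mounts alone */
  SYSROOT_PLAN_RO_WITH_BINDS,   /* read-only /sysroot plus writable /etc and /var binds */
  SYSROOT_PLAN_SKIP_ALREADY_RO  /* root was mounted ro: do not fail, leave it as is */
} SysrootPlan;

/* Before the proposed change, the combination (sysroot_readonly == true,
 * sysroot_currently_writable == false) was a fatal error. Under the
 * proposal it becomes a no-op instead. */
static SysrootPlan
plan_sysroot (bool sysroot_readonly, bool sysroot_currently_writable)
{
  if (!sysroot_readonly)
    return SYSROOT_PLAN_NOTHING;
  if (!sysroot_currently_writable)
    return SYSROOT_PLAN_SKIP_ALREADY_RO;
  return SYSROOT_PLAN_RO_WITH_BINDS;
}
```

In the already-read-only case, making /etc and /var writable later would be left to the user or to later units, which is where the ordering concern comes from.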

ruihe774 commented 1 month ago

> So roughly then are you suggesting

Yes.

> I think this would just bring us back to a world where ostree-remount would try to remount /etc and /var which was racy.

I don't think so.

https://github.com/ostreedev/ostree/blob/9ca8b4604d4955772276fcc5fb7a7fa2a4e532a6/src/switchroot/ostree-remount.c#L220-L226

ostree-remount just does nothing if / is readonly.
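A check of this kind can be sketched with statvfs(3), which reports the ST_RDONLY mount flag. This is a minimal standalone illustration and an assumption about the general approach, not a copy of the linked lines; `path_is_readonly` is a made-up helper name.

```c
#include <sys/statvfs.h>

/* Returns 1 if the filesystem containing path is mounted read-only,
 * 0 if it is writable, and -1 if statvfs() fails (errno is set). */
static int
path_is_readonly (const char *path)
{
  struct statvfs stvfs;

  if (statvfs (path, &stvfs) < 0)
    return -1;
  return (stvfs.f_flag & ST_RDONLY) != 0;
}
```

A remount helper could call `path_is_readonly ("/")` early and return without touching any mounts when the result is 1.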

> a whole lot of things start to fail IME when trying to boot with /etc and /var actually readonly

I don't think /etc being readonly is a problem: many stateless systems just do so.

I believe the failures you encountered were caused by a read-only /var; however, in the ro case, /var is usually on another partition or disk and is mounted via fstab.

cgwalters commented 1 month ago

> Yes.

OK. Want to adapt your existing PR to do that instead?

ruihe774 commented 1 month ago

> > Yes.
>
> OK. Want to adapt your existing PR to do that instead?

Done.