OE4T / tegra-demo-distro

Reference/demonstration distro for meta-tegra
MIT License

Mender delta updates fail to work due to systemd-machine-id-setup service #214

Closed. vinotion closed this issue 7 months ago

vinotion commented 2 years ago

As promised in this discussion, here's a suggestion for making Mender delta updates work.

At the beginning of the boot process, systemd-machine-id-setup will change the machine ID in /etc/machine-id. If the root filesystem is writable at that point, this effectively modifies the root filesystem and thereby breaks Mender delta updates (which assume an unmodified root filesystem to allow for incremental updates).

By making sure the kernel already mounts the root filesystem as read-only, this change of /etc/machine-id is prevented. This can be done as follows:

diff --git a/layers/meta-tegrademo/conf/distro/tegrademo-mender.conf b/layers/meta-tegrademo/conf/distro/tegrademo-mender.conf
index 0acf1d7..a842607 100644
--- a/layers/meta-tegrademo/conf/distro/tegrademo-mender.conf
+++ b/layers/meta-tegrademo/conf/distro/tegrademo-mender.conf
@@ -41,3 +41,6 @@ PREFERRED_PROVIDER_virtual/bootloader:tegra186 = "cboot-t18x"

 # Use u-boot by default on the TX2 COT when using the FIT image
 PREFERRED_PROVIDER_virtual/bootloader:jetson-tx2-devkit-cot = "u-boot-tegra"
+
+# Force root FS to be read-only at early boot, to make sure Mender delta-updates will work.
+KERNEL_ARGS += "ro"
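
One way to confirm on a running target that the argument actually reached the kernel is to look for a standalone ro token in /proc/cmdline. The following is only a minimal sketch; the function name cmdline_has_ro is hypothetical and not part of the distro.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel command line contains a
# standalone "ro" argument (not merely a substring such as "root=" or "rootfs").
cmdline_has_ro() {
    # Pad with spaces so "ro" also matches at the start or end of the line.
    case " $1 " in
        *" ro "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage on a running target:
#   cmdline_has_ro "$(cat /proc/cmdline)" && echo "rootfs requested read-only"
```

Matching on a space-delimited token avoids false positives from arguments like rootfs.slot_suffix= that merely start with the same letters.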

Because we also want various runtime-configurable persistent system configuration changes, we have an overlay mount for /etc.
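
For readers unfamiliar with the idea, an /etc overlay of this kind can be sketched roughly as follows. This is an illustration, not the setup used in the project: the helper name etc_overlay_opts and the /data/overlay/etc location are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print the overlayfs mount options for layering a
# writable directory over the read-only /etc. overlayfs requires upperdir and
# workdir to live on the same filesystem, so both are derived from one base.
etc_overlay_opts() {
    base="$1"    # assumed persistent location, e.g. /data/overlay/etc
    printf 'lowerdir=/etc,upperdir=%s/upper,workdir=%s/work\n' "$base" "$base"
}

# Usage (as root, after the persistent partition is mounted):
#   mkdir -p /data/overlay/etc/upper /data/overlay/etc/work
#   mount -t overlay overlay -o "$(etc_overlay_opts /data/overlay/etc)" /etc
```

Changes made under /etc then land in the upper directory on the persistent partition, while the root filesystem itself stays untouched.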

Aside from this minor issue, there seems to be a general compatibility issue between systemd and read-only root filesystems with overlaid /etc mounts. Some unconfigurable aspects of systemd simply assume that the root filesystem is writable at early boot time (i.e. before the /etc overlay is mounted). This leads to interesting problems like:

We did not find a suitable solution for working with systemd and (partly) persistent changes in /etc.

dwalkes commented 2 years ago

Thanks @vinotion!

This relates to https://github.com/OE4T/tegra-demo-distro/discussions/198 as well as https://github.com/OE4T/meta-mender-community/pull/8#issuecomment-730504059 and https://github.com/OE4T/meta-tegra/pull/527. It's also related to https://github.com/systemd/systemd/issues/14131. Cross-referencing here to set up links to these.

At the beginning of the boot process, systemd-machine-id-setup will change the machine ID in /etc/machine-id.

The change at https://github.com/OE4T/meta-tegra/pull/527/files#diff-6952bcad754469ed729bf94101a17d36ff760011bb19601d5e3b0f50d74a546f puts systemd.machine_id on the kernel command line. See this example from my xavier-nx-devkit-emmc image running kirkstone at https://github.com/OE4T/tegra-demo-distro/commit/f8b9c379cf1c3d1d69e135307ac90ce82b15ec44, built with source setup-env --machine jetson-xavier-nx-devkit-emmc --distro tegrademo-mender and bitbake demo-image-base:

root@jetson-xavier-nx-devkit-emmc:~# cat /proc/cmdline
console=ttyTCU0,115200 console=tty0 fbcon=map:0 video=tegrafb earlycon=tegra_comb_uart,mmio32,0x0c168000 gpt rootfs.slot_suffix= usbcore.old_scheme_first=1 tegraid=19.1.2.0.0 maxcpus=6 boot.slot_suffix=_b boot.ratchetvalues=0.4.2 vpr_resize sdhci_tegra.en_boot_part_access=1 systemd.machine_id=4e2dca41797e4f92af897d88f06d9409

However, systemd still writes /etc/machine-id at boot, even when the ID is specified on the command line. This detail is not mentioned in https://www.freedesktop.org/software/systemd/man/machine-id.html, but you can see the logic at https://github.com/systemd/systemd/blob/b33c2757d84d4f14f6c31da1c79dc343c43682e2/src/shared/machine-id-setup.c#L135-L161, and the file in /etc is populated with the same content:

root@jetson-xavier-nx-devkit-emmc:~# cat /etc/machine-id
4e2dca41797e4f92af897d88f06d9409
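
The comparison between the two outputs above can be scripted. This is a small sketch under the assumption that the ID is passed as a single systemd.machine_id= argument; the function name cmdline_machine_id is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of systemd.machine_id= from a kernel
# command line string. Prints nothing if the argument is absent.
cmdline_machine_id() {
    # Put one argument per line, then keep only the value of the match.
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^systemd\.machine_id=//p'
}

# Usage on a running target:
#   [ "$(cmdline_machine_id "$(cat /proc/cmdline)")" = "$(cat /etc/machine-id)" ] \
#       && echo "machine-id matches the kernel command line"
```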

This thread provides some systemd philosophy regarding read only rootfs: https://lists.freedesktop.org/archives/systemd-devel/2021-February/046149.html. The summary is they expect /etc to be writeable and don't appear to be contemplating delta updates, at least in that thread. The thread at https://github.com/systemd/systemd/issues/14131 is also interesting, and also doesn't appear to contemplate delta updates.

The part I don't understand yet is why more platforms using Mender delta updates don't have this problem, and where we differ in this regard. My guess is that the comment at https://github.com/OE4T/tegra-demo-distro/discussions/198#discussioncomment-2519034 is related: something about cboot builds means we don't get the ro kernel argument when IMAGE_FEATURES includes read-only-rootfs. This probably relates to https://github.com/OE4T/tegra-demo-distro/blob/402f67f3e9ad7bd4c92f4688d75e1a0fbe97224a/layers/meta-tegrademo/classes/rootfs-postcommands-overrides.bbclass in a way I don't understand yet.

madisongh commented 2 years ago

The way I've dealt with this in my projects is to use systemd's volatile root feature, setting systemd.volatile=overlay on the kernel command line, documented here, then handling things like host name and timezone settings with additional programs/scripts to stash the settings in a persistent location elsewhere and restore them at boot time.
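
The stash-and-restore idea could look roughly like the sketch below. It is an illustration only, not the scripts used in those projects: the helper names and the default /data/persist directory are assumptions.

```shell
#!/bin/sh
# Hypothetical helpers; PERSIST_DIR is an assumed writable location that
# survives reboots and updates (for example, on a separate data partition).
PERSIST_DIR="${PERSIST_DIR:-/data/persist}"

# Save a settings file (e.g. /etc/hostname) under the persistent directory.
stash_setting() {
    src="$1"
    mkdir -p "$PERSIST_DIR" && cp -p "$src" "$PERSIST_DIR/$(basename "$src")"
}

# Copy a previously stashed file back to its destination, if one exists.
restore_setting() {
    dst="$1"
    saved="$PERSIST_DIR/$(basename "$dst")"
    [ -f "$saved" ] || return 0    # nothing stashed; keep the image default
    cp -p "$saved" "$dst"
}

# Usage:
#   stash_setting /etc/hostname      # after the user changes the host name
#   restore_setting /etc/hostname    # early at boot, e.g. from a oneshot unit
```

Because files are stored by base name only, two settings files with the same name in different directories would collide; a real implementation would need a scheme that preserves the full path.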

Getting this to work right with Yocto builds involves a bit of additional work, essentially turning off the volatile-binds stuff (/var/volatile is unnecessary in this setup) and adjusting files/fs-perms.txt. I have a subset of the changes in my test distro.

The overlayfs support in the 4.9 kernel has some problems of its own, so I also wound up importing the back-port of the 4.19 overlayfs support to 4.9 that one of the overlayfs developers maintains.

krisvanrens commented 7 months ago

Ugh, apologies for not getting back to this! The suggested links to discussions and solutions are great. Thanks @dwalkes @madisongh 👍🏻

We ended up adding some systemd scripts to deal with things like setting the host name from the mounted overlay settings etc. Not the prettiest solution, but simple and very reliable. Perhaps in a future incarnation of our Yocto project we will investigate a more thorough approach.