nix-community / NixOS-WSL

NixOS on WSL(2) [maintainer=@nzbr]
Apache License 2.0

/etc/machine-id not persistent #574

Open sedlund opened 1 week ago

sedlund commented 1 week ago

Bug description

/etc/machine-id is on tmpfs, mounted by the etc-machine\x2did.mount unit.

This causes a new machine-id to be generated on every boot, which leads to several problems:

journald writes its logs to /var/log/journal/$(cat /etc/machine-id), and when vacuuming it only considers the journals in that directory.

If you configure SystemMaxUse=200M in /etc/systemd/journald.conf, only the current boot's logs are vacuumed, so /var/log/journal keeps growing.

journalctl --list-boots only shows the current boot.
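To see how many journal directories earlier boots have left behind, a minimal sketch along these lines may help. It assumes the default layout /var/log/journal/<machine-id>/; the function name list_foreign_journal_dirs is made up for illustration and is not part of systemd.

```shell
# Print every directory under the journal root whose name is not the
# current machine ID. Each one belongs to an ID from an earlier boot.
# (Sketch only: the function name is hypothetical.)
list_foreign_journal_dirs() {
    journal_root=$1                       # normally /var/log/journal
    current=$(cat "$2") || return 1       # normally /etc/machine-id
    for dir in "$journal_root"/*/; do
        [ -d "$dir" ] || continue         # glob matched nothing
        name=$(basename "$dir")
        [ "$name" = "$current" ] || printf '%s\n' "$name"
    done
}
```

Called as list_foreign_journal_dirs /var/log/journal /etc/machine-id, it should print one line per leftover directory on an affected system.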

To Reproduce

Steps to reproduce the behavior:

1. cat /etc/machine-id
2. reboot
3. cat /etc/machine-id (prints a different ID than in step 1)
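Because the first ID scrolls away across a reboot, the comparison can be scripted. This is only a sketch: check_machine_id is a hypothetical helper, and the state file is any path you choose that survives reboots.

```shell
# Compare the current machine ID with the one saved by a previous run,
# then save the current one for the next comparison.
# (Sketch only: helper name and state file location are assumptions.)
check_machine_id() {
    id_file=$1        # normally /etc/machine-id
    state_file=$2     # any file on persistent storage
    current=$(cat "$id_file") || return 2
    if [ -f "$state_file" ]; then
        previous=$(cat "$state_file")
        if [ "$current" = "$previous" ]; then
            echo "unchanged: $current"
        else
            echo "changed: $previous -> $current"
        fi
    else
        echo "first run: $current"
    fi
    printf '%s\n' "$current" > "$state_file"
}
```

Run it once before and once after the reboot with the same state file, for example check_machine_id /etc/machine-id ~/.last-machine-id.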

WSL version


```
WSL version: 2.3.24.0
Kernel version: 5.15.153.1-2
WSLg version: 1.0.65
MSRDC version: 1.2.5620
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26100.1-240331-1435.ge-release
Windows version: 10.0.22631.4317
```

sedlund commented 1 week ago

Running systemd-machine-id-setup --commit manually resolves this: it removes the tmpfs mount and writes the current machine-id to a static file that survives reboots.

However, according to its man page and machine-id(5), systemd-machine-id-commit.service(8) should do this automatically once /etc is writable, so I'm not sure why that step is not occurring.
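To check whether the ID is still only a transient overmount before committing it, something like the following sketch could be used. It assumes the standard /proc/self/mountinfo format, where the fifth field is the mount point; is_overmounted is a made-up name.

```shell
# Succeed if PATH appears as a mount point in the mountinfo table.
# The second argument is optional and exists mainly so the check can be
# pointed at a saved copy of the table.
# (Sketch only: the function name is hypothetical.)
is_overmounted() {
    path=$1
    mountinfo=${2:-/proc/self/mountinfo}
    awk -v p="$path" '$5 == p { found = 1 } END { exit !found }' "$mountinfo"
}
```

If is_overmounted /etc/machine-id succeeds, the ID has not been committed yet, and running systemd-machine-id-setup --commit as root is the manual workaround described above.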

SuperSandro2000 commented 1 week ago

> /etc/machine-id is on tmpfs via etc-machine\x2did.mount unit.

Where is that coming from?

sedlund commented 1 week ago

I believe it comes from systemd. Per machine-id(5):

FIRST BOOT SEMANTICS
       /etc/machine-id is used to decide whether a boot is the first one. The rules are as follows:

        1. The kernel command argument systemd.condition_first_boot= may be used to override the autodetection logic, see
           kernel-command-line(7).

        2. Otherwise, if /etc/machine-id does not exist, this is a first boot. During early boot, systemd will write
           "uninitialized\n" to this file and overmount a temporary file which contains the actual machine ID. Later
           (after first-boot-complete.target has been reached), the real machine ID will be written to disk.

        3. If /etc/machine-id contains the string "uninitialized", a boot is also considered the first boot. The same
           mechanism as above applies.

        4. If /etc/machine-id exists and is empty, a boot is not considered the first boot.  systemd will still bind-mount
           a file containing the actual machine-id over it and later try to commit it to disk (if /etc/ is writable).

        5. If /etc/machine-id already contains a valid machine-id, this is not a first boot.

       If according to the above rules a first boot is detected, units with ConditionFirstBoot=yes will be run and systemd
       will perform additional initialization steps, in particular presetting units.

The tmpfs bind mount seems to be created in step 4. If you manually umount /etc/machine-id, the shadowed /etc/machine-id file underneath is still there, with the contents "uninitialized".
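The rules quoted from machine-id(5) can be approximated with a small check on the on-disk file. This is a rough sketch, not systemd's actual logic: it ignores the kernel command-line override from rule 1, treats a file containing only a newline as empty, and classify_machine_id is a hypothetical name. Run it against the underlying file (after unmounting the overlay) to see which rule applies.

```shell
# Classify a machine-id file according to rules 2-5 of the
# FIRST BOOT SEMANTICS section of machine-id(5).
# (Sketch only: rule 1, the kernel command-line override, is not handled.)
classify_machine_id() {
    file=$1
    if [ ! -e "$file" ]; then
        echo "first boot (file missing)"                   # rule 2
        return 0
    fi
    content=$(cat "$file")      # trailing newlines are stripped here
    case $content in
        "")
            echo "not first boot (empty, ID pending commit)"   # rule 4
            ;;
        uninitialized)
            echo "first boot (uninitialized)"              # rule 3
            ;;
        *[!0123456789abcdef]*)
            echo "unrecognized content"
            return 1
            ;;
        *)
            # Only lowercase hex digits remain; a valid ID has exactly 32.
            if [ "${#content}" -eq 32 ]; then
                echo "not first boot (valid ID)"           # rule 5
            else
                echo "unrecognized content"
                return 1
            fi
            ;;
    esac
}
```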

After this, systemd-machine-id-commit.service(8) should run as a oneshot to unmount the tmpfs and write the current machine-id to a static file.

I can see that systemd-machine-id-commit.service has run on my Ubuntu WSL image.

Even if you have run systemd-machine-id-setup --commit, you can still remove /etc/machine-id and the same behaviour repeats after reboot: a static /etc/machine-id is never finalized.

sedlund commented 1 day ago

It seems this was just changed in nixpkgs:

https://github.com/NixOS/nixpkgs/pull/351151

There is also this generator for disk images here:

https://github.com/NixOS/nixpkgs/blob/da6da7189e85a403396cef07fe5825a7144b0d84/nixos/modules/system/boot/loader/systemd-boot/systemd-boot-builder.py#L287

It writes the file manually, without systemd.

I'm not sure which direction this should go.