NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License

Systemctl kexec always boots /nix/var/nix/profiles/system even if custom profile is selected in nixos-rebuild #50300

Open arianvp opened 6 years ago

arianvp commented 6 years ago

Issue description

When doing nixos-rebuild boot -p my-profile and then systemctl kexec, I would expect it to kexec into the just-created profile, but it kexecs into my previous NixOS configuration, the one in /nix/var/nix/profiles/system.

Steps to reproduce

sudo nixos-rebuild boot  # make an entry in the system profile
# change /etc/nixos/configuration.nix
sudo nixos-rebuild boot -p hello  # make an entry in /nix/var/nix/profiles/system-profiles/
sudo systemctl kexec

And we're rebooted into the old config instead of the new one.

This seems to be because the kexec module hardcodes /nix/var/nix/profiles/system as the system to boot:

https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/system/boot/kexec.nix#L16-L18
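To make the mismatch concrete, here is a minimal sketch of how a profile name maps to a profile symlink, going by the paths in the reproduction steps above. `profile_path` is a hypothetical helper, not something in nixpkgs; the default root directory is taken from the issue title.

```shell
# Map a nixos-rebuild profile name to its profile symlink.
# profile_path is a hypothetical helper for illustration only.
# $1: profile name (default: system)
# $2: profiles root (default: /nix/var/nix/profiles)
profile_path() {
    name="${1:-system}"
    root="${2:-/nix/var/nix/profiles}"
    if [ "$name" = "system" ]; then
        printf '%s\n' "$root/system"
    else
        # Custom profiles created with -p live in a separate directory.
        printf '%s\n' "$root/system-profiles/$name"
    fi
}
```

With `-p hello` the new generation lands under system-profiles/hello, while a unit that always reads the `system` path never looks there.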

Technical details

Please run nix-shell -p nix-info --run "nix-info -m" and paste the results.

arianvp commented 6 years ago

Some hints in https://wiki.archlinux.org/index.php/kexec with a templated kexec-load@.service. However this is a bit problematic as /etc/systemd is read-only in our case. So it's not clear how to adapt this.

lheckemann commented 6 years ago

I'd think that adding a symlink like /run/next-boot-system would be a way to fix this.

flokli commented 4 years ago

@lheckemann hmmh, that could work, but then it shouldn't be a template unit, and have a condition on that file to exist. Still, /run/next-boot-system introduces yet another magic NixOS-specific symlink, where it doesn't necessarily need to be.

The kexec-load@.service template unit from the Arch Linux wiki is just a fancy way of telling systemd to kexec --load a new kernel before doing the real kexec, by doing this in a unit that's added to kexec.target.

We should be able to do the same by just adding a non-templated unit to a fixed location in /run/systemd/system (and adding it to kexec.target).
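A rough sketch of what dropping such a unit could look like. The unit name `load-next-kernel.service`, the helper `write_kexec_unit`, and the exact unit settings are assumptions for illustration; the target directory is passed in so this can be tried outside /run/systemd/system.

```shell
# Write a non-templated oneshot unit and hook it into kexec.target.
# write_kexec_unit and the unit name are hypothetical.
# $1: unit directory (e.g. /run/systemd/system)
# $2: command line to use as ExecStart
write_kexec_unit() {
    dir="$1"
    exec_start="$2"
    unit="load-next-kernel.service"
    mkdir -p "$dir/kexec.target.wants"
    cat > "$dir/$unit" <<EOF
[Unit]
Description=Load the next-boot kernel before kexec (sketch)
DefaultDependencies=no
Before=systemd-kexec.service

[Service]
Type=oneshot
ExecStart=$exec_start
EOF
    # A symlink in kexec.target.wants/ has the same effect as WantedBy=kexec.target.
    ln -sfn "../$unit" "$dir/kexec.target.wants/$unit"
}
```

systemd would still need a daemon-reload to notice a unit written this way.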

We could also try to invoke kexec --load (without the --exec argument of course) on kexec-able platforms during nixos-rebuild boot to load the "next-boot" kernel, initrd and cmdline into memory.
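For reference, a sketch of the kexec --load invocation for one system closure. It assumes the closure directory contains `kernel`, `initrd`, `kernel-params` and `init`; `kexec_load_cmd` is a hypothetical helper that only prints the command rather than running it.

```shell
# Print (do not run) the kexec --load command for a NixOS system closure.
# kexec_load_cmd is a hypothetical helper; the file layout is an assumption.
# $1: path to the system closure (e.g. a resolved profile symlink)
kexec_load_cmd() {
    sys="$1"
    for f in kernel initrd kernel-params init; do
        [ -e "$sys/$f" ] || { echo "missing $sys/$f" >&2; return 1; }
    done
    params="$(cat "$sys/kernel-params")"
    # No --exec here: this only stages the kernel for a later kexec.
    printf 'kexec --load %s --initrd=%s --append="%s init=%s"\n' \
        "$sys/kernel" "$sys/initrd" "$params" "$sys/init"
}
```

Pointing it at a resolved custom profile instead of the `system` profile is the whole difference this issue is about.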

flokli commented 4 years ago

Also interesting: https://lists.freedesktop.org/archives/systemd-devel/2012-March/004764.html

flokli commented 4 years ago

Turns out, there already is nixos/modules/system/boot/kexec.nix, which creates a prepare-kexec.service with WantedBy=kexec.target that loads the kernel from /nix/var/nix/profiles/system/kernel.

I built a new kernel with a custom cmdline (by setting boot.kernelParams = ["foo"];), and after a nixos-rebuild boot, /nix/var/nix/profiles/system/kernel-params contains "foo".

Are you sure this is still a problem?

flokli commented 4 years ago

I tried a systemctl kexec, but my system froze.

That might be hardware-specific, or caused by plymouth, but in general, it seems to work, and the proper kernel and cmdline are at the expected location. @arianvp, can you confirm?

lheckemann commented 4 years ago

@flokli

When doing nixos-rebuild boot -p my-profile

The problem is that /nix/var/nix/profiles/system isn't necessarily the right profile to boot.

stale[bot] commented 4 years ago

I marked this as stale due to inactivity.

flokli commented 4 years ago

So, I just did a nixos-rebuild boot to a new system configuration, and the /nix/var/nix/profiles/system symlink did change.

nixos-rebuild boot apparently changes that symlink, and then invokes bin/switch-to-configuration boot, which will update boot loader entries.

I did skim through src/systemctl/systemctl.c, it seems systemd will either use an already kexec --loaded kernel, or retrieve what to load from boot_config_default_entry before doing the kexec. At least when using sd-boot, I'd expect this to work?

I'm somewhat hesitant to just introduce a kexec --load into the bin/switch-to-configuration boot code - this might not work on some platforms, and might make switching slower, too.

Maybe we could add a script to /etc/systemd/system-shutdown:

   Immediately before executing the actual system halt/poweroff/reboot/kexec systemd-shutdown will run all executables in
   /usr/lib/systemd/system-shutdown/ and pass one arguments to them: either "halt", "poweroff", "reboot" or "kexec", depending on the chosen action.
   All executables in this directory are executed in parallel, and execution of the action is not continued before all executables finished.

Such a script could kexec --load from /nix/var/nix/profiles/system in the kexec case if nothing is kexec --loaded already (an already loaded kernel could come either from the user running kexec --load, or from systemd via boot_config_default_entry).

Having it that way, it'd be fairly decoupled and could easily be disabled from the NixOS module system.
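A minimal sketch of the decision such a hook would make. `should_load_kernel` is a hypothetical helper; it reads the kernel's `/sys/kernel/kexec_loaded` flag, and the flag path is a parameter so the logic can be exercised without root. Whether a system-shutdown hook runs early enough for this is not settled here.

```shell
# Decide whether a system-shutdown hook should load a kernel itself.
# should_load_kernel is a hypothetical helper.
# $1: action passed by systemd-shutdown ("halt", "poweroff", "reboot" or "kexec")
# $2: path to the kexec_loaded flag (default: /sys/kernel/kexec_loaded)
should_load_kernel() {
    action="$1"
    flag="${2:-/sys/kernel/kexec_loaded}"
    # Only act for kexec, never for halt/poweroff/reboot.
    [ "$action" = "kexec" ] || return 1
    # If the flag is unreadable, assume kexec is unavailable and do nothing.
    [ -r "$flag" ] || return 1
    # "0" means nothing is staged yet; anything else is left alone.
    [ "$(cat "$flag")" = "0" ]
}
```

A hook script would call this with its first argument and only then run kexec --load.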

lheckemann commented 4 years ago

@flokli I don't think that will fix the issue originally reported by @arianvp, which is that this doesn't work with profiles other than system, as in nixos-rebuild boot -p my-profile.

I might be remembering wrong, but systemd will iirc perform a kexec instead of a reboot if anything is loaded, which might also be an unpleasant surprise if switch-to-configuration did kexec -l, so I agree that this probably isn't a good idea.

flokli commented 4 years ago

Yeah, custom profiles are another beast.

Is /run/systemd/system-shutdown a thing? In that case, switch-to-configuration boot could drop a script doing a kexec there…

Edit: It's not, but can be solved as part of https://github.com/NixOS/nixpkgs/issues/80038 in https://github.com/NixOS/nixpkgs/blob/3821543de7ec3f9a19bdbd7ec0bfd98b9b3253f3/pkgs/os-specific/linux/systemd/0016-systemd-sleep-execute-scripts-in-etc-systemd-system-.patch.

arianvp commented 4 years ago

I'm confused. kexec works fine. The issue is specifically about kexec'ing into a different profile.

i want systemctl reboot and systemctl kexec to end up in the same profile; which is currently not the case.

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.

orangecms commented 2 years ago

I just ran into this as well, really confusing. I would suggest binding a mutable overlay on top of the current system to simplify this, also for other things one might wish to change at runtime. What do you think?

haslersn commented 2 years ago

Currently, on NixOS 21.11, I have the following problem. Not sure if it's related.

# systemctl kexec
No kexec kernel loaded and autodetection failed.
Cannot automatically load kernel: ESP mount point not found.

arianvp commented 6 months ago

https://github.com/NixOS/nixpkgs/pull/309911 adds a /run/next-system symlink and implements the desired behaviour for soft-reboot. Maybe we can do the same for kexec.

arianvp commented 6 months ago

i want systemctl reboot and systemctl kexec to end up in the same profile; which is currently not the case.

@flokli if I understand you correctly, if we removed the nixos/modules/system/boot/kexec.nix module, systemctl kexec should already load the new boot loader entry from the bootloader and boot into the correct one. Of course this would only work for systemd-boot. But it does sound like it would implement the desired property as well.

arianvp commented 6 months ago

But I think the /run/next-system avenue is an OK one. Given it also helps with soft-reboot.
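A sketch of how a kexec loader might resolve the system to boot under that approach. `next_system` is a hypothetical helper, and the fallback to the `system` profile when the symlink is absent is an assumption, not something taken from the PR.

```shell
# Resolve the system closure to kexec into, preferring a next-system symlink.
# next_system is a hypothetical helper; the fallback order is an assumption.
# $1: next-system symlink (default: /run/next-system)
# $2: fallback profile (default: /nix/var/nix/profiles/system)
next_system() {
    next="${1:-/run/next-system}"
    fallback="${2:-/nix/var/nix/profiles/system}"
    # -e follows symlinks, so a dangling next-system link falls through.
    if [ -e "$next" ]; then
        readlink -f "$next"
    else
        readlink -f "$fallback"
    fi
}
```

The resolved path could then be fed to the same kexec --load step that the existing module runs against the `system` profile.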