NixOS / nixpkgs

systemd-udev-settle.service is overused #73095

Closed lheckemann closed 1 year ago

lheckemann commented 5 years ago

Describe the bug

Related: #53446

Using systemd-udev-settle "is not recommended" (man 8 systemd-udev-settle.service). Nevertheless, we have quite a few services that depend on it:

$ rg -l systemd-udev-settle
nixos/tests/misc.nix
nixos/modules/hardware/ksm.nix
nixos/modules/security/lock-kernel-modules.nix
nixos/modules/virtualisation/openvswitch.nix
nixos/modules/virtualisation/libvirtd.nix
nixos/modules/virtualisation/lxd.nix
nixos/modules/virtualisation/anbox.nix
nixos/modules/tasks/kbd.nix
nixos/modules/services/x11/xserver.nix
nixos/modules/services/hardware/tcsd.nix
nixos/modules/services/hardware/brltty.nix
nixos/modules/services/networking/dhcpcd.nix
nixos/modules/services/scheduling/atd.nix
nixos/modules/services/hardware/trezord.nix
nixos/modules/services/hardware/acpid.nix
nixos/modules/system/boot/networkd.nix
nixos/modules/system/boot/systemd.nix
nixos/modules/tasks/filesystems/zfs.nix

On some systems/kernels (e.g. mobile-nixos kernels, cc @samueldr @kirelagin), this can cause boot to wait for systemd-udev-settle to time out, which is no fun.

Expected behavior

Services depend only on the devices that they need, either by waiting for them themselves or by depending on systemd's relevant .device unit.
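
As a rough illustration of the second approach, a service can be tied to a single device unit instead of the global settle barrier. This is only a sketch: the service name `example-serial-daemon`, the device `/dev/ttyUSB0`, and the `sleep` placeholder command are assumptions made for illustration and are not taken from any existing module.

```nix
{ pkgs, ... }:
{
  # Hypothetical service; name, device and command are placeholders.
  systemd.services.example-serial-daemon = {
    # Start when the device shows up (hotplug style) rather than at a
    # fixed point in the boot sequence.
    wantedBy = [ "dev-ttyUSB0.device" ];

    # systemd exposes udev devices tagged "systemd" as .device units;
    # /dev/ttyUSB0 corresponds to dev-ttyUSB0.device.
    # bindsTo also stops the service when the device goes away.
    bindsTo = [ "dev-ttyUSB0.device" ];
    after = [ "dev-ttyUSB0.device" ];

    serviceConfig.ExecStart = "${pkgs.coreutils}/bin/sleep infinity";
  };
}
```

With this shape the boot never waits for unrelated hardware; only this one service waits, and only for the device it actually uses.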

worldofpeace commented 5 years ago

@lheckemann Should we merge https://github.com/NixOS/nixpkgs/pull/25311 for display-manager? TBH, I've been running a config to disable it for a while now.

bqv commented 4 years ago

As of now...

Removed from:

nixos/modules/services/scheduling/atd.nix
nixos/modules/services/x11/xserver.nix
nixos/modules/tasks/kbd.nix

Added to:

nixos/modules/config/console.nix

worldofpeace commented 4 years ago

@bqv Thanks. Is the one for xserver.nix in 20.03?

bqv commented 4 years ago

@worldofpeace It is not; only the removal from tasks/kbd.nix and the addition to config/console.nix are.

jtojnar commented 4 years ago

Had to re-add it to GDM: f74f2f354866c828248a419ef9a2cbddc793b7f9

lheckemann commented 4 years ago

@jtojnar maybe gdm can depend on any DRI device instead of the global systemd-udev-settle.service? This would probably require some udev rules, but be a lot more elegant IMHO.
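
An untested sketch of what that could look like: a udev rule tags DRM card nodes so that systemd creates .device units for them, and the display manager is ordered after one of them. The rule, the assumption that the first GPU is `/dev/dri/card0`, and the choice of `wants` plus `after` are all illustrative guesses, not something taken from the GDM module.

```nix
{
  # Tag DRM cards with "systemd" so they appear as .device units.
  # (Assumption: they are not already tagged by the stock udev rules.)
  services.udev.extraRules = ''
    SUBSYSTEM=="drm", KERNEL=="card[0-9]*", TAG+="systemd"
  '';

  systemd.services.display-manager = {
    # Assumes the GPU of interest is /dev/dri/card0, which maps to
    # the unit name dev-dri-card0.device.
    wants = [ "dev-dri-card0.device" ];
    after = [ "dev-dri-card0.device" ];
  };
}
```

This waits for one specific card rather than "any DRI device", and on a machine without that node the display manager would be held back until the device unit's start job times out, so it is a starting point rather than a fix.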

jtojnar commented 4 years ago

But display-manager.service should already run After systemd-logind.service, which contains:

ExecStartPre=-/nix/store/ijka1p5ndsjq60frdhlq5il0s9mgyypj-kmod-26/sbin/modprobe -abq drm

I would expect that to work, unless we are affected by https://github.com/systemd/systemd/issues/14322.

lheckemann commented 4 years ago

But the kernel module being loaded doesn't necessarily imply the device being available, does it?

asbachb commented 4 years ago

Is it possible to remove it from dhcpcd.service? Arch just relies on network.target (https://git.archlinux.org/svntogit/packages.git/tree/trunk/dhcpcd.service?h=packages/dhcpcd)

Looking at the commit does not explain why it was added (https://github.com/NixOS/nixpkgs/commit/4c6171c173ef5e50ecbbc1157c035147462ee721) @srhb

srhb commented 4 years ago

Hi @asbachb. Sorry, I can't remember that far back, but there are some notes in the PR that brought in the commit; maybe they're helpful. It looks like I wasn't strictly happy with it even back then. :)

https://github.com/NixOS/nixpkgs/pull/45421

stale[bot] commented 4 years ago

Hello, I'm a bot and I thank you in the name of the community for opening this issue.

To help our human contributors focus on the most-relevant reports, I check up on old issues to see if they're still relevant. This issue has had no activity for 180 days, and so I marked it as stale, but you can rest assured it will never be closed by a non-human.

The community would appreciate your effort in checking if the issue is still valid. If it isn't, please close it.

If the issue persists, and you'd like to remove the stale label, you simply need to leave a comment. Your comment can be as simple as "still important to me". If you'd like it to get more attention, you can ask for help by searching for maintainers and people that previously touched related code and @ mention them in a comment. You can use Git blame or GitHub's web interface on the relevant files to find them.

Lastly, you can always ask for help at our Discourse Forum or at #nixos' IRC channel.

lheckemann commented 4 years ago

Still valid…

rnhmjoj commented 3 years ago

I think udev-settle can be removed from dhcpcd: it was introduced to compensate for https://github.com/NixOS/nixpkgs/pull/44524, but dhcpcd is not ordered Before network.target anymore. I tried removing the udev-settle dependency and running

nix-build -A nixosTests.predictable-interface-names --check --keep-failed

in a loop, and it always succeeds.

EDIT:

It seems https://github.com/NixOS/nixpkgs/pull/79532 is the reason for the success. So yes, I think the (terrible) udev-settle hack can be safely removed from networkd and dhcpcd.

xaverdh commented 3 years ago

So after https://github.com/NixOS/nixpkgs/pull/79532, does networkd still need udev-settle (it was introduced in https://github.com/NixOS/nixpkgs/pull/39340)?

rnhmjoj commented 3 years ago

Btw, I think it's safe to say the issue now is "systemd-udev-settle" is used. This is what systemd will print on every boot starting from the next NixOS release:

Usage of the systemd service unit systemd-udev-settle.service is deprecated. It
inserts artificial delays into the boot process without providing the
guarantees other subsystems traditionally assumed it provides. Relying on this
service is racy, and it is generally a bug to make use of it and depend on it.

Traditionally, this service's job was to wait until all devices a system
possesses have been fully probed and initialized, delaying boot until this
phase is completed. However, today's systems and hardware generally don't work
this way anymore, hardware today may show up any time and take any time to be
probed and initialized. Thus, in the general case, it's no longer possible to
correctly delay boot until "all devices" have been processed, as it is not
clear what "all devices" means and when they have been found. This is in
particular the case if USB hardware or network-attached hardware is used.

Modern software that requires some specific hardware (such as a network device
or block device) to operate should only wait for the specific devices it needs
to show up, and otherwise operate asynchronously initializing devices as they
appear during boot and during runtime without delaying the boot process.

It is a defect of the software in question if it doesn't work this way, and
still pulls systemd-udev-settle.service into the boot process.

Please file a bug report against the following units, with a request for it to
be updated to operate in a hotplug fashion without depending on
systemd-udev-settle.service:

    @OFFENDING_UNITS@

rnhmjoj commented 3 years ago

We are down to ~12~ 2.

rnhmjoj commented 3 years ago

@prusnak the trezord service defined in NixOS pulls in udev-settle as a dependency, but the upstream unit never did. Can trezord handle hardware discovery asynchronously, or are there issues?

rnhmjoj commented 3 years ago

There may be good news: GNOME has just merged a patch that looks like it could fix the issue with gdm. I tried it by overriding the gdm package and removing udev-settle, but the gnome3 test still fails; the error looks different, though.

Here's how I tested:

--- a/nixos/modules/services/x11/display-managers/gdm.nix
+++ b/nixos/modules/services/x11/display-managers/gdm.nix
@@ -5,7 +5,14 @@ with lib;
 let

   cfg = config.services.xserver.displayManager;
-  gdm = pkgs.gnome3.gdm;
+  gdm = pkgs.gnome3.gdm.overrideAttrs (old: {
+    patches = old.patches ++ [
+      (pkgs.fetchpatch {
+        url = "https://gitlab.gnome.org/GNOME/gdm/-/merge_requests/128.patch";
+        sha256 = "1h0rf7d1h4b5clpx38yy4j64ailmwjayklz52qnv6lbsx42ix1vj";
+      })
+    ];
+  });

   xSessionWrapper = if (cfg.setupCommands == "") then null else
     pkgs.writeScript "gdm-x-session-wrapper" ''
@@ -168,9 +175,6 @@ in
       "systemd-machined.service"
       # setSessionScript wants AccountsService
       "accounts-daemon.service"
-      # Failed to open gpu '/dev/dri/card0': GDBus.Error:org.freedesktop.DBus.Error.AccessDenied: Operation not permitted
-      # https://github.com/NixOS/nixpkgs/pull/25311#issuecomment-609417621
-      "systemd-udev-settle.service"
     ];

     systemd.services.display-manager.after = [
@@ -180,7 +184,6 @@ in
       "getty@tty${gdm.initialVT}.service"
       "plymouth-quit.service"
       "plymouth-start.service"
-      "systemd-udev-settle.service"
     ];
     systemd.services.display-manager.conflicts = [
       "getty@tty${gdm.initialVT}.service"

@worldofpeace @jtojnar Any chance you could take a look at this? I hope I'm doing something wrong.

prusnak commented 3 years ago

@rnhmjoj fixed trezord.service in 01f1773e8e12742bd2d51d7fc163b373f0dd3ba3

rnhmjoj commented 3 years ago

nixos/modules/security/lock-kernel-modules.nix is interesting. It disables further loading of kernel modules at runtime, so it should run as late as possible to avoid interfering with any modprobes that udev performs. Since the service doesn't really depend on any specific device, it's not easy to replace udev-settle...

ping @joachifm @emilazy

rnhmjoj commented 3 years ago

@ts468 @netixx Can you test whether openvswitch (specifically ovsdb.service) really needs udev-settle, or whether the dependency can be removed?

worldofpeace commented 3 years ago

https://gitlab.gnome.org/GNOME/gdm/-/merge_requests/128.patch

Will look

worldofpeace commented 3 years ago

Yeah, the message pasted to the module should've noted that the error came from gnome-shell, if that's relevant

Mar 01 23:09:57 nixos .gnome-shell-wr[914]: Failed to open gpu '/dev/dri/card0': GDBus.Error:org.freedesktop.DBus.Error.AccessDenied: Operation not permitted
Mar 01 23:09:57 nixos .gnome-shell-wr[914]: Failed to create backend: No GPUs found

So that patch didn't seem to help.

worldofpeace commented 3 years ago

Will read the PRs, issues, etc. in more detail, but that patch should fix this situation. Might try an actual 40.beta before I report that there's a problem for us.

domenkozar commented 3 years ago

I can confirm a similar issue with missing /dev/dri/card0 on 20.09 with i3+xfce. It does not reproduce on master.

Workaround for 20.09:

  systemd.services.display-manager.after = [ "systemd-udev-settle.service" ];

peterhoeg commented 3 years ago

We have a separate issue it seems. The canonical way of solving this is by using the modprobe@.service to ensure a given module is loaded and then have a dependency on modprobe@drm.service from sddm/gdm. However, we are not installing that unit and it needs to be patched as the path to modprobe is in /sbin. Do you mind trying with a dummy module that does that and then introduce a dependency on that instead?
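
A minimal sketch of the dependency being proposed, assuming the `modprobe@.service` template is actually installed and its modprobe path has been fixed for NixOS (which, per the comment above, is not yet the case):

```nix
{
  systemd.services.display-manager = {
    # Instantiate the modprobe@.service template for the "drm" module
    # and only start the display manager once that job has finished.
    wants = [ "modprobe@drm.service" ];
    after = [ "modprobe@drm.service" ];
  };
}
```

Note that this only guarantees the module load has been attempted; as raised earlier in the thread, a loaded module does not necessarily mean the device node is available yet.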

peterhoeg commented 3 years ago

Another thing that would be fun to try as I don't think it's the udev settle thing that does it - we are dealing with a race condition here (see https://github.com/sddm/sddm/issues/1316) and I think that udev-settle just takes long enough for the devices to be in place.

Instead of your workaround, can you try adding systemd.services.display-manager.serviceConfig.ExecStartPre = "/bin/sh -c 'sleep 3'"; to see if any delay will work. My guess is that it will.

rnhmjoj commented 3 years ago

Pinging openvswitch maintainers/users: @ts468 @netixx @risicle @nspin @kmcopper Does anyone know, or can anyone test, whether openvswitch (ovsdb.service) works without udev-settle? Thanks.

rnhmjoj commented 3 years ago

Finally, GDM appears to be working without udev-settle. We're now left with only three modules:

  1. lock-kernel-modules.nix: probably the only service where udev-settle might be somewhat appropriate.
  2. zfs.nix: this is up to upstream to fix (openzfs/zfs#100891)
  3. openvswitch.nix: @edude03, can you take a look, please?

peterhoeg commented 3 years ago

Regarding lock-kernel-modules.nix, I see a few options:

  1. simply run udevadm settle as part of the script run by the disable-kernel-module-loading unit
  2. turn it into a timer that runs a configurable time after boot. An admin of a given system would say "90 seconds after boot" is when it should lock down.
  3. introduce a new target that we're booting to instead of the regular multi-user.target that pulls in systemd-udev-settle after the "normal" target

Option 1 seems like the most straight-forward option to me.
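
Option 1 could be sketched roughly as follows. The unit name `example-lock-kernel-modules` is hypothetical (chosen so it does not merge with the real module's unit), and writing to `/proc/sys/kernel/modules_disabled` stands in for whatever the actual script does; treat this as an illustration, not the module's implementation.

```nix
{ config, ... }:
{
  # Hypothetical stand-in for the disable-kernel-module-loading unit.
  systemd.services.example-lock-kernel-modules = {
    description = "Disable kernel module loading (sketch)";
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
    };
    script = ''
      # Wait for the udev event queue to drain, so that modprobes
      # triggered by pending events have a chance to finish first.
      ${config.systemd.package}/bin/udevadm settle

      # Then forbid any further module loading until the next reboot.
      echo 1 > /proc/sys/kernel/modules_disabled
    '';
  };
}
```

Because the wait happens inside this one service, nothing pulls systemd-udev-settle.service into the boot, and only this unit is delayed.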

rnhmjoj commented 3 years ago

A timer is also pretty simple to set up. Something like this should work:

  systemd.timers.disable-kernel-module-loading = {
    wantedBy = [ "timers.target" ];
    timerConfig.OnBootSec = cfg.timeout;
  };

The problem is that, unless the user picks an appropriate value, it could lock the system too early. So yes, option 1 is the safest, I think.

peterhoeg commented 3 years ago

The problem is that, unless the user picks an appropriate value, it could lock the system too early. So yes, option 1 is the safest, I think.

Option 1 gives us the behaviour closest to the current situation, that's for sure. I can also see a case for the timer option but if the purpose is to get rid of the warning, then option 1 is a no-brainer.

stale[bot] commented 2 years ago

I marked this as stale due to inactivity.

rnhmjoj commented 2 years ago

This is still relevant, but not a big deal: the only remaining items are openvswitch and zfs. The former seems unmaintained in Nixpkgs and the latter still needs to be fixed by upstream.

peterhoeg commented 2 years ago

We could handle this by having a NixOS option that disables systemd-udev-settle.service by default unless zfs or openvswitch are enabled.
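
One way this idea might be expressed, as a hedged sketch: it assumes `boot.zfs.enabled` and `virtualisation.vswitch.enable` are the right options to test, and that setting `enable = false` on a unit masks it in NixOS.

```nix
{ config, lib, ... }:
{
  # Keep systemd-udev-settle.service only when one of the remaining
  # consumers is in use; mkDefault lets users override the decision.
  systemd.services.systemd-udev-settle.enable = lib.mkDefault (
    config.boot.zfs.enabled || config.virtualisation.vswitch.enable
  );
}
```

Any third-party unit that still requires the settle service would fail to start once it is masked, so such a default would need a clear way to opt back in.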

rnhmjoj commented 2 years ago

That or not installing the unit at all and adding a preStart="udevadm settle"; where needed, instead.

pmarreck commented 2 years ago

Is there any way to reduce the amount of time it waits? I'm running root on ZFS (which is very nice, but makes this check futile, since the pools are obviously already mounted at that point or it wouldn't be booting) and it adds 7 seconds to my boot time. It is currently also the most time-consuming thing in my boot:

❯ systemd-analyze blame
7.101s systemd-udev-settle.service

Attempting to mask it in my configuration results in this, probably due to the zfs dependency:

Failed to start zfs-import.target: Unit systemd-udev-settle.service is masked.
warning: error(s) occurred while switching to the new configuration

I will ping the other ticket on openzfs as well.

peterhoeg commented 2 years ago

As a quick fix, how about just overriding it:

{
  systemd.services.zfs-import-cache = {
    requires = [ "" "whatever-it-does-require.service" ];
  };
}

The trick is the "" which clears out whatever we had before.
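
In NixOS module options, list values coming from different definitions are normally merged, so replacing the list with `lib.mkForce` may be the more reliable way to drop an inherited dependency. The sketch below reuses the unit name and the placeholder dependency from the comment above; whether the settle dependency actually lives in `requires`, `wants` or `after` is an assumption that should be checked against the generated unit.

```nix
{ lib, ... }:
{
  systemd.services.zfs-import-cache = {
    # mkForce discards the values set elsewhere instead of merging
    # with them, so every dependency that is still needed must be
    # listed again here.
    requires = lib.mkForce [ "whatever-it-does-require.service" ];
    after = lib.mkForce [ "whatever-it-does-require.service" ];
  };
}
```

Running `systemctl cat` on the unit after a rebuild shows whether systemd-udev-settle.service is really gone from it.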

pmarreck commented 2 years ago

@peterhoeg I mean... that might work, although it looks pretty hacky. I do use an L2ARC on an NVMe drive, would that prevent a rebuild of the L2ARC cache after reboot? (i.e., the persistence feature that was introduced relatively recently in openzfs)

peterhoeg commented 2 years ago

While I have zfs on 2 machines, I simply don't know enough to be able to say anything remotely intelligent about your case. You could try having zfs-import-cache do wants = [ "whatever-devices" ]; and see if that does the trick. But obviously, testing in production this way isn't great.

rnhmjoj commented 1 year ago

I'm closing this, as zfs is the only remaining service and it is up to upstream to fix it.