Open lf- opened 9 months ago
Repros in 254 also. I have pushed a new commit to the gist reproducer to use the updated NixOS version with systemd 254. Let me know if you need further details.
[root@nixos:~]# systemctl list-jobs
JOB UNIT                                 TYPE  STATE
223 systemd-networkd-wait-online.service start running
130 multi-user.target                    start waiting
222 network-online.target                start waiting

3 jobs listed.
[root@nixos:~]# systemctl --version
systemd 254 (254.3)
+PAM +AUDIT -SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +BPF_FRAMEWORK -XKBCOMMON +UTMP -SYSVINIT default-hierarchy=unified
If you configure every interface as not required for online, then why don't you disable systemd-networkd-wait-online.service? Enabling wait-online with such a configuration is completely meaningless.
I guess, this is 'caused' by 2f96a29c2c55bdd67cdd8e0b0cfd6971968e4bca (v254, backported to v253 as https://github.com/systemd/systemd-stable/commit/abbd24e8a51d9b6ffcf99c8cfe89d9faba23ebdb (v253.6)). I think we cannot fix your 'issue' without re-introducing issue #27822.
Agh, of course, I half guessed there was such a reason. This is then a NixOS bug, imo. Let me go file a PR there to turn off the wait-online service if nothing is configured.
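For reference, a minimal sketch of what turning off the wait-online service could look like on the NixOS side. It assumes the `systemd.network.wait-online.enable` option is available in the nixpkgs revision in use, which is not something this thread confirms:

```nix
# Sketch only: a NixOS module fragment that disables
# systemd-networkd-wait-online.service entirely.
# Assumption: systemd.network.wait-online.enable exists in your nixpkgs.
{ ... }:
{
  systemd.network.wait-online.enable = false;
}
```

With this set, network-online.target no longer blocks on wait-online, so units ordered after it may start before any interface is actually configured.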
Although: systemd could plausibly notice that there are no possible config files that could make any interface that might appear RequiredForOnline, at least in theory?
Hrmm. OK, NixOS has its own reasons: they want to avoid accidentally adding requirements to be online (https://github.com/nixos/nixpkgs/blob/1f34babe84854576c936969f8a879403be9f2515/nixos/modules/tasks/network-interfaces-systemd.nix#L48-L72). To me, it feels like the way systemd expresses this is insufficiently rich to represent that?
    networks."99-ethernet-default-dhcp" = {
      # We want to match physical ethernet interfaces as commonly
      # found on laptops, desktops and servers, to provide an
      # "out-of-the-box" setup that works for common cases. This
      # heuristic isn't perfect (it could match interfaces with
      # custom names that _happen_ to start with en or eth), but
      # should be good enough to make the common case easy and can
      # be overridden on a case-by-case basis using
      # higher-priority networks or by disabling useDHCP.

      # Type=ether matches veth interfaces as well, and this is
      # more likely to result in interfaces being configured to
      # use DHCP when they shouldn't.

      # When wait-online.anyInterface is enabled, RequiredForOnline really
      # means "sufficient for online", so we can enable it.
      # Otherwise, don't block the network coming online because of default networks.
      matchConfig.Name = ["en*" "eth*"];
      DHCP = "yes";
      linkConfig.RequiredForOnline =
        lib.mkDefault (if initrd
          then config.boot.initrd.systemd.network.wait-online.anyInterface
          else config.systemd.network.wait-online.anyInterface);
      networkConfig.IPv6PrivacyExtensions = "kernel";
    };
When such a catch-all config is installed by default, the --any option for wait-online should be set by default too, and RequiredForOnline= should not be disabled.
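On NixOS, that suggestion appears to correspond to the `anyInterface` option referenced in the quoted module. A minimal sketch, assuming the option also passes --any to wait-online (the quoted excerpt only shows its effect on the RequiredForOnline default):

```nix
# Sketch only: opt in to "any interface is sufficient" semantics.
# Per the quoted module, this also flips the default of
# linkConfig.RequiredForOnline for "99-ethernet-default-dhcp" to true.
# Assumption: the option additionally adds --any to the wait-online command line.
{ ... }:
{
  systemd.network.wait-online.anyInterface = true;
}
```

With this in place the catch-all network is marked RequiredForOnline, so wait-online has at least one candidate interface to wait for.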
> Although: systemd could plausibly notice that there are no possible config files that could make any interface that might appear RequiredForOnline, at least in theory?
Theoretically, yes. However, wait-online does not parse .network files, so networkd would need to expose that information and wait-online would need to monitor it... That sounds like overkill to me.
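Since wait-online does not read .network files, a configuration-side alternative is to mark one specific interface as required explicitly, using a higher-priority network as the comments in the quoted module suggest. A minimal sketch; the network name `10-uplink` and the interface name `enp1s0` are hypothetical placeholders, not taken from the reproducer:

```nix
# Sketch only: a network that sorts before "99-ethernet-default-dhcp"
# and explicitly marks one interface as required for online.
# "10-uplink" and "enp1s0" are hypothetical; substitute your own names.
{ ... }:
{
  systemd.network.networks."10-uplink" = {
    matchConfig.Name = "enp1s0";
    DHCP = "yes";
    # Treat the system as online once this link reaches the routable state.
    linkConfig.RequiredForOnline = "routable";
  };
}
```

This keeps the catch-all default non-blocking for other interfaces while giving wait-online a concrete link to wait for.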
systemd version the issue has been seen with
253.6
Used distribution
NixOS unstable (5148520bfab6, from 2 weeks ago)
Linux kernel version used
6.1.53
CPU architectures issue was seen on
x86_64
Component
systemd-networkd, systemd-networkd-wait-online
Expected behaviour you didn't see
I expect systemd-networkd-wait-online to exit promptly if there are no interfaces present that are required for online.
Unexpected behaviour you saw
systemd-networkd-wait-online waits until it times out.
Steps to reproduce the problem
This was observed in a qemu VM providing an ethernet interface with DHCP.
You can launch an x86_64 VM exhibiting this bug in one command:
See the NixOS config causing it here: https://gist.github.com/lf-/3ad92d06adfec348fa7090e4b5188c51
Based on poking around the code, I believe the necessary condition for this bug is that no interfaces are marked as RequiredForOnline, but I'm not sure.
I have dumped a bunch of the generated configs of the system below, hopefully enough that if you don't want to use Nix, it is still reproducible.
Additional program output to the terminal or log subsystem illustrating the issue