dracutdevs / dracut

dracut the event driven initramfs infrastructure
https://github.com/dracutdevs/dracut/wiki
GNU General Public License v2.0

systemd network-online.target not stopped before switching root #1376

Open pjgeorg opened 3 years ago

pjgeorg commented 3 years ago

Describe the bug systemd's network-online.target is reached within the initrd (when using the systemd and network-manager modules) and is not stopped before switching root. As a result, systemd services in the "real" root start immediately even though they require network connections to be brought up (i.e. configured as described here). This is a problem because, even when networking has already been brought up in the initrd, often only some of the network interfaces are up. E.g. the initrd often brings up only the one network interface required to PXE boot, while the public network interface is brought up after switch root. Services that require the public network interface therefore simply fail.

Current work-around: Instead of specifying network-online.target for Wants= and After= use NetworkManager-wait-online.service
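A minimal sketch of this work-around as a systemd drop-in, so the packaged unit file does not have to be edited. The helper name `write_wait_online_dropin`, the unit name `example.service`, and the file name `10-wait-online.conf` are hypothetical; the drop-in only adds the extra dependency and leaves any existing network-online.target entries in place.

```shell
# Hypothetical helper: write a drop-in that makes a unit wait for
# NetworkManager-wait-online.service instead of relying only on
# network-online.target.
#   $1: unit name (e.g. example.service, a placeholder)
#   $2: drop-in root directory (assumed default: /etc/systemd/system)
write_wait_online_dropin() {
    unit="$1"
    root="${2:-/etc/systemd/system}"
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/10-wait-online.conf" <<'EOF'
[Unit]
Wants=NetworkManager-wait-online.service
After=NetworkManager-wait-online.service
EOF
}
```

Assumed usage, as root: `write_wait_online_dropin example.service && systemctl daemon-reload`.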

Distribution used CentOS Stream 8

Dracut version 049-135.git20210121.el8

Init system systemd

To Reproduce Enable a systemd service requiring (Wants= and After=) network-online.target

Expected behavior systemd starts the service after network is "online", i.e. NetworkManager-wait-online.service (part of network-online.target) has been started.

Additional context

haraldh commented 3 years ago

CC @dustymabe

haraldh commented 3 years ago

CC @lnykryn

jlebon commented 3 years ago

(@dustymabe pointed me to this issue)

> systemd network-online.target is reached within initrd (when using systemd and network-manager modules) and not stopped before switching root.

Hmm, that's definitely odd. At switchroot, the network-online.target should be stopped just like everything else. E.g. here on FCOS:

[core@cosa-devsh ~]$ journalctl --grep '(Network is Online|Switching root)'
-- Journal begins at Wed 2021-04-21 14:27:35 UTC, ends at Wed 2021-04-21 14:33:33 UTC. --
Apr 21 14:27:37 cosa-devsh systemd[1]: Reached target Network is Online.
Apr 21 14:28:02 cosa-devsh systemd[1]: Stopped target Network is Online.
Apr 21 14:28:03 cosa-devsh systemd[1]: Switching root.
Apr 21 14:28:04 cosa-devsh systemd[1]: Switching root.
Apr 21 14:28:07 cosa-devsh systemd[1]: Reached target Network is Online.

Would be useful to compare to the output of this command on your machine.

Hmm, I wonder if what's going on here is that NetworkManager-wait-online.service succeeds immediately because networking is still up from the initrd phase? But if you're defining additional connections which should be brought up by NM, then I would have expected that service to wait for NM to do that. It would help to have NM devs look at this too.

pjgeorg commented 3 years ago

@jlebon Sorry, forgot to include logs in the initial bug report:

ssh qpz-node-004 "journalctl --grep '(Network is Online|Switching root|Network Manager Wait)'"
-- Logs begin at Wed 2021-04-21 17:32:06 CEST, end at Wed 2021-04-21 17:40:18 CEST. --
Apr 21 17:33:12 localhost systemd[1]: Reached target Network is Online.
Apr 21 17:33:12 localhost systemd[1]: Switching root.
Apr 21 17:33:15 qpz-node-004 systemd[1]: Starting Network Manager Wait Online...
Apr 21 17:33:22 qpz-node-004 systemd[1]: Started Network Manager Wait Online.

I added "Network Manager Wait" to your grep pattern to show that not all network interfaces are "online" yet at that point.

So it seems that, for some reason, network-online.target is not stopped in my case.

Note: I'm booting this system via PXE, i.e. sysroot is mounted via NFS. Not sure whether this has anything to do with this particular issue.
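To make the comparison between the two journals above less error-prone, here is a rough sketch of a filter that reads journal lines on stdin and reports whether a "Stopped target Network is Online" message appears before the first "Switching root" message. The function name `check_online_target_stopped` and the output strings are made up for illustration; the matched message texts are the ones shown in the logs in this thread.

```shell
# Hypothetical helper: read journal lines on stdin and report whether
# network-online.target was stopped before the first switch-root message.
check_online_target_stopped() {
    awk '
        /Stopped target Network is Online/ { stopped = 1 }
        /Switching root/ {
            # Decide at the first switch-root message, then stop reading.
            print (stopped ? "stopped before switch-root" : "NOT stopped before switch-root")
            found = 1
            exit
        }
        END { if (!found) print "no switch-root message found" }
    '
}
```

Assumed usage: `journalctl -b --grep '(Network is Online|Switching root)' | check_online_target_stopped`.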