coreos / fedora-coreos-tracker

Issue tracker for Fedora CoreOS
https://fedoraproject.org/coreos/

[rawhide]: network-online.target activates in rawhide #1380

Closed by gursewak1997 1 year ago

gursewak1997 commented 1 year ago

Describe the bug

The kola test ext.config.networking.network-online-service fails because network-online.target is already active by the time the test runs. By default we no longer pull in network-online.target, so it should be inactive.

From the rawhide build's console.txt:

         Starting NetworkManager-di…nager Script Dispatcher Service...
[  OK  ] Listening on systemd-rfkil…l Switch Status /dev/rfkill Watch.
[  OK  ] Started NetworkManager-dis…Manager Script Dispatcher Service.
[  OK  ] Finished NetworkManager-wa… - Network Manager Wait Online.
[  OK  ] Reached target network-online.target - Network is Online.

Reproduction steps

  1. Build with the latest rawhide changes/packages and run systemctl show -p ActiveState network-online.target. It reports that network-online.target is active.
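
To make the check in this step concrete, here is a minimal sketch of the pass/fail logic visible in the kola test trace (`systemctl show` piped into `grep`). It is a reconstruction from the logs, not the test's actual source: the helper names `check_inactive` and `report_state` are hypothetical, and `grep -x` is a stricter match than the `grep -q` the trace shows. The helpers read the property line on stdin so they can be tried without systemd.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not taken from the kola test.

# Succeeds only if stdin contains the exact line "ActiveState=inactive".
check_inactive() {
    grep -qx 'ActiveState=inactive'
}

# Reads `systemctl show -p ActiveState <unit>` output on stdin and prints
# a message modeled on the ones seen in the test logs.
report_state() {
    local unit=$1
    if check_inactive; then
        echo "ok unit ${unit} inactive"
    else
        echo "Unit ${unit} shouldn't be active"
        return 1
    fi
}
```

On a booted system this would be used as `systemctl show -p ActiveState network-online.target | report_state network-online.target`.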

Expected behavior

Unit network-online.target remains inactive

Jan  6 12:58:10.229074 systemd[1]: Started kola-runext-43.service.
Jan  6 12:58:10.229683 kola-runext-network-online-service[6857]: ++ cmdline=($(< /proc/cmdline))
Jan  6 12:58:10.230113 kola-runext-network-online-service[6859]: + grep -q ActiveState=inactive
Jan  6 12:58:10.232389 kola-runext-network-online-service[6858]: + systemctl show -p ActiveState network-online.target
Jan  6 12:58:10.238962 unknown[6853]: Awaiting events
Jan  6 12:58:10.253119 kola-runext-network-online-service[6857]: + ok 'unit network-online.target inactive'
Jan  6 12:58:10.253119 kola-runext-network-online-service[6857]: + echo ok 'unit network-online.target inactive'
Jan  6 12:58:10.253119 kola-runext-network-online-service[6857]: ok unit network-online.target inactive
Jan  6 12:58:10.256667 unknown[6853]: Dispatching kola-runext-43.service

Actual behavior

[2023-01-12T13:02:12.649Z] Jan 12 13:02:08 qemu0 systemd[1]: Started kola-runext-43.service.
[2023-01-12T13:02:12.649Z] Jan 12 13:02:08 qemu0 kola-runext-network-online-service[1831]: ++ cmdline=($(< /proc/cmdline))
[2023-01-12T13:02:12.649Z] Jan 12 13:02:08 qemu0 kola-runext-network-online-service[1833]: + grep -q ActiveState=inactive
[2023-01-12T13:02:12.649Z] Jan 12 13:02:08 qemu0 kola-runext-network-online-service[1832]: + systemctl show -p ActiveState network-online.target
[2023-01-12T13:02:12.649Z] Jan 12 13:02:08 qemu0 kola-runext-network-online-service[1831]: + fatal 'Unit network-online.target shouldn'\''t be active'
[2023-01-12T13:02:12.649Z] Jan 12 13:02:08 qemu0 kola-runext-network-online-service[1831]: + echo 'Unit network-online.target shouldn'\''t be active'
[2023-01-12T13:02:12.649Z] Jan 12 13:02:08 qemu0 kola-runext-network-online-service[1831]: Unit network-online.target shouldn't be active
[2023-01-12T13:02:12.649Z] Jan 12 13:02:08 qemu0 kola-runext-network-online-service[1831]: + exit 1

System details

[rawhide][x86_64]  38.20230112.91.0

Ignition config

No response

Additional information

No response

gursewak1997 commented 1 year ago

From the initial investigation, it looks like changes in the latest fedora-release package activate network-online.target, which causes the test failure. Pinning rawhide to the previous fedora-release package for now.

dustymabe commented 1 year ago

So it's this transition that introduced the issue:

fedora-release-common 38-0.8 -> 38-0.15
fedora-release-coreos 38-0.8 -> 38-0.15
fedora-release-identity-coreos 38-0.8 -> 38-0.15

Seems like one of the commits in the last 7 days introduced some side effect here.

dustymabe commented 1 year ago

Looks like the cause is the enabling of podman-restart.service: it has Wants=network-online.target, so enabling it pulls the target into the boot transaction.
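
For anyone who wants to confirm this kind of dependency, a rough sketch follows. It filters unit file text for directives that pull in network-online.target (as opposed to `After=`, which only orders). The function name `pulls_in_network_online` is hypothetical, and the pattern only covers the simple one-directive-per-line case.

```shell
#!/bin/bash
# Sketch only: the function name is hypothetical.
# Reads unit file text on stdin and prints any Wants=/Requires= line that
# names network-online.target. Ordering-only lines such as After= are ignored.
pulls_in_network_online() {
    grep -E '^(Wants|Requires)=(.* )?network-online\.target( |$)'
}
```

It could be fed with `systemctl cat podman-restart.service | pulls_in_network_online`. Going the other direction, `systemctl list-dependencies --reverse network-online.target` lists the units that depend on the target.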

cc @jlebon on the recommended path forward here since he notably wrote the nice explanation in https://github.com/coreos/fedora-coreos-config/pull/1088

jlebon commented 1 year ago

I think this usage of network-online.target is not ideal, but OK. podman-restart.service is a "leaf" service, so shouldn't be delaying boot in a way that significantly affects the UX (whereas the pinger service also had Before=systemd-user-sessions.service). But still, it'd be nice to strengthen podman semantics so it could be avoided, so I added a comment in https://bugzilla.redhat.com/show_bug.cgi?id=2149642.

jlebon commented 1 year ago

The commit has been reverted. Once we have a new rawhide build of fedora-release, we should be able to revert https://github.com/coreos/fedora-coreos-config/pull/2172.

dustymabe commented 1 year ago

https://github.com/coreos/fedora-coreos-config/pull/2192 closes the loop on this.