skiffos / SkiffOS

Any Linux distribution, anywhere.
https://skiffos.com
MIT License
707 stars 52 forks source link

next: systemd fails to connect to system bus #52

Closed paralin closed 6 years ago

paralin commented 6 years ago

This may be causing issues with NetworkManager as well. NetworkManager seems to not be marked as started up (it uses dbus) so it is restarted continuously, messing up networking.

Systemd Version

[    7.659660] systemd[1]: systemd 237 running in system mode. (+PAM -AUDIT +SELINUX -IMA -APPARMOR -SMACK +SYSVINIT +UTMP -LIBCRYPTSETUP +GCRYPT -GNUTLS -ACL +XZ +LZ4 -SECCOMP +BLKID -ELFUTILS +KMOD -IDN2 -IDN -P

Other Versions

Welcome to SkiffOS f980293!

Logs

Feb 14 15:30:28 skiff-4c2202a dbus-daemon[317]: [system] Activating systemd to hand-off: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.0' (uid=0 pid=318 comm="/usr/sbin/NetworkManager --no-daemon ")
Feb 14 15:30:28 skiff-4c2202a mount-all.sh[165]: Build date of Wed Feb 14 15:30:24 EST 2018 is newer than internal clock of Sun Jan 28 15:58:20 UTC 2018. Bumping internal clock to build date.
Feb 14 15:30:27 skiff-4c2202a systemd[1]: System is tainted: var-run-bad
Feb 14 15:30:27 skiff-4c2202a systemd[1]: Starting Network Manager...
Feb 14 15:30:27 skiff-4c2202a NetworkManager[318]: <info>  [1518640227.9720] NetworkManager (version 1.10.2) is starting... (for the first time)
Feb 14 15:30:27 skiff-4c2202a NetworkManager[318]: <info>  [1518640227.9724] Read config: /etc/NetworkManager/NetworkManager.conf
Feb 14 15:30:27 skiff-4c2202a NetworkManager[318]: <info>  [1518640227.9998] manager[0xc7f030]: monitoring kernel firmware directory '/lib/firmware'.
Feb 14 15:30:53 skiff-4c2202a NetworkManager[318]: <info>  [1518640253.0313] hostname: hostname: hostnamed not used as proxy creation failed with: Error calling StartServiceByName for org.freedesktop.hostname1: Timeout was reached
Feb 14 15:30:53 skiff-4c2202a dbus-daemon[317]: [system] Failed to activate service 'org.freedesktop.hostname1': timed out (service_start_timeout=25000ms)
Feb 14 15:30:53 skiff-4c2202a NetworkManager[318]: <info>  [1518640253.0317] hostname: hostname changed from (none) to "skiff-4c2202a"
Feb 14 15:30:53 skiff-4c2202a kernel: NET: Registered protocol family 10
Feb 14 15:30:53 skiff-4c2202a dbus-daemon[317]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Feb 14 15:30:53 skiff-4c2202a NetworkManager[318]: <info>  [1518640253.0331] dns-mgr[0xc77ac8]: init: dns=default, rc-manager=symlink
Feb 14 15:30:53 skiff-4c2202a dbus-daemon[317]: [system] Activating systemd to hand-off: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.0' (uid=0 pid=318 comm="/usr/sbin/NetworkManager --no-daemon ")
Feb 14 15:30:53 skiff-4c2202a NetworkManager[318]: <info>  [1518640253.0355] manager[0xc7f030]: rfkill: WiFi hardware radio set enabled
Feb 14 15:30:53 skiff-4c2202a NetworkManager[318]: <info>  [1518640253.0357] manager[0xc7f030]: rfkill: WWAN hardware radio set enabled
Feb 14 15:31:18 skiff-4c2202a dbus-daemon[317]: [system] Failed to activate service 'org.freedesktop.nm_dispatcher': timed out (service_start_timeout=25000ms)
Feb 14 15:31:18 skiff-4c2202a dbus-daemon[317]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <error> [1518640278.1544] dispatcher: could not get dispatcher proxy! Error calling StartServiceByName for org.freedesktop.nm_dispatcher: GDBus.Error:org.freedesktop.DBus.Error.TimedOut: Failed to activate service 'org.freedesktop.nm_dispatcher': timed out (service_start_timeout=25000ms)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.1550] settings: loaded plugin keyfile: (c) 2007 - 2016 Red Hat, Inc.  To report bugs please use the NetworkManager mailing list.
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <warn>  [1518640278.1552] settings: could not load plugin 'wifi' from file '/usr/lib/NetworkManager/libnm-settings-plugin-wifi.so': No such file or directory
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.1573] settings: loaded plugin iBFT: (c) 2014 Red Hat, Inc.  To report bugs please use the NetworkManager mailing list. (/usr/lib/NetworkManager/libnm-settings-plugin-ibft.so)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <warn>  [1518640278.1575] settings: could not load plugin 'adsl' from file '/usr/lib/NetworkManager/libnm-settings-plugin-adsl.so': No such file or directory
Feb 14 15:31:18 skiff-4c2202a kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <warn>  [1518640278.1592] keyfile: error loading connection from file /etc/NetworkManager/system-connections/readme: invalid connection: connection.type: property is missing
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.2562] keyfile: new connection /etc/NetworkManager/system-connections/IP Daily 5G (90291557-93f8-4bae-aaa8-a1c528d32bbc,"IP Daily 5G")
Feb 14 15:31:18 skiff-4c2202a kernel: smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Feb 14 15:31:18 skiff-4c2202a kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3505] keyfile: new connection /etc/NetworkManager/system-connections/IP Daily (90291557-93f8-4bae-aaa8-a1c528d32bbb,"IP Daily")
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3600] manager: rfkill: WiFi enabled by radio killswitch; enabled by state file
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3603] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3604] manager: Networking is enabled by state file
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3608] dhcp-init: default DHCP client 'dhclient' is not available
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3611] dhcp-init: Using DHCP client 'dhcpcd'
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3613] Loaded device plugin: NMBondDeviceFactory (internal)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3615] Loaded device plugin: NMBridgeDeviceFactory (internal)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3616] Loaded device plugin: NMDummyDeviceFactory (internal)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3617] Loaded device plugin: NMEthernetDeviceFactory (internal)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3618] Loaded device plugin: NMInfinibandDeviceFactory (internal)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3619] Loaded device plugin: NMIPTunnelDeviceFactory (internal)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3621] Loaded device plugin: NMMacsecDeviceFactory (internal)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3622] Loaded device plugin: NMMacvlanDeviceFactory (internal)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3623] Loaded device plugin: NMPppDeviceFactory (internal)
Feb 14 15:31:19 skiff-4c2202a kernel: random: crng init done
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3624] Loaded device plugin: NMTunDeviceFactory (internal)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3625] Loaded device plugin: NMVethDeviceFactory (internal)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3626] Loaded device plugin: NMVlanDeviceFactory (internal)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3627] Loaded device plugin: NMVxlanDeviceFactory (internal)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3648] Loaded device plugin: NMAtmManager (/usr/lib/NetworkManager/libnm-device-plugin-adsl.so)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3671] Loaded device plugin: NMWifiFactory (/usr/lib/NetworkManager/libnm-device-plugin-wifi.so)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3722] device (lo): carrier: link connected
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3776] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3884] manager: (eth0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2)
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.3967] keyfile: add connection in-memory (27c3fad5-1367-335c-905b-5df61bc50c1d,"Wired connection 1")
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.4001] settings: (eth0): created default wired connection 'Wired connection 1'
Feb 14 15:31:18 skiff-4c2202a NetworkManager[318]: <info>  [1518640278.4086] device (eth0): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Feb 14 15:31:23 skiff-4c2202a NetworkManager[318]: <info>  [1518640283.5457] manager: startup complete
Feb 14 15:31:58 skiff-4c2202a systemd[1]: NetworkManager.service: Start operation timed out. Terminating.
Feb 14 15:31:58 skiff-4c2202a NetworkManager[318]: <info>  [1518640318.1074] caught SIGTERM, shutting down normally.
Feb 14 15:31:58 skiff-4c2202a NetworkManager[318]: <info>  [1518640318.1172] exiting (success)
Feb 14 15:31:58 skiff-4c2202a systemd[1]: NetworkManager.service: Failed with result 'timeout'.
Feb 14 15:31:58 skiff-4c2202a systemd[1]: Failed to start Network Manager.
Feb 14 15:31:58 skiff-4c2202a systemd[1]: Reached target Network.
Feb 14 15:31:58 skiff-4c2202a systemd[1]: Starting Permit User Sessions...
Feb 14 15:31:58 skiff-4c2202a systemd[1]: Reached target Network is Online.
Feb 14 15:31:59 skiff-4c2202a NetworkManager[417]: <info>  [1518640319.1123] NetworkManager (version 1.10.2) is starting... (after a restart)
Feb 14 15:31:59 skiff-4c2202a dbus-daemon[317]: [system] Activating systemd to hand-off: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.2' (uid=0 pid=417 comm="/usr/sbin/NetworkManager --no-daemon ")
Feb 14 15:31:59 skiff-4c2202a NetworkManager[417]: <info>  [1518640319.1126] Read config: /etc/NetworkManager/NetworkManager.conf
Feb 14 15:32:24 skiff-4c2202a dbus-daemon[317]: [system] Failed to activate service 'org.freedesktop.hostname1': timed out (service_start_timeout=25000ms)
Feb 14 15:32:24 skiff-4c2202a dbus-daemon[317]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Feb 14 15:32:24 skiff-4c2202a NetworkManager[417]: <info>  [1518640344.1621] hostname: hostname: hostnamed not used as proxy creation failed with: Error calling StartServiceByName for org.freedesktop.hostname1: GDBus.Error:org.freedesktop.DBus.Error.TimedOut: Failed to activate service 'org.freedesktop.hostname1': timed out (service_start_timeout=25000ms)
Feb 14 15:32:24 skiff-4c2202a NetworkManager[417]: <info>  [1518640344.1625] hostname: hostname changed from (none) to "skiff-4c2202a"
...
paralin commented 6 years ago

I found a relevant email from the mailing list:

When I updated systemd from 234 to 237 and D-Bus from 1.12.0 to 1.12.2
on my system with a read-only rootfs (buildroot 1c0c55c028 to buildroot
27d2229692), my D-Bus-activated services stopped working. Checking the
logs, it turned out that systemd PID 1 was not able to connect the D-Bus
socket at all. There were some recent changes in systemd where upstream
refactored code which waits for sockets to appear. However, the real
problem is that systemd is configured to look for the D-Bus socket in
/run/dbus, while D-Bus creates it at /var/run/dbus/. D-Bus upstream
explains in a bugreport [1] that this "traditional" /var/run/dbus is
going to stay because it's hardcoded in other independent
implementations of the D-Bus APIs.

As is also said in that bugreport, the root cause is that /run and
/var/run are effectively two separate directories on Buildroot -- at
least when configured for a R/O rootfs. Furthermore, systemd actually
actively warns about this:

  systemd[1]: System is tainted: var-run-bad

Looking further, systemd-tmpfiles also detects breakage:

  systemd-tmpfiles[172]: [/usr/lib/tmpfiles.d/var.conf:12] Duplicate line for path "/var/run", ignoring.
  systemd-tmpfiles[172]: [/usr/lib/tmpfiles.d/var.conf:14] Duplicate line for path "/var/log", ignoring.
  systemd-tmpfiles[172]: [/usr/lib/tmpfiles.d/var.conf:21] Duplicate line for path "/var/lib", ignoring.

This change simply skips /var/run from being copied from the
/usr/share/factory. The symlink is still created by another tmpfiles.d
entry which belongs to systemd.

The other warnings are still in present:

  systemd-tmpfiles[174]: [/usr/lib/tmpfiles.d/var.conf:14] Duplicate line for path "/var/log", ignoring.
  systemd-tmpfiles[174]: [/usr/lib/tmpfiles.d/var.conf:21] Duplicate line for path "/var/lib", ignoring.

I'm leaving that one to someone who is more familiar with systemd and
buildroot conventions. My box now boots again, so I'm happy :).

[1] https://bugs.freedesktop.org/show_bug.cgi?id=101628

Signed-off-by: Jan Kundrát <jan.kundrat@cesnet.cz>
---
paralin commented 6 years ago
[   15.821143] systemd[1]: System is tainted: var-run-bad
[   15.850847] systemd[1]: Started D-Bus System Message Bus.
[   15.880658] systemd[1]: Failed to connect to system bus: No such file or directory
[   15.910663] systemd[1]: Failed to initialize D-Bus connection: No such file or directory
paralin commented 6 years ago

After applying the patch: Unfortunately, it seems the problem is still occurring. Maybe it's a different problem?

 [   16.650294] systemd[1]: System is tainted: var-run-bad
 [   16.680724] systemd[1]: Starting Network Connectivity...
 [   16.710421] systemd[1]: Started D-Bus System Message Bus.
 [   16.740678] systemd[1]: Failed to connect to system bus: No such file or directory
 [   16.770425] systemd[1]: Failed to initialize D-Bus connection: No such file or directory
 [   16.800386] systemd[1]: Starting Network Manager...

 [1518648890.7497] hostname: hostname: hostnamed not used as proxy creation failed

 [   39.020736] dbus-daemon[314]: [system] Failed to activate service 'org.freedesktop.systemd1': timed
 out
 [   39.100451] dbus-daemon[314]: [system] Activating systemd to hand-off: service
 name='org.freedesktop.nm
 _dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.0' (uid=0 pid=315
 comm="/u
 sr/sbin/NetworkManager --no-daemon ")                                                                    
 [   39.150403] NetworkManager[315]: <info>  [1518648890.7540] manager[0x20ac030]: rfkill: WWAN hardware

It seems that systemd still cannot talk to the system bus.

I can also see this happening if I try to execute systemd-hostnamed:

 # ./systemd-hostnamed 
 Failed to get system bus connection: No such file or directory

 # ls /var/run/
 NetworkManager  dhcpcd.pid          docker       ifstate   utmp
 dbus            dhcpcd.sock         docker.pid   sepermit
 dhcpcd          dhcpcd.unpriv.sock  docker.sock  sshd.pid

 # ls /var/run/dbus
 system_bus_socket

 # ls /run/
 blkid   lock  mount     systemd  user  xtables.lock
 docker  log   ntpd.pid  udev     utmp

If I link dbus like so:

 # ln -s /var/run/dbus/ /run/dbus

It seems then that everything works fine.

paralin commented 6 years ago

Fixed

LorenzoAncora commented 5 years ago

If I link dbus like so:

ln -s /var/run/dbus/ /run/dbus

It seems then that everything works fine.

This solution fixes the issue also on Debian armhf.

insaner commented 5 years ago

ln -s /run/dbus /var/run/dbus

Fixed it for me on fedora 27. And my hair is now starting to grow back!

paralin commented 5 years ago

@insaner glad to hear it fixed for you, check out Skiff and Buildroot while you're passing by!

insaner commented 5 years ago

Wow, cool! That is definitely something of interest to me! Especially the cross-compiling part.. always something I've had to jump through acrobatic hoops to accomplish. Things like repackaging glib so I can have both the 32bit and 64bit installed side-by-side in Fedora just to compile my kernel into 64bit. Thanks for the heads up, I will be checking it out!

paralin commented 5 years ago

@insaner I'm interested now in developing out the desktop story for Skiff. Having a lightweight, custom tuned kernel for the hardware + a VM engine (KVM) to run the userspace (Windows, Fedora, or more Skiff containers) is by my estimation a better way to build an OS than a heavy core. This is because the lightweight core can be swapped out at runtime without reboot, and can be updated independently from the userspace libraries.

Something that would be cool to look at is customizing Manjaro's "architect" tool to install a micro-system like Skiff. (If anyone is interested in taking a crack at it!)

insaner commented 5 years ago

That does indeed sound amazing, and is something I have thought for at least a decade (if not 2) that it would be amazing to swap out the kernel at runtime.. without breaking anything Keep me posted.. I have a hard time adopting new technologies (workflow rigidity) but that sounds like a great project and idea!