paralin closed 6 years ago
I found a relevant email from the mailing list:
When I updated systemd from 234 to 237 and D-Bus from 1.12.0 to 1.12.2
on my system with a read-only rootfs (buildroot 1c0c55c028 to buildroot
27d2229692), my D-Bus-activated services stopped working. Checking the
logs, it turned out that systemd PID 1 was not able to connect to the D-Bus
socket at all. There were some recent changes in systemd where upstream
refactored code which waits for sockets to appear. However, the real
problem is that systemd is configured to look for the D-Bus socket in
/run/dbus, while D-Bus creates it at /var/run/dbus/. D-Bus upstream
explains in a bugreport [1] that this "traditional" /var/run/dbus is
going to stay because it's hardcoded in other independent
implementations of the D-Bus APIs.
As is also said in that bugreport, the root cause is that /run and
/var/run are effectively two separate directories on Buildroot -- at
least when configured for a R/O rootfs. Furthermore, systemd actually
actively warns about this:
systemd[1]: System is tainted: var-run-bad
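A quick way to confirm the situation described above is to check whether the two paths resolve to the same directory. This is only a sketch: `same_dir` is a hypothetical helper name, and it assumes a `readlink` that supports `-f` (GNU coreutils and BusyBox both do).

```shell
#!/bin/sh
# same_dir A B: succeed only if both paths resolve to the same location.
# (Hypothetical helper, not part of systemd or Buildroot.)
same_dir() {
    a=$(readlink -f "$1") || return 1
    b=$(readlink -f "$2") || return 1
    [ -n "$a" ] && [ "$a" = "$b" ]
}

if same_dir /var/run /run; then
    echo "/var/run resolves to /run"
else
    echo "/var/run and /run are separate directories"
fi
```

If the second message is printed, anything written under /var/run (such as the D-Bus socket) will not be visible under /run.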
Looking further, systemd-tmpfiles also detects breakage:
systemd-tmpfiles[172]: [/usr/lib/tmpfiles.d/var.conf:12] Duplicate line for path "/var/run", ignoring.
systemd-tmpfiles[172]: [/usr/lib/tmpfiles.d/var.conf:14] Duplicate line for path "/var/log", ignoring.
systemd-tmpfiles[172]: [/usr/lib/tmpfiles.d/var.conf:21] Duplicate line for path "/var/lib", ignoring.
This change simply skips /var/run from being copied from the
/usr/share/factory. The symlink is still created by another tmpfiles.d
entry which belongs to systemd.
The other warnings are still present:
systemd-tmpfiles[174]: [/usr/lib/tmpfiles.d/var.conf:14] Duplicate line for path "/var/log", ignoring.
systemd-tmpfiles[174]: [/usr/lib/tmpfiles.d/var.conf:21] Duplicate line for path "/var/lib", ignoring.
I'm leaving that one to someone who is more familiar with systemd and
buildroot conventions. My box now boots again, so I'm happy :).
[1] https://bugs.freedesktop.org/show_bug.cgi?id=101628
Signed-off-by: Jan Kundrát <jan.kundrat@cesnet.cz>
---
[ 15.821143] systemd[1]: System is tainted: var-run-bad
[ 15.850847] systemd[1]: Started D-Bus System Message Bus.
[ 15.880658] systemd[1]: Failed to connect to system bus: No such file or directory
[ 15.910663] systemd[1]: Failed to initialize D-Bus connection: No such file or directory
After applying the patch: Unfortunately, it seems the problem is still occurring. Maybe it's a different problem?
[ 16.650294] systemd[1]: System is tainted: var-run-bad
[ 16.680724] systemd[1]: Starting Network Connectivity...
[ 16.710421] systemd[1]: Started D-Bus System Message Bus.
[ 16.740678] systemd[1]: Failed to connect to system bus: No such file or directory
[ 16.770425] systemd[1]: Failed to initialize D-Bus connection: No such file or directory
[ 16.800386] systemd[1]: Starting Network Manager...
[1518648890.7497] hostname: hostname: hostnamed not used as proxy creation failed
[ 39.020736] dbus-daemon[314]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
[ 39.100451] dbus-daemon[314]: [system] Activating systemd to hand-off: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.0' (uid=0 pid=315 comm="/usr/sbin/NetworkManager --no-daemon ")
[ 39.150403] NetworkManager[315]: <info> [1518648890.7540] manager[0x20ac030]: rfkill: WWAN hardware
It seems that systemd still cannot talk to the system bus.
I can also see this happening if I try to execute systemd-hostnamed:
# ./systemd-hostnamed
Failed to get system bus connection: No such file or directory
# ls /var/run/
NetworkManager dhcpcd.pid docker ifstate utmp
dbus dhcpcd.sock docker.pid sepermit
dhcpcd dhcpcd.unpriv.sock docker.sock sshd.pid
# ls /var/run/dbus
system_bus_socket
# ls /run/
blkid lock mount systemd user xtables.lock
docker log ntpd.pid udev utmp
If I link dbus like so:
# ln -s /var/run/dbus/ /run/dbus
It seems then that everything works fine.
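The workaround above can be wrapped so that it is safe to run more than once. This is a minimal sketch, not a proper fix: `link_dbus_dir` is a hypothetical helper name, and if /run is a tmpfs (as is usual under systemd) the link would presumably have to be recreated on every boot.

```shell
#!/bin/sh
# link_dbus_dir SOURCE TARGET: create TARGET as a symlink to SOURCE, but only
# when SOURCE is an existing directory and TARGET does not exist yet.
# (Hypothetical helper for illustration.)
link_dbus_dir() {
    src=$1
    dst=$2
    if [ ! -d "$src" ]; then
        echo "no directory at $src; nothing to link" >&2
        return 1
    fi
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        echo "$dst already exists; leaving it alone" >&2
        return 0
    fi
    ln -s "$src" "$dst"
}

# Usage on the affected system (as root), mirroring the command above:
#   link_dbus_dir /var/run/dbus /run/dbus
```

The existence checks keep the helper from nesting a second link inside an already linked directory, which a bare `ln -s` would do when run twice.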
Fixed
This solution also fixes the issue on Debian armhf.
ln -s /run/dbus /var/run/dbus
Fixed it for me on Fedora 27. And my hair is now starting to grow back!
@insaner glad to hear it fixed for you, check out Skiff and Buildroot while you're passing by!
Wow, cool! That is definitely something of interest to me, especially the cross-compiling part. That has always been something I've had to jump through acrobatic hoops to accomplish: things like repackaging glib so I can have both the 32-bit and 64-bit versions installed side by side in Fedora, just to compile my kernel as 64-bit. Thanks for the heads up, I will be checking it out!
@insaner I'm interested now in developing out the desktop story for Skiff. Having a lightweight, custom tuned kernel for the hardware + a VM engine (KVM) to run the userspace (Windows, Fedora, or more Skiff containers) is by my estimation a better way to build an OS than a heavy core. This is because the lightweight core can be swapped out at runtime without reboot, and can be updated independently from the userspace libraries.
Something that would be cool to look at is customizing Manjaro's "architect" tool to install a micro-system like Skiff. (If anyone is interested in taking a crack at it!)
That does indeed sound amazing. Swapping out the kernel at runtime without breaking anything is something I have thought would be amazing for at least a decade (if not two). Keep me posted. I have a hard time adopting new technologies (workflow rigidity), but that sounds like a great project and idea!
This may be causing issues with NetworkManager as well. NetworkManager seems to not be marked as started up (it uses dbus) so it is restarted continuously, messing up networking.