Closed agners closed 1 year ago
Do you know where exactly it timed out? We use AT_RANDOM
from auxval exclusively for this reason. This might be in some dependency like sd-bus or libselinux. Maybe we can avoid any getrandom() call.
I doubt that delaying dbus for >90s during boot would be an acceptable experience. So yeah, preferably, we just avoid any of those blocking calls.
From the linked bug-report this seems to be in libexpat. I think we could use XML_SetHashSalt()
to provide AT_RANDOM
as salt for libexpat, and thus avoid blocking on /dev/urandom
reads.
From the linked bug-report this seems to be in libexpat. I think we could use
XML_SetHashSalt()
to provideAT_RANDOM
as salt for libexpat, and thus avoid blocking on/dev/urandom
reads.
I am not 100% sure about libexpat being the culprit, but it seems very likely.
Thanks for looking into this! I'll test the patch you've created on the platform which is affected, then we know for sure if that was indeed the culprit!
Even with #320 applied, it seems that dbus-broker-launch
still blocks:
~ # ps -p $(pidof dbus-broker-launch) -ocmd,stat,wchan
CMD STAT WCHAN
/usr/bin/dbus-broker-launch Ss wait_for_random_bytes
I'll try to get a stack trace.
Unfortunately I don't have all the debugging symbols at hand but maybe this already helps to some degree:
(gdb) bt
#0 0x0000ffff84abac2c in ?? () from target:/lib64/libc.so.6
#1 0x0000ffff84cffabc in ?? () from target:/lib64/libexpat.so.1
#2 0x0000ffff84d01cc0 in XML_Parse () from target:/lib64/libexpat.so.1
#3 0x0000aaaaea51c0c0 in config_parser_include ()
#4 0x0000aaaaea51ea4c in config_parser_read ()
#5 0x0000aaaaea513fac in launcher_parse_config ()
#6 0x0000aaaaea5169ec in launcher_run ()
#7 0x0000aaaaea513a94 in main ()
@dvdhrm this time with symbols
(gdb) bt
#0 0x0000ffffb540fc2c in ?? () from target:/lib64/libc.so.6
#1 0x0000ffffb5654abc in generate_hash_secret_salt (parser=0xaaaaf49cd270) at xmlparse.c:908
#2 startParsing (parser=0xaaaaf49cd270) at xmlparse.c:955
#3 0x0000ffffb5656cc0 in XML_Parse (parser=0xaaaaf49cd270, s=0xffffe8b95698 "<!-- This configuration file controls the systemwide message bus.\n Add a system-local.conf and edit that rather than changing this\n file directly. -->\n\n<!-- Note that there are any number of w"..., len=4096, isFinal=0)
at xmlparse.c:1842
#4 0x0000aaaad49570c0 in config_parser_include ()
#5 0x0000aaaad4959a4c in config_parser_read ()
#6 0x0000aaaad494efac in launcher_parse_config ()
#7 0x0000aaaad49519ec in launcher_run ()
#8 0x0000aaaad494ea94 in main ()
Thanks a lot for the report and fast replies! This is merged and I will prepare a release now (trying to get the other outstanding PRs in as well).
On an Arm embedded platform (Rockchip RK3568 based) I came across the situation that starting of the
dbus-broker.service
timed out due to insufficient entropy (/proc/sys/kernel/random/entropy_avail
was around 50 at the time of the service start, and did not fill up until 90s passed). I've been using libexpat 2.5.0, so the most recent version.Reordering the service after
systemd-random-seed.service
helped to delaydbus-broker.service
until enough entropy was available. It seems that is a recommended way of handling services which require randomness, from systemd-random-seed.service documentation:Should there be an
After=systemd-random-seed.service
in thedbus-broker.service
file?However, I've noticed that other services which require/order after
dbus.socket
(e.g. when usingBusName=
) still fail since they do not wait fordbus-broker.service
... Not sure if/how that can be properly fixed, this is probably more a systemd issue.