bus1 / dbus-broker

Linux D-Bus Message Broker
https://github.com/bus1/dbus-broker/wiki
Apache License 2.0
661 stars 78 forks source link

dbus-broker.service timeout when there is not enough entropy #319

Closed agners closed 1 year ago

agners commented 1 year ago

On an Arm embedded platform (Rockchip RK3568 based) I came across the situation that starting of the dbus-broker.service timed out due to insufficient entropy (/proc/sys/kernel/random/entropy_avail was around 50 at the time of the service start, and did not fill up until 90s passed). I've been using libexpat 2.5.0, so the most recent version.

Reordering the service after systemd-random-seed.service helped to delay dbus-broker.service until enough entropy was available. It seems that is a recommended way of handling services which require randomness, from systemd-random-seed.service documentation:

This new seed is retrieved synchronously from the kernel, which means the service will not complete start-up until the random pool is fully initialized. On entropy-starved systems this may take a while. This functionality is intended to be used as synchronization point for ordering services that require an initialized entropy pool to function securely (i.e. services that access /dev/urandom without any further precautions).

Should there be an After=systemd-random-seed.service in the dbus-broker.service file?

However, I've noticed that other services which require/order after dbus.socket (e.g. when using BusName=) still fail since they do not wait for dbus-broker.service... Not sure if/how that can be properly fixed, this is probably more a systemd issue.

dvdhrm commented 1 year ago

Do you know where exactly it timed out? We use AT_RANDOM from auxval exclusively for this reason. This might be in some dependency like sd-bus or libselinux. Maybe we can avoid any getrandom() call.

I doubt that delaying dbus for >90s during boot would be an acceptable experience. So yeah, preferably, we just avoid any of those blocking calls.

dvdhrm commented 1 year ago

From the linked bug-report this seems to be in libexpat. I think we could use XML_SetHashSalt() to provide AT_RANDOM as salt for libexpat, and thus avoid blocking on /dev/urandom reads.

agners commented 1 year ago

From the linked bug-report this seems to be in libexpat. I think we could use XML_SetHashSalt() to provide AT_RANDOM as salt for libexpat, and thus avoid blocking on /dev/urandom reads.

I am not 100% sure about libexpat being the culprit, but it seems very likely.

Thanks for looking into this! I'll test the patch you've created on the platform which is affected, then we know for sure if that was indeed the culprit!

agners commented 1 year ago

Even with #320 applied, it seems that dbus-broker-launch still blocks:

~ # ps -p $(pidof dbus-broker-launch) -ocmd,stat,wchan
CMD                         STAT WCHAN
/usr/bin/dbus-broker-launch Ss   wait_for_random_bytes

I'll try to get a stack trace.

agners commented 1 year ago

Unfortunately I don't have all the debugging symbols at hand but maybe this already helps to some degree:

(gdb) bt
#0  0x0000ffff84abac2c in ?? () from target:/lib64/libc.so.6
#1  0x0000ffff84cffabc in ?? () from target:/lib64/libexpat.so.1
#2  0x0000ffff84d01cc0 in XML_Parse () from target:/lib64/libexpat.so.1
#3  0x0000aaaaea51c0c0 in config_parser_include ()
#4  0x0000aaaaea51ea4c in config_parser_read ()
#5  0x0000aaaaea513fac in launcher_parse_config ()
#6  0x0000aaaaea5169ec in launcher_run ()
#7  0x0000aaaaea513a94 in main ()
agners commented 1 year ago

@dvdhrm this time with symbols

(gdb) bt
#0  0x0000ffffb540fc2c in ?? () from target:/lib64/libc.so.6
#1  0x0000ffffb5654abc in generate_hash_secret_salt (parser=0xaaaaf49cd270) at xmlparse.c:908
#2  startParsing (parser=0xaaaaf49cd270) at xmlparse.c:955
#3  0x0000ffffb5656cc0 in XML_Parse (parser=0xaaaaf49cd270, s=0xffffe8b95698 "<!-- This configuration file controls the systemwide message bus.\n     Add a system-local.conf and edit that rather than changing this\n     file directly. -->\n\n<!-- Note that there are any number of w"..., len=4096, isFinal=0)
    at xmlparse.c:1842
#4  0x0000aaaad49570c0 in config_parser_include ()
#5  0x0000aaaad4959a4c in config_parser_read ()
#6  0x0000aaaad494efac in launcher_parse_config ()
#7  0x0000aaaad49519ec in launcher_run ()
#8  0x0000aaaad494ea94 in main ()
dvdhrm commented 1 year ago

Thanks a lot for the report and fast replies! This is merged and I will prepare a release now (trying to get the other outstanding PRs in as well).