bus1 / dbus-broker

Linux D-Bus Message Broker
https://github.com/bus1/dbus-broker/wiki
Apache License 2.0
677 stars 79 forks source link

dbus-broker-launch[9179]: ERROR listener_dispatch @ ../src/bus/listener.c +42: Bad file descriptor #330

Closed jstrausd closed 2 months ago

jstrausd commented 9 months ago

Hello, since a few days my server with AlmaLinux 9 randomly started making problems. After a reboot NetworkManager and systemd-login and d-bus can't start or logged some errors.

Dec 20 11:25:59 host systemd[1]: Starting D-Bus System Message Bus... Dec 20 11:25:59 host systemd[1]: Started D-Bus System Message Bus. Dec 20 11:25:59 host dbus-broker-launch[9179]: ERROR listener_dispatch @ ../src/bus/listener.c +42: Bad file descriptor Dec 20 11:25:59 host dbus-broker-launch[9179]: dispatch_context_dispatch @ ../src/util/dispatch.c +344 Dec 20 11:25:59 host dbus-broker-launch[9179]: broker_run @ ../src/broker/broker.c +225 Dec 20 11:25:59 host dbus-broker-launch[9179]: run @ ../src/broker/main.c +261 Dec 20 11:25:59 host dbus-broker-launch[9179]: main @ ../src/broker/main.c +295 Dec 20 11:25:59 host dbus-broker-lau[9178]: Ready Dec 20 11:25:59 host dbus-broker[9179]: Dispatched 0 messages @ 0(±0)μs / message. Dec 20 11:25:59 host systemd[1]: NetworkManager.service: Unexpected error response from GetNameOwner(): Connection terminated Dec 20 11:25:59 host systemd[1]: polkit.service: Unexpected error response from GetNameOwner(): Connection terminated Dec 20 11:25:59 host systemd[1]: tuned.service: Unexpected error response from GetNameOwner(): Connection terminated Dec 20 11:25:59 host dbus-broker-launch[9178]: Caught SIGCHLD of broker.

Is there any way to fix my problem with dbus/dbus-broker? I tried several things I found online, but nothing seemed to help...

dvdhrm commented 9 months ago

What version of dbus-broker are you running?

jstrausd commented 9 months ago

[root@host ~]# dbus-broker --v dbus-broker 28

[root@host ~]# dbus-broker-launch --version dbus-broker-launch 28

I saw the latest version is 35, how to properly update?

jstrausd commented 9 months ago
image

When trying to build the newest version, 5 tests are failing with the same error as the installed version described above.

dvdhrm commented 9 months ago

Apparently accept4(2) returns EBADF on the D-Bus listener socket. This sounds like either a wrong systemd-unit-setup or a bad kernel update. Does this happen with a clean AlmaLinux9 install?

jstrausd commented 9 months ago

No the system is round about 2 month 24/7 in use. I also had the same problem 2 month ago and then had to reinstall. Now it's time to find the cause. It happend out of nowhere. So do you recommend update kernel?

teg commented 9 months ago

No the system is round about 2 month 24/7 in use. I also had the same problem 2 month ago and then had to reinstall. Now it's time to find the cause. It happend out of nowhere. So do you recommend update kernel?

How long had the old system been running, also around two months?

Do you have any custom config of either systemd or dbus?

jstrausd commented 9 months ago

Yes, it's about the same duration the server lasted. I just use DirectAdmin on it for Webhosting and Docker, with no custom configuration of the system itself.

dvdhrm commented 8 months ago

This is really strange, especially given that you are the only one hitting this so far. I really suspect that some other component is at fault here, but that is always easy to say.

The thing is, the error indicates that the socket that is created by systemd and then passed in as D-Bus listener socket to dbus-broker, somehow is not valid. And this is the early startup code of dbus-broker, so not much can go wrong before this happens. And even then, it sounds highly unlikely that dbus-broker is at fault if a faulty socket is passed in. But cannot be sure.

I would expect this error to go away with a clean install of an up-to-date Fedora/Alma/..., but you are saying it does not, right?

DimeOne commented 8 months ago

I'm having the same issue on Arch linux after upgrade, but only on my Cloud vserver.

$ dbus-broker --v dbus-broker 35

$ systemctl status dbus ● dbus-broker.service - D-Bus System Message Bus Loaded: loaded (/usr/lib/systemd/system/dbus-broker.service; static) Active: active (running) since Thu 2024-01-18 20:59:33 CET; 50s ago TriggeredBy: ● dbus.socket Docs: man:dbus-broker-launch(1) Main PID: 1513 (dbus-broker-lau) Tasks: 1 (limit: 19176) Memory: 996.0K (peak: 3.0M) CPU: 30ms CGroup: /system.slice/dbus-broker.service └─1513 /usr/bin/dbus-broker-launch --scope system --audit

Jan 18 20:59:33 vs0101 systemd[1]: Starting D-Bus System Message Bus... Jan 18 20:59:33 vs0101 systemd[1]: Started D-Bus System Message Bus. Jan 18 20:59:33 vs0101 dbus-broker-launch[1515]: ERROR listener_dispatch @ ../dbus-broker-35/src/bus/listener.c +42: Bad file descriptor Jan 18 20:59:33 vs0101 dbus-broker-launch[1515]: dispatch_context_dispatch @ ../dbus-broker-35/src/util/dispatch.c +344 Jan 18 20:59:33 vs0101 dbus-broker-launch[1515]: broker_run @ ../dbus-broker-35/src/broker/broker.c +229 Jan 18 20:59:33 vs0101 dbus-broker-launch[1515]: run @ ../dbus-broker-35/src/broker/main.c +261 Jan 18 20:59:33 vs0101 dbus-broker-launch[1515]: main @ ../dbus-broker-35/src/broker/main.c +295 Jan 18 20:59:33 vs0101 dbus-broker[1515]: Dispatched 0 messages @ 0(±0)μs / message. Jan 18 20:59:33 vs0101 dbus-broker-launch[1513]: Ready Jan 18 20:59:33 vs0101 dbus-broker-launch[1513]: Caught SIGCHLD of broker.

$ journalctl --since 18:00 | grep Unexpected

Jan 18 20:59:33 vs0101 systemd[1]: systemd-logind.service: Unexpected error response from GetNameOwner(): Connection terminated Jan 18 20:59:33 vs0101 systemd[1]: systemd-networkd.service: Unexpected error response from GetNameOwner(): Connection terminated Jan 18 20:59:33 vs0101 systemd[1]: firewalld.service: Unexpected error response from GetNameOwner(): Connection terminated Jan 18 20:59:33 vs0101 systemd[1]: polkit.service: Unexpected error response from GetNameOwner(): Connection terminated Jan 18 20:59:33 vs0101 systemd[1]: NetworkManager.service: Unexpected error response from GetNameOwner(): Connection terminated Jan 18 20:59:33 vs0101 systemd[1]: systemd-resolved.service: Unexpected error response from GetNameOwner(): Connection terminated Jan 18 20:59:33 vs0101 systemd[1]: systemd-logind.service: Unexpected error response from GetNameOwner(): Connection terminated Jan 18 20:59:33 vs0101 systemd[1]: systemd-networkd.service: Unexpected error response from GetNameOwner(): Connection terminated Jan 18 20:59:33 vs0101 systemd[1]: firewalld.service: Unexpected error response from GetNameOwner(): Connection terminated

$ pacman -Qikk dbus-broker dbus-broker-units

Name : dbus-broker Version : 35-2 Description : Linux D-Bus Message Broker Architecture : x86_64 URL : https://github.com/bus1/dbus-broker/wiki Licenses : Apache-2.0 Groups : None Provides : None Depends On : audit libcap-ng expat systemd-libs libaudit.so=1-64 libcap-ng.so=0-64 libexpat.so=1-64 libsystemd.so=0-64 Optional Deps : None Required By : dbus-broker-units Optional For : None Conflicts With : None Replaces : None Installed Size : 365.64 KiB Packager : Jan Alexander Steffens (heftig) heftig@archlinux.org Build Date : Fri 05 Jan 2024 03:45:05 PM CET Install Date : Tue 16 Jan 2024 07:18:16 PM CET Install Reason : Installed as a dependency for another package Install Script : No Validated By : Signature

dbus-broker: 18 total files, 0 altered files Name : dbus-broker-units Version : 35-2 Description : Linux D-Bus Message Broker - Service units Architecture : x86_64 URL : https://github.com/bus1/dbus-broker/wiki Licenses : Apache-2.0 Groups : None Provides : dbus-units Depends On : dbus-broker Optional Deps : None Required By : systemd Optional For : None Conflicts With : dbus-daemon-units Replaces : None Installed Size : 0.00 B Packager : Jan Alexander Steffens (heftig) heftig@archlinux.org Build Date : Fri 05 Jan 2024 03:45:05 PM CET Install Date : Tue 16 Jan 2024 07:18:16 PM CET Install Reason : Installed as a dependency for another package Install Script : No Validated By : Signature

dbus-broker-units: 7 total files, 0 altered files

dvdhrm commented 8 months ago

@jstrausd @DimeOne I am trying to trace this down. On an affected machine, can you clone the dbus-broker repository and run the test-suite? And, if it shows the mentioned error, can you post the meson error logs? And lastly, if possible, can you run one of the affected tests via strace -f ./<build>/test/dbus/<some-test>?

Something like this:

git clone https://github.com/bus1/dbus-broker.git
cd dbus-broker
make meson-setup
make meson-build
make meson-test

cd ./build/meson
./test/dbus/test-matches
strace -f ./test/dbus/test-matches

(You might have to install strace on your machine first. This should be packaged on almost every distribution.)

rafaelwastaken commented 8 months ago

Also experiencing this on an Arch VM running on Hetzner Cloud. I ran the tests above on the affected VM, here's output from both the test and strace: https://gist.github.com/rafaelwastaken/2582a88888330243a5ea25207ae5fe48

dvdhrm commented 7 months ago

Thanks a lot for submitting the information! Unfortunately, I picked the wrong directory to run the test from (it runs dbus-broker from a relative path). Can you try this instead (I also edited the previous message):

cd ./build/meson
./test/dbus/test-matches
strace -f ./test/dbus/test-matches
rafaelwastaken commented 6 months ago

Sorry for the delayed response. Re-ran the new commands, output here: https://gist.github.com/rafaelwastaken/7d1e097aa56e15a8632c0e05957378a9

dvdhrm commented 6 months ago

Thanks a lot for the trace! I think the relevant piece is this:

[pid 475832] accept4(8, NULL, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK <unfinished ...>
[pid 475832] <... accept4 resumed>)     = -1 EAGAIN (Resource temporarily unavailable)
[pid 475832] getsockopt(-1, SOL_SOCKET, SO_TYPE,  <unfinished ...>
[pid 475832] <... getsockopt resumed>0x7ffda8d59b70, [4]) = -1 EBADF (Bad file descriptor)
[pid 475832] write(2, "ERROR listener_dispatch @ ../../"..., 76 <unfinished ...>
[pid 475832] <... write resumed>)       = 76

If we sanitize this, it becomes:

accept4(8, NULL, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK)     = -1 EAGAIN (Resource temporarily unavailable)
getsockopt(-1, SOL_SOCKET, SO_TYPE, 0x7ffda8d59b70, [4]) = -1 EBADF (Bad file descriptor)
write(2, "ERROR listener_dispatch @ ../../"..., 76)       = 76

I have no clue where this getsockopt() call comes from. I also don't know why it completely confuses the error-handler to pretend it is called from listener_dispatch().

A few followup questions:

1) Is this reproducible when compiling without optimizations? That is, using --buildtype plain (or debug) instead of --buildtype debugoptimized? Is this reproducible when compiling with clang instead of gcc?

2) Do you run any non-standard system? Things like non-glibc, or non-systemd, or non-upstream-kernel?

3) Do you run any non-standard intrusive utilities that use LD_PRELOAD or similar to inject code? Or nss-modules?

dvdhrm commented 6 months ago

There is a lot of suspicious activity in that strace. There is access on /tmp/.apid, there is an existing /etc/ld.so.preload, there is a load of /lib/libgcwrap.so.

Can you look into /etc/ld.so.preload and check whether it is correct? Can you check which software installed /lib/libgcwrap.so?

wQuick commented 3 months ago

Please ... I have this error

ERROR listener_dispatch @ ../dbus-broker-36/src/bus/listener.c +42: Bad file descriptor dispatch_context_dispatch @ ../dbus-broker-36/src/util/dispatch.c +344 broker_run @ ../dbus-broker-36/src/broker/broker.c +229 run @ ../dbus-broker-36/src/broker/main.c +261 main @ ../dbus-broker-36/src/broker/main.c +295 test-matches: ../../test/dbus/util-broker.c:757: util_broker_consume_signal: Assertionfalse && "r > 0"' failed. [1] 456404 IOT instruction (core dumped) ./test/dbus/test-matches`

dvdhrm commented 2 months ago

Ok, all traces I got of this problem include very suspicious loads of libraries via /etc/ld.so.preload. This is very likely due to a malware infection. Please check your system! Especially if an strace shows things like /tmp/.apid or loading of suspicious libraries via ld.so.preload, please ensure that your system is not compromised. I also recommend searching the web for similar issues.

I will close this report now, as I see no indication of an issue in dbus-broker. If you see this error, and you are sure your system is not subject to a malware infection, please respond here, or open a separate report. Thanks a lot!