dbus2 / zbus-old

Rust D-Bus crate.
https://gitlab.freedesktop.org/dbus/zbus
Other
49 stars 13 forks source link

Getting a property change sometimes blocks forever #320

Closed zeenix closed 1 year ago

zeenix commented 1 year ago

In GitLab by @alixe1 on Jan 31, 2023, 09:13

Hello, thanks for the wonderful library. I've been having an issue where things freeze up when using zbus. I narrowed this down to PropertyChanged::get calls blocking forever. This only seems to happen sometimes, but is much more common when using a multi-threaded tokio runtime.

This piece of code reproduces the issue consistently for me and will get stuck forever:

use futures_util::StreamExt;
use zbus::{zvariant::OwnedObjectPath, Connection};

#[zbus::dbus_proxy(
    interface = "net.connman.iwd.Station",
    default_service = "net.connman.iwd"
)]
trait Station {
    #[dbus_proxy(property)]
    fn connected_network(&self) -> zbus::Result<OwnedObjectPath>;
}

#[tokio::main]
async fn main() {
    let connection = Connection::system().await.unwrap();

    let station = StationProxy::builder(&connection)
        .path("/net/connman/iwd/0/33")
        .unwrap()
        .build()
        .await
        .unwrap();

    let mut changes = station.receive_connected_network_changed().await;

    while let Some(change) = changes.next().await {
        println!("getting change");
        let _result = change.get().await;
        println!("got change");
    }
}

When I change WiFi networks a couple of times, this gets stuck forever waiting for the change.get() call to resolve. If I change this example to use a single threaded tokio executor it never gets stuck, so I'm guessing this isn't expected behavior.

zeenix commented 1 year ago

Oh my, this is bad. :disappointed:

Before I manage to take a look, could you kindly confirm that you're using the latest release (3.8) of zbus? If so, can you check if downgrading to 3.6 help (I'm thinking some recent refactors might have caused a regression)?

zeenix commented 1 year ago

Oh and if you could provide the output of busctl monitor net.connman.iwd --match "interface='org.freedesktop.DBus.Properties'" when the hang happens, that'd be very helpful. I don't have a connman on my machine so if I know what D-Bus traffic triggers the issue, I can then hopefully reproduce it locally.

zeenix commented 1 year ago

In GitLab by @alixe1 on Jan 31, 2023, 12:50

I've just tried with a few versions (3.6, 3.7 and 3.8). They all have this issue 😞

zeenix commented 1 year ago

In GitLab by @alixe1 on Jan 31, 2023, 13:00

Sure! I ran the example rust code, then ran that command, then changed my wifi network once. This is the output (not sure if this contains a lot of noise, sorry):

zbus.log

zeenix commented 1 year ago

Thanks so much. That's very useful. There is a lot of noise yeah. When the output/log is too big, it's best to save to a file and upload that.

zeenix commented 1 year ago

In GitLab by @alixe1 on Jan 31, 2023, 13:10

Makes sense, I've edited my comment so this thread stays readable 😅

zeenix commented 1 year ago

Oh and you're enabling the tokio feature of zbus, right?

zeenix commented 1 year ago

So based on your code, I created this code:

use futures_util::StreamExt;
use zbus::{zvariant::OwnedObjectPath, ConnectionBuilder};

struct Station;

#[zbus::dbus_interface(name = "net.connman.iwd.Station")]
impl Station {
    #[dbus_interface(property)]
    fn connected_network(&self) -> OwnedObjectPath {
        OwnedObjectPath::try_from("/net/connman/iwd/0/33/Network/0").unwrap()
    }
}

#[zbus::dbus_proxy(
    interface = "net.connman.iwd.Station",
    default_service = "net.connman.iwd"
)]
trait Station {
    #[dbus_proxy(property)]
    fn connected_network(&self) -> zbus::Result<OwnedObjectPath>;
}

#[tokio::main]
async fn main() {
    let connection = ConnectionBuilder::session()
        .unwrap()
        .serve_at("/net/connman/iwd/0/33", Station)
        .unwrap()
        .name("net.connman.iwd")
        .unwrap()
        .build()
        .await
        .unwrap();
    let conn_clone = connection.clone();
    tokio::spawn(async move {
        loop {
            let iface_ref = conn_clone
                .object_server()
                .interface::<_, Station>("/net/connman/iwd/0/33")
                .await
                .unwrap();
            let iface = iface_ref.get().await;
            iface
                .connected_network_changed(iface_ref.signal_context())
                .await
                .unwrap();
            tokio::time::sleep(std::time::Duration::from_millis(100)).await;
        }
    });

    let station = StationProxy::builder(&connection)
        .path("/net/connman/iwd/0/33")
        .unwrap()
        .build()
        .await
        .unwrap();

    let mut changes = station.receive_connected_network_changed().await;

    let mut i = 1;
    while let Some(change) = changes.next().await {
        println!("{i}: getting change");
        let _result = change.get().await;
        println!("{i}: got change");
        i += 1;
    }
}

But it never hangs:

1007: getting change
1007: got change
1008: getting change
1008: got change

Looking at the dbus traffic log you kindly provided show that connman first invalidates the property only to signal the availability of value once again. This is super silly tbh and most likely non-compliant behavior so I'm not surprised our code gets confused. However, we shouldn't be blocking forever at least. That's on us.

I'll try and reproduce the exact behavior of connman soon. If I can reproduce the issue, it'd be much easier to be able to fix it.

zeenix commented 1 year ago

In GitLab by @alixe1 on Feb 1, 2023, 24:55

Oh and you're enabling the tokio feature of zbus, right?

Yes

zeenix commented 1 year ago

Great news! After I added invalidation before sending, I can reproduce this issue very easily now. This makes it much easier to debug and fix..

zeenix commented 1 year ago

@alixe1 !648 fixes this. Please test and confirm.

zeenix commented 1 year ago

In GitLab by @alixe2 on Feb 4, 2023, 03:01

Yes, everything works great now, thank you!

zeenix commented 1 year ago

Cool. Thanks for reporting this and testing the fix. I'll roll out a release soon.

zeenix commented 1 year ago

mentioned in commit ab423a9d07b511f6275650d4450a33e302b79d83