hypfvieh / dbus-java

Improved version of java DBus library provided by freedesktop.org (https://dbus.freedesktop.org/doc/dbus-java/)
https://hypfvieh.github.io/dbus-java/
MIT License
185 stars 73 forks source link

Synchronization issue with AbstractDBusConnection.getHandledSignals #97

Closed lbeuster closed 4 years ago

lbeuster commented 4 years ago

With the current master if have indeterministic synchronization issues resulting in 'org.freedesktop.dbus.errors.NoReply: No reply within specified time'. This issue came with commit cfa0e0f8932dbd15aff760aeed1e244a2ce037b6.

The problem is the synchronization of getHandledSignals in DBusConnection.addSigHandler. While holding the lock on the handlers a remote call (dbus.AddMatch) is executed (in the previous commit this call was outside of the sync-block). At the same time a NameAcquired messages/signal NameAcquired in that also needs the lock on the handlers - but the signal has to wait because it is hold by addSigHandler. Seems like the server NameAcquired-signal and the clients AddMatch request block each other.

Here are some log statements showing the problem

Create Connection
1713 [DBusConnection] : -> handleMessage, acquiring sync, DBusSignal(1,2) { Path=>/org/freedesktop/DBus, Interface=>org.freedesktop.DBus, Member=>NameAcquired, Destination=>:1.1591, Sender=>org.freedesktop.DBus, Signature=>s } [:1.1591]
1714 [main]           : -> addSignalHandler [:1.1591], acquiring sync
1714 [main]           :    addSignalHandler: got sync
... MethodCall.REPLY_WAIT_TIMEOUT=2 sec (but it's the same issue with a higher value)
3721 [main]           : <- ERROR addSigHandler  1584485323721 main [:1.1591] -> released sync
                           Caused by: org.freedesktop.dbus.errors.NoReply: No reply within specified time
                               at org.freedesktop.dbus.RemoteInvocationHandler.executeRemoteMethod(RemoteInvocationHandler.java:160)
                               at org.freedesktop.dbus.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:228)
                               at com.sun.proxy.$Proxy29.AddMatch(Unknown Source)
                               at org.freedesktop.dbus.connections.impl.DBusConnection.addSigHandler(DBusConnection.java:783)
3722 [DBusConnection] :    handleMessage, got sync [:1.1591]
3722 [DBusConnection] : <- handleMessage, released sync [:1.1591]

I have tested inside a Ubuntu 18.04 VM with 4 processors.

hypfvieh commented 4 years ago

I'm pretty sure that the sychronized blocks are not needed because the underlying map is already a ConcurrentHashMap.

I added a new branch which removes the sychronized blocks and will also change the value of the signal lists from List<> to Queue<>, so the value should also be threadsafe like the map.

Please try the changes here: https://github.com/hypfvieh/dbus-java/tree/no-sychronized I'll merge it upstream if we are sure that this will not cause other issues.

lbeuster commented 4 years ago

The map may be thread safe for single method calls, but you still need sync across several calls, e.g. addSigHandler

sync {
    get()
    doSomething
    put/add
}

And beside this the access to the dbusSignalList (ArrayList) is also not synchronized.

lbeuster commented 4 years ago

Maybe Map.computeXXX/putIfAbsent will help.

hypfvieh commented 4 years ago

That is what the old code was doing, synchronize while some work is performed.

I don't think the get() needs to be in the synchronized block, because get() on ConcurrentHashMap will be atomic.

I was already thinking of the computeXX methods but this does not work when throwing exceptions. Every time e.g. addMatch is called a DBusExceptionException maybe thrown. Catching, wrapping and re-throwing is something I dont want to do.

hypfvieh commented 4 years ago

I've pushed some changes to the new branch which will use computeIfAbsent to create a new collection. Calling addMatch is performed outside of the lambda so the exception does not have to be caught and rethrown.

lbeuster commented 4 years ago

That was the point, the addMatch must be outside of any synchronization code ;)

lbeuster commented 4 years ago

Thanks for the quick fix - works for me.