xrucka / bluetooth-manager-dbus

Bluetooth transport implementation bypassing tinyb and using directly dbus binding instead
Apache License 2.0
1 stars 2 forks source link

Unable to enumerate bluetooth devices #6

Open vkolotov opened 6 years ago

vkolotov commented 6 years ago

This is one is a major. Once it has happened, you have to restart OH to get rid of it.

This happens after I enable "connection control" for a device (particularly to Xiaomi mi flora).

It tries to connect to that device, I can see that it also tries to do authentication (via setting a pin code) and then Bluez reports infamous Software caused connection abort error and then it starts failing with this:

14:07:06.022 [WARN ] [oth.manager.impl.BluetoothManagerImpl] - Device discovery job error
cz.organovabanka.bluetooth.manager.transport.dbus.BluezException: Unable to enumerate bluetooth objects when processing /org/bluez/hci0
    at cz.organovabanka.bluetooth.manager.transport.dbus.BluezAdapter.getDevices(BluezAdapter.java:253) ~[?:?]
    at cz.organovabanka.bluetooth.manager.transport.dbus.BluezFactory.lambda$getDiscoveredDevices$5(BluezFactory.java:301) ~[?:?]
    at java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3566) [?:?]
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580) [?:?]
    at cz.organovabanka.bluetooth.manager.transport.dbus.BluezFactory.getDiscoveredDevices(BluezFactory.java:301) [219:organovabanka.org.eclipse.smarthome.binding.bluetooth.transport.dbus:1.0.0.SNAPSHOT]
    at org.sputnikdev.bluetooth.manager.impl.BluetoothManagerImpl$DeviceDiscoveryJob.discoverDevices(BluetoothManagerImpl.java:690) [214:org.sputnikdev.org.eclipse.smarthome.binding.bluetooth:1.1.5]
    at org.sputnikdev.bluetooth.manager.impl.BluetoothManagerImpl$DeviceDiscoveryJob.run(BluetoothManagerImpl.java:683) [214:org.sputnikdev.org.eclipse.smarthome.binding.bluetooth:1.1.5]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:?]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:?]
    at java.lang.Thread.run(Thread.java:748) [?:?]
Caused by: org.freedesktop.dbus.exceptions.NotConnected: Disconnected
    at sun.reflect.GeneratedConstructorAccessor247.newInstance(Unknown Source) ~[?:?]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:?]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:?]
    at org.freedesktop.dbus.Error.getException(Error.java:143) ~[?:?]
    at org.freedesktop.dbus.Error.throwException(Error.java:175) ~[?:?]
    at org.freedesktop.dbus.RemoteInvocationHandler.executeRemoteMethod(RemoteInvocationHandler.java:168) ~[?:?]
    at org.freedesktop.dbus.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:234) ~[?:?]
    at com.sun.proxy.$Proxy154.GetManagedObjects(Unknown Source) ~[?:?]
    at cz.organovabanka.bluetooth.manager.transport.dbus.BluezAdapter.getDevices(BluezAdapter.java:249) ~[?:?]
    ... 13 more

I'll do some more experiments to get some more details.

vkolotov commented 6 years ago

I also got this in the logs:

14:06:56.935 [WARN ] [.impl.AbstractBluetoothObjectGovernor] - Error occurred while interacting (setAlias) with native object: /B8:27:EB:60:0C:43/4C:65:A8:D0:7A:EE : Disconnected
14:06:57.006 [WARN ] [.impl.AbstractBluetoothObjectGovernor] - Error occurred while interacting (setAlias) with native object: /B8:27:EB:60:0C:43/C0:7E:98:8F:F4:28 : Disconnected
14:06:57.026 [ERROR] [h.manager.transport.dbus.BluezAdapter] - /org/bluez/hci2: Failed to remove device /org/bluez/hci2/dev_4C_65_A8_D0_7A_EE, reason: Disconnected
14:06:57.078 [ERROR] [h.manager.transport.dbus.BluezAdapter] - /org/bluez/hci2: Failed to remove device /org/bluez/hci2/dev_C0_7E_98_8F_F4_28, reason: Disconnected
14:06:57.110 [WARN ] [manager.impl.CompletableFutureService] - Bluetooth error happened while competing a future: /B8:27:EB:60:0C:43/4C:65:A8:D0:7A:EE : Error occurred while interacting (setAlias) with native object: /B8:27:EB:60:0C:43/4C:65:A8:D0:7A:EE : Disconnected
14:06:57.189 [WARN ] [.impl.AbstractBluetoothObjectGovernor] - Error occurred while interacting (setAlias) with native object: /B8:27:EB:60:0C:43/E7:AC:07:C9:F7:3F : Disconnected

we probably can ignore these ones as I believe they have the same root cause. Normally Bluez allows you to set bt alias even if device is disconnected.

vkolotov commented 6 years ago

I'll dig further and let you know how it goes. It might be that BM needs some changes as well.

vkolotov commented 6 years ago

It has been running for 1 day without encountering this issue...

Do I understand correctly that we somehow lose connection to DBus? If so, I wonder if we can recover from this?

vkolotov commented 6 years ago

Got that again after roughly a day of observation:

13:57:48.475 [WARN ] [oth.manager.impl.BluetoothManagerImpl] - Device discovery job error
cz.organovabanka.bluetooth.manager.transport.dbus.BluezException: Unable to enumerate bluetooth objects when processing /org/bluez/hci0
    at cz.organovabanka.bluetooth.manager.transport.dbus.BluezAdapter.getDevices(BluezAdapter.java:253) ~[?:?]
    at cz.organovabanka.bluetooth.manager.transport.dbus.BluezFactory.lambda$getDiscoveredDevices$5(BluezFactory.java:301) ~[?:?]
    at java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3566) [?:?]
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580) [?:?]
    at cz.organovabanka.bluetooth.manager.transport.dbus.BluezFactory.getDiscoveredDevices(BluezFactory.java:301) [219:organovabanka.org.eclipse.smarthome.binding.bluetooth.transport.dbus:1.0.0.SNAPSHOT]
    at org.sputnikdev.bluetooth.manager.impl.BluetoothManagerImpl$DeviceDiscoveryJob.discoverDevices(BluetoothManagerImpl.java:690) [214:org.sputnikdev.org.eclipse.smarthome.binding.bluetooth:1.1.5]
    at org.sputnikdev.bluetooth.manager.impl.BluetoothManagerImpl$DeviceDiscoveryJob.run(BluetoothManagerImpl.java:683) [214:org.sputnikdev.org.eclipse.smarthome.binding.bluetooth:1.1.5]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:?]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:?]
    at java.lang.Thread.run(Thread.java:748) [?:?]
Caused by: org.freedesktop.dbus.exceptions.NotConnected: Not Connected
    at org.freedesktop.dbus.RemoteInvocationHandler.executeRemoteMethod(RemoteInvocationHandler.java:137) ~[?:?]
    at org.freedesktop.dbus.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:234) ~[?:?]
    at com.sun.proxy.$Proxy137.GetManagedObjects(Unknown Source) ~[?:?]
    at cz.organovabanka.bluetooth.manager.transport.dbus.BluezAdapter.getDevices(BluezAdapter.java:249) ~[?:?]
    ... 13 more
13:57:48.743 [WARN ] [.impl.AbstractBluetoothObjectGovernor] - Error occurred while interacting (setAlias) with native object: /00:1A:7D:DA:71:05/E7:AC:07:C9:F7:3F : Not Connected
13:57:48.827 [WARN ] [.impl.AbstractBluetoothObjectGovernor] - Error occurred while interacting (setAlias) with native object: /00:1A:7D:DA:71:04/E7:AC:07:C9:F7:3F : Not Connected

Slightly different message though.

xrucka commented 5 years ago

Ok, I was able to replicate the issue in the end; it should be solved upon migration to dbus-java-3.0.0. However, I got a different issue - what are the error policies upon failing connections? Generaly, following can occur when connect() is invoked: 1) no reply ... - generic failed connect, should I throw in BluetoothInteractionException, or NotReadyYet? 2) allready pending 3) connection caused by software abort (which can mean pretty much anything, but usualy a new attempt to connect is required).

vkolotov commented 5 years ago

Sorry @xrucka , just noticed this. I'll think aand reply in the morning.

xrucka commented 5 years ago

Sorry @xrucka , just noticed this. I'll think aand reply in the morning.

Great, thanks :-)

vkolotov commented 5 years ago

Ok, I was able to replicate the issue in the end; it should be solved upon migration to dbus-java-3.0.0. However, I got a different issue - what are the error policies upon failing connections? Generaly, following can occur when connect() is invoked:

  1. no reply ... - generic failed connect, should I throw in BluetoothInteractionException, or NotReadyYet?
  2. allready pending
  3. connection caused by software abort (which can mean pretty much anything, but usualy a new attempt to connect is required).

Hi @xrucka, error handling is something that potentially needs to be still adjusted/tweaked. Generally speaking, BM does not differentiate errors. It is either good or bad. If any error happens, it tries to recover governors state by resetting governors, and initialising them again. I wanted to keep it this simple. This approach works ok, I did not have any issues so far. However, there is a special error/exception that still needs to be addressed in a special way. This is when something happens that requires resetting not only corresponding governor but also its parent object (e.g. if device can't connect because adapter is acting up and needs to be reset). Have a look here, I've made a skeleton for this but never implemented this: https://github.com/sputnikdev/bluetooth-manager/blob/d239d59a5368d05d06f59dc62faeb3e355716f23/src/main/java/org/sputnikdev/bluetooth/manager/impl/AbstractBluetoothObjectGovernor.java#L189

Having said that, if DBus transport somehow can identify this errors, it should probably throw the Fatal exception. But honestly speaking, I faced that kind of issues only once or two, in other words I could not reliable reproduce this sort of errors hence it is not implemented.

Good finding regarding error #2. In theory it should not happen as BM insures that only 1 thread is handling a governor (hence transport object) when managing connection. However, we still can add some special login in there, e.g. if this type of error occurs, then skip connection attempt. Having said that, this might not be a good idea, because we never know what is the reason of that error, maybe a corresponding device is in fail state and needs to be reset? I'd leave it as simple as possible honestly speaking. If you still want to implement something special, then we should think how we could recover from that state anyway, e.g. calculating number of this error and if it exceeds a limit of connection attempts it resets governor, something like that... But as I say I'd make it simple...

To sum up, BM does not care what errors happen on transport level, it tries to recover from any errors by resetting corresponding governors. If any specific logic needs to be implemented to handle a specific error, this logic should have an exit criteria (e.g. when to stop waiting and initiate reset).

I hope I answered your questions, let me know.

vkolotov commented 5 years ago

Just thinking a bit more about #2 type of error. This reminded me that connection operation is not synchronous in both bluggiga and tinyb transports. Hence... if update rate in BM is set to something low, we could get a situation when a governor could try to issue repetitive connection attempts (if a device is to slow to establish connection). Yeah, this might need some tweaking.

xrucka commented 5 years ago

Hi, mostly you did. Based upon your answer, I'll go for BluetoothInteractionException for #1, (internal counter < (perhaps configurable) threshold) ? return true : BluetoothInteractionException for #2 and BluetoothInteractionException for #3. And if I detect stale transport object, I'll fire FatalBluetoothException. The transport has been written like this allready I believe, but this is something to better have confirmed.

Btw, I'm thinking about adding a full reset option for adapters to the transport; invoking hciconfig hciX reset upon power off. It seems this might help with some strange bugs.