dariuszseweryn / RxAndroidBle

An Android Bluetooth Low Energy (BLE) Library with RxJava3 interface
http://polidea.github.io/RxAndroidBle/
Apache License 2.0
3.43k stars 583 forks source link

RSSI Read operation sometimes times out #729

Closed OliverCulleyDeLange closed 3 years ago

OliverCulleyDeLange commented 3 years ago

Describe the bug Every second, i poll 2 seperate RxBleConnections to capture RSSI values for each. Note, this interval observable is started as soon as the device is connected - so unlikely to be at the same time as the other device.

Observable.interval(1, TimeUnit.SECONDS)
            .flatMapSingle { controller.getRssi() }

/* in the controller, which owns an RxBleConnection */
    fun getRssi(): Single<RssiResult> = connection.readRssi()
            .map<RssiResult> { RssiResult.RssiReading(getSeconds(), it) }
            .onErrorReturn { RssiResult.RssiError(it) }

This mostly works. However sometimes it errors and returns RssiError(t=com.polidea.rxandroidble2.exceptions.BleGattCallbackTimeoutException: GATT exception from MAC address D8:AB:B4:6F:B3:86, with type BleGattOperation{description='READ_RSSI'})

Here's some logs from it working OK for comparison:

12:02:39.819 23935-24018  D/RxBle#ConnectionOperationQueue: RXBLE LOG QUEUED   ReadRssiOperation(23249374)
12:02:39.821 23935-24080  D/RxBle#ConnectionOperationQueue: RXBLE LOG STARTED  ReadRssiOperation(23249374)
12:02:39.822 23935-24080  I/RxBle#ConnectionOperationQueue: RXBLE LOG RUNNING  ReadRssiOperation{MAC='XX:XX:XX:XX:XX:XX'}
12:02:39.824 23935-24073  D/BluetoothGatt: readRssi() - device: C8:AA:95:FA:BE:08
12:02:39.829 23935-24044  I/RxBle#GattCallback: RXBLE LOG MAC='XX:XX:XX:XX:XX:XX'         onReadRemoteRssi(), status=0, value=-55
12:02:39.833 23935-24080  D/RxBle#ConnectionOperationQueue: RXBLE LOG FINISHED ReadRssiOperation(23249374) in 12 ms

12:02:40.027 23935-24082  D/RxBle#ConnectionOperationQueue: RXBLE LOG QUEUED   ReadRssiOperation(92681556)
12:02:40.028 23935-24084  D/RxBle#ConnectionOperationQueue: RXBLE LOG STARTED  ReadRssiOperation(92681556)
12:02:40.028 23935-24084  I/RxBle#ConnectionOperationQueue: RXBLE LOG RUNNING  ReadRssiOperation{MAC='XX:XX:XX:XX:XX:XX'}
12:02:40.029 23935-24073  D/BluetoothGatt: readRssi() - device: EB:E1:02:01:69:E6
12:02:40.031 23935-24044  I/RxBle#GattCallback: RXBLE LOG MAC='XX:XX:XX:XX:XX:XX'         onReadRemoteRssi(), status=0, value=-51
12:02:40.034 23935-24084  D/RxBle#ConnectionOperationQueue: RXBLE LOG FINISHED ReadRssiOperation(92681556) in 6 ms

The rate of error seems to be correlated with the two rssi requests happening at the same time. I modified my implementation to request RSSI for both devices simultaneously, and within 3 - 4 seconds, the BleGattCallbackTimeoutException's would begin, and it would rarely recover.

Here is the more error prone implementation

    Observable.interval(1, TimeUnit.SECONDS)
            .flatMap { 
                val rssiL = rxBleConnection.readRssi().onErrorReturn {
                    Timber.w("Error reading RSSI for Left: $it")
                    0
                }
                val rssiR = rxBleConnection.readRssi().onErrorReturn {
                    Timber.w("Error reading RSSI for Right: $it")
                    0
                }

                Single.merge(rssiL, rssiR).toObservable()
            }
            .doOnNext {
               ... handle rssi values
            }

Here's some logs of when it first starts off and everything is good

12:05:56.382 24851-24956  D/RxBle#ConnectionOperationQueue: RXBLE LOG QUEUED   ReadRssiOperation(167934672)
12:05:56.385 24851-24999  D/RxBle#ConnectionOperationQueue: RXBLE LOG STARTED  ReadRssiOperation(167934672)
12:05:56.386 24851-24956  D/RxBle#ConnectionOperationQueue: RXBLE LOG QUEUED   ReadRssiOperation(22665929)
12:05:56.388 24851-24999  I/RxBle#ConnectionOperationQueue: RXBLE LOG RUNNING  ReadRssiOperation{MAC='XX:XX:XX:XX:XX:XX'}
12:05:56.389 24851-25004  D/RxBle#ConnectionOperationQueue: RXBLE LOG STARTED  ReadRssiOperation(22665929)
12:05:56.391 24851-25004  I/RxBle#ConnectionOperationQueue: RXBLE LOG RUNNING  ReadRssiOperation{MAC='XX:XX:XX:XX:XX:XX'}
12:05:56.391 24851-24978  D/BluetoothGatt: readRssi() - device: C8:AA:95:FA:BE:08
12:05:56.394 24851-24978  D/BluetoothGatt: readRssi() - device: EB:E1:02:01:69:E6
12:05:56.395 24851-24873  I/RxBle#GattCallback: RXBLE LOG MAC='XX:XX:XX:XX:XX:XX'         onReadRemoteRssi(), status=0, value=-59
12:05:56.400 24851-24873  I/RxBle#GattCallback: RXBLE LOG MAC='XX:XX:XX:XX:XX:XX'         onReadRemoteRssi(), status=0, value=-59
12:05:56.404 24851-24999  D/RxBle#ConnectionOperationQueue: RXBLE LOG FINISHED ReadRssiOperation(167934672) in 19 ms
12:05:56.410 24851-25004  D/RxBle#ConnectionOperationQueue: RXBLE LOG FINISHED ReadRssiOperation(22665929) in 21 ms

Then one side starts hanging

12:05:58.382 24851-24956 D/RxBle#ConnectionOperationQueue: RXBLE LOG QUEUED   ReadRssiOperation(104695851)
12:05:58.384 24851-24999 D/RxBle#ConnectionOperationQueue: RXBLE LOG STARTED  ReadRssiOperation(104695851)
12:05:58.386 24851-24956 D/RxBle#ConnectionOperationQueue: RXBLE LOG QUEUED   ReadRssiOperation(66208904)
12:05:58.387 24851-24999 I/RxBle#ConnectionOperationQueue: RXBLE LOG RUNNING  ReadRssiOperation{MAC='XX:XX:XX:XX:XX:XX'}
12:05:58.390 24851-25004 D/RxBle#ConnectionOperationQueue: RXBLE LOG STARTED  ReadRssiOperation(66208904)
12:05:58.393 24851-25004 I/RxBle#ConnectionOperationQueue: RXBLE LOG RUNNING  ReadRssiOperation{MAC='XX:XX:XX:XX:XX:XX'}
12:05:58.395 24851-24978 D/BluetoothGatt: readRssi() - device: C8:AA:95:FA:BE:08
12:05:58.400 24851-24978 D/BluetoothGatt: readRssi() - device: EB:E1:02:01:69:E6
12:05:58.409 24851-24873 I/RxBle#GattCallback: RXBLE LOG MAC='XX:XX:XX:XX:XX:XX'         onReadRemoteRssi(), status=0, value=-58
12:05:58.415 24851-25004 D/RxBle#ConnectionOperationQueue: RXBLE LOG FINISHED ReadRssiOperation(66208904) in 25 ms

Then one side queues the operation but it doesn't run

12:05:59.387 24851-24956 D/RxBle#ConnectionOperationQueue: RXBLE LOG QUEUED   ReadRssiOperation(266351258)
12:05:59.391 24851-24956 D/RxBle#ConnectionOperationQueue: RXBLE LOG QUEUED   ReadRssiOperation(251439051)
12:05:59.396 24851-25004 D/RxBle#ConnectionOperationQueue: RXBLE LOG STARTED  ReadRssiOperation(251439051)
12:05:59.398 24851-25004 I/RxBle#ConnectionOperationQueue: RXBLE LOG RUNNING  ReadRssiOperation{MAC='XX:XX:XX:XX:XX:XX'}
12:05:59.400 24851-24978 D/BluetoothGatt: readRssi() - device: EB:E1:02:01:69:E6
12:05:59.405 24851-24873 I/RxBle#GattCallback: RXBLE LOG MAC='XX:XX:XX:XX:XX:XX'         onReadRemoteRssi(), status=0, value=-59
12:05:59.412 24851-25004 D/RxBle#ConnectionOperationQueue: RXBLE LOG FINISHED ReadRssiOperation(251439051) in 16 ms

Smartphone (please complete the following information):

Additional context This issue also seems to affect writing data to custom Characteristics on the device. For example, elsewhere i might be doing connection.writeCharacteristic(uuid, data) - and this will also hang, and not emit any items.

OliverCulleyDeLange commented 3 years ago

Some extra info for you. I did some CPU profiling to see what gets called from where as i noticed all sorts of threads in action in the logs.

When it works: image When it doesn't: image

dariuszseweryn commented 3 years ago

Hello. It indeed seems that the OS sometimes does not use RSSI callback. This can be mitigated by simply retrying the call as BluetoothGatt does not block execution for this call. What is the behaviour you expect?

OliverCulleyDeLange commented 3 years ago

Since I am polling once per second, I am effectively retrying, but the retries also timeout.

I expect the RSSI read operation to complete successfully, and not timeout. Or at least, in the event of a timeout, the retry should complete successfully. I'm unsure whether the timeout of other writeCharacteristic calls is a cause or effect, but its certainly related.

For example, I have a function to write other commands to the device:

connection.writeCharacteristic(characteristic, data)
                .map<WriteDataResult> { WriteDataResult.Wrote(command) }
                .onErrorReturn { WriteDataResult.WriteError(command, it) }
                .toObservable()
                .startWith(WriteDataResult.Writing(command))

Here's a sample of logs which show the writeCharacteristic observable doesn't emit any items, until the RSSI read operations errors, at which point it emits successfully, and subsequent commands execute in normal time.

2020-11-11T17:50:24.736Z 🟩 I/(2) : EVENT: Identify(side=left)
2020-11-11T17:50:24.738Z 🟦 D/(2) : RESULT: Writing left : IDENTIFY_DEVICE
2020-11-11T17:50:25.649Z 🟩 I/(2) : EVENT: Identify(side=left)
2020-11-11T17:50:25.650Z 🟦 D/(2) : RESULT: Writing left : IDENTIFY_DEVICE
2020-11-11T17:50:27.252Z 🟩 I/(2) : EVENT: Identify(side=left)
2020-11-11T17:50:27.253Z 🟦 D/(2) : RESULT: Writing left : IDENTIFY_DEVICE
2020-11-11T17:50:29.208Z 🟩 I/(2) : EVENT: Identify(side=left)
2020-11-11T17:50:29.210Z 🟦 D/(2) : RESULT: Writing left : IDENTIFY_DEVICE
...
2020-11-11T17:50:45.322Z 🟦 D/(2) : RESULT: Writing left : IDENTIFY_DEVICE
2020-11-11T17:50:51.799Z 🟦 D/(1951) : RESULT: RssiError(side=left, t=com.polidea.rxandroidble2.exceptions.BleGattCallbackTimeoutException: GATT exception from MAC address D1:17:84:1F:61:CD, with type BleGattOperation{description='READ_RSSI'})
2020-11-11T17:50:52.119Z 🟦 D/(1957) : RESULT: Wrote left : IDENTIFY_DEVICE
2020-11-11T17:50:57.702Z 🟩 I/(2) : EVENT: Identify(side=left)
2020-11-11T17:50:57.704Z 🟦 D/(2) : RESULT: Writing left : IDENTIFY_DEVICE
2020-11-11T17:50:57.756Z 🟦 D/(1957) : RESULT: Wrote left : IDENTIFY_DEVICE
2020-11-11T17:50:59.130Z 🟩 I/(2) : EVENT: Identify(side=left)
2020-11-11T17:50:59.132Z 🟦 D/(2) : RESULT: Writing left : IDENTIFY_DEVICE
2020-11-11T17:50:59.315Z 🟦 D/(1957) : RESULT: Wrote left : IDENTIFY_DEVICE

Also, I find it odd that this behavior is exaggerated when two RSSI read operations are triggered at the same time.

dariuszseweryn commented 3 years ago

Android OS does allow only a single BLE command to be executed for a single connection at any given moment. The RSSI read seems to be an exception but for stability it is treated the same as other commands.

If the RSSI is waiting for a response to arrive then all other commands (or subsequent RSSI reads) on the same connection are suspended. This is by design.

The library cannot (if I can tell) make the OS give a callback for all RSSI reads. This looks like an OS bug.

OliverCulleyDeLange commented 3 years ago

So while I'm polling RSSI, I'm transmitting data over a characteristic notification at 20hz, on both connections. Could this impact the RSSI read returning? I couldn't find much information about the priority of commands, but i assume RSSI read is lower priority than characteristic notifications? Or is this not handled by RxAndroidBle? I assumed this wouldn't affect it because the RSSI read seems stable (ish) when requested at different times, even with 20hz of notifications happening simultaneously.

dariuszseweryn commented 3 years ago

I couldn't find much information about the priority of commands, but i assume RSSI read is lower priority than characteristic notifications? Or is this not handled by RxAndroidBle?

You are mixing two things. Notifications per se are not commands — they are incoming callbacks. Reading RSSI is a command that is synchronised with RSSI result callback. If notifications are setup prior to requesting RSSI via a command then notifications should not be impacted in any way unless the OS stops giving callbacks.

OliverCulleyDeLange commented 3 years ago

Great information, thanks. Given this, if a phones bluetooth is 'maxed out' with the incoming notifications, could this cause the time out? I'm trying to understand the relationship of the various bits of data passed over the connection, and how they are prioritized, if at all. I've seen instances of phones with poor bluetooth chips, max at at well below our 20hz transmission rate.

dariuszseweryn commented 3 years ago

I've seen instances of phones with poor bluetooth chips, max at at well below our 20hz transmission rate.

This is something entirely different. Different phones have different BT chips. Some handle several packets per connection interval some only a single one. YMMV and it has nothing to do with the library. You may try always using highest connection priority, that's all.

OliverCulleyDeLange commented 3 years ago

Ok, i gather from that, that the library does no form of prioritization. Looks like i need to dig into the Android BLE docs a bit more then. Thanks for your time and help, I appreciate it!

dariuszseweryn commented 3 years ago

The library prioritises commands — everything that is called on BluetoothGatt object. Usually it is FIFO with an exception of setting notifications if I remember correctly which are prioritised. All callbacks from the system (peripheral) are passes ASAP — no prioritisation here.