NordicSemiconductor / Kotlin-BLE-Library

BSD 3-Clause "New" or "Revised" License

Rediscovering changed services doesn't work #122

Open marczeugs opened 8 months ago

marczeugs commented 8 months ago

In my current project I am trying to use your Android-nRF-Mesh-Library together with this library to implement BLE Mesh node provisioning in an Android project. While my implementation mostly works, I am facing issues trying to rediscover the services on the Mesh device after provisioning is done and the mesh provisioning service gets replaced with the mesh proxy service.

When I first connect to the BLE device to provision it, I call ClientBleGatt#discoverServices() to populate the services list (otherwise the most recent value in the ClientBleGatt#services state flow contains no services and stays that way). This works fine and I can obtain a reference to the Mesh provisioning service.

After provisioning is complete I now want to use the Mesh proxy service to configure the node. At this point calling ClientBleGatt#discoverServices() again breaks the app as the code gets stuck on this mutex lock:

https://github.com/NordicSemiconductor/Kotlin-BLE-Library/blob/8a7d321d3235a971117bf028c6678c90b83b5679/client/src/main/java/no/nordicsemi/android/kotlin/ble/client/main/callback/ClientBleGatt.kt#L438

Since I couldn't figure out why the mutex stayed locked I tried manually unlocking it using reflection, though in this case the code only progresses a bit further since the onServicesDiscovered callback never gets called.
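
The symptom described above (a second call blocked on a lock, and no callback to release it) can be reproduced with a small stand-alone sketch. This is not the library's code: `FakeClient`, `startDiscovery` and the semaphore are hypothetical stand-ins, and it only assumes that the lock is taken when discovery starts and released when the callback arrives. A bounded `tryAcquire` is used in place of an unconditional lock so that the leaked lock shows up as `false` instead of a hang.

```kotlin
import java.util.concurrent.Semaphore
import java.util.concurrent.TimeUnit

// Hypothetical stand-in for a GATT client; not the library's ClientBleGatt.
class FakeClient {
    // One permit: plays the role of the mutex guarding GATT operations.
    private val operationLock = Semaphore(1)

    // Takes the lock when discovery starts. Returns false if the lock
    // could not be obtained within the timeout.
    fun startDiscovery(timeoutMs: Long): Boolean =
        operationLock.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)

    // Assumed to be the only place where the lock is released.
    fun onServicesDiscovered() {
        operationLock.release()
    }
}

fun main() {
    val client = FakeClient()
    println(client.startDiscovery(100)) // true: first discovery takes the lock
    // The callback never arrives, so the lock is still held here.
    println(client.startDiscovery(100)) // false: an unbounded lock() would hang
    client.onServicesDiscovered()       // callback finally releases the lock
    println(client.startDiscovery(100)) // true: discovery can start again
}
```

If the real cause is similar, forcing the lock open would only move the hang to the wait for the callback, which matches what I saw.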

While the ClientBleGatt#services flow does emit a new list of services after the provisioning is complete, it is identical to the list beforehand and still only contains the provisioning service.

I was able to make the provisioning and configuration work by disconnecting and reconnecting to the device, and only calling ClientBleGatt#discoverServices() on the new ClientBleGatt object after the reconnect. In this case the Mesh provisioning service gets returned correctly and is fully functional. Sadly this reconnection step introduces a new point of failure into the app, as the reconnect is a bit unreliable and likes to fail for no discernible reason. Furthermore the reconnecting step constitutes around 50% of the entire time of the provisioning + configuration process, which is another great reason to get rid of it.

Is it possible to rediscover the services without the code getting stuck? Are there any potential issues in my code that could cause this issue?

philips77 commented 8 months ago

Hi, I will look into this.

Some information for you: we're working on rewriting the Android mesh library in Kotlin. This will break the API, as the new library will be more similar to the iOS version and will have flows, suspend functions, etc. The new library will be based on the Kotlin BLE Library. We should be releasing an alpha version in a few months. It will be possible to export the old configuration to the new library using a JSON file.

Also, I'm working on some refactoring in the Kotlin BLE Library. I want to make it independent of our Common library and rename some classes. I'd also like the :client-android and :client-mock modules to be the ones included in apps, instead of :client, which currently just depends on them both. This will be a 2.0 update. I need a few weeks to complete it.

marczeugs commented 8 months ago

I see, thanks for the response. In that case it will probably make a lot of sense to switch to the Kotlin mesh lib when it releases as the callback based architecture in the current lib doesn't mesh well with the other async/suspend code in the app.

Regardless of the mesh related logic I am still basically just operating a normal BLE device that reads/writes data from/to notifications/characteristics in the mesh callbacks under the hood though, so is it expected behaviour that I am unable to refresh the services without reconnecting? Is there something I missed?

philips77 commented 8 months ago

Dealing with service updates is tricky. In theory, the Service Changed characteristic should indicate any changes, after which the client should invalidate the cached services and perform a new service discovery. Just calling discoverServices() won't do the job, I think, as Android will just return the cached services using onServicesDiscovered(...) without performing a new discovery. There's a hidden API called refresh() in BluetoothGatt, which may be used, accessible in this library using: https://github.com/NordicSemiconductor/Kotlin-BLE-Library/blob/8a7d321d3235a971117bf028c6678c90b83b5679/client-api/src/main/java/no/nordicsemi/android/kotlin/ble/client/api/GattClientAPI.kt#L175-L178

You may give it a try.

You may need to call discoverServices() afterwards. I need to check the actual behavior, which may also differ between Android versions, of course.
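
For reference, a minimal sketch of how a hidden no-argument refresh() method is typically reached through reflection. It is written against `Any` so it runs without the Android SDK; on Android the argument would be an android.bluetooth.BluetoothGatt. `refreshGattCache` and `FakeGatt` are hypothetical names used only for illustration, and hidden API restrictions on a given Android version may still block the call.

```kotlin
// Calls a no-argument refresh() method via reflection and returns its result.
// Returns false if the method is missing, inaccessible, or throws.
fun refreshGattCache(gatt: Any): Boolean =
    try {
        val refresh = gatt.javaClass.getMethod("refresh")
        refresh.invoke(gatt) as? Boolean ?: false
    } catch (e: ReflectiveOperationException) {
        false
    }

// Hypothetical stand-in for BluetoothGatt so the sketch is runnable off-device.
class FakeGatt {
    var cacheCleared = false
        private set

    fun refresh(): Boolean {
        cacheCleared = true
        return true
    }
}

fun main() {
    val gatt = FakeGatt()
    println(refreshGattCache(gatt))                // true
    println(gatt.cacheCleared)                     // true
    println(refreshGattCache("no refresh method")) // false
}
```

A `true` result only means the cache was cleared; a new discoverServices() call is still expected to be needed to repopulate the services.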

marczeugs commented 8 months ago

I saw that call and tried using it, but sadly the issue remains the same. No matter if the cache is invalidated or not, since the mutex is locked the second discoverServices() call gets stuck. Clearing the cache by itself does not cause services to get rediscovered, right? So the services flow doesn't get updated just because one empties the cache.

philips77 commented 8 months ago

Yes, the mutex thing looks like a bug in the lib.