weliem / bluez_inc

A C library for Bluez (BLE) that hides all DBus communication. It doesn't get easier than this. This library can also be used in C++.
MIT License

Connect failed (error 36: GDBus.Error:org.bluez.Error.Failed.... #17

krimp closed this issue 7 months ago

krimp commented 10 months ago

In my attempt to use this library to implement a Central for the Nordic Uart GATT on a RPi4B 64bit, Bullseye, Bluez 5.70, I am facing new issues.

I am using the Central-example as the starting point and the following uuid:

#define NUS_CHARACTERISTIC_TX_UUID  "6e400002-b5a3-f393-e0a9-e50e24dcca9e"
#define NUS_CHARACTERISTIC_RX_UUID  "6e400003-b5a3-f393-e0a9-e50e24dcca9e"
#define NORDIC_UART_SERVICE         "6e400001-b5a3-f393-e0a9-e50e24dcca9e"

I would like to connect to a pre-determined BLE device, and the on_scan_result() looks like this:

void on_scan_result(Adapter *adapter, Device *device) {
    char *deviceToString = binc_device_to_string(device);
    log_debug(TAG, deviceToString);
    g_free(deviceToString);
    const char* name    = binc_device_get_name(device);
    const char* address = binc_device_get_address(device);

    if (address != NULL && g_str_has_prefix(address, "E7:FA:F2:24:BF:B6")){
        if (name != NULL && g_str_has_prefix(name, "nRF Uart 0")) {
            binc_device_set_connection_state_change_cb(device, &on_connection_state_changed);
            binc_device_set_services_resolved_cb(device, &on_services_resolved);
            binc_device_set_bonding_state_changed_cb(device, &on_bonding_state_changed);
            binc_device_set_read_char_cb(device, &on_read);
            binc_device_set_write_char_cb(device, &on_write);
            binc_device_set_notify_char_cb(device, &on_notify);
            binc_device_set_notify_state_cb(device, &on_notification_state_changed);
            binc_device_set_read_desc_cb(device, &on_desc_read);
            binc_device_connect(device);
            //binc_adapter_stop_discovery(default_adapter);
        }
    }
}

On the 'nRF Uart 0' device, I can see that it accepts the connection from the RPi Central, but the RPi Central (using this library) reports the error shown below. Sometimes the RPi Central goes into an infinite loop, repeating the output below, and sometimes it halts after the first error.

2023-10-30 07:12:36:429 DEBUG [Device] Connecting to 'nRF Uart 0' (E7:FA:F2:24:BF:B6) (BOND_NONE)
2023-10-30 07:12:36:429 DEBUG [Main] 'nRF Uart 0' (E7:FA:F2:24:BF:B6) state: CONNECTING (2)
2023-10-30 07:12:46:579 DEBUG [Main] 'nRF Uart 0' (E7:FA:F2:24:BF:B6) state: CONNECTED (1)
2023-10-30 07:12:47:106 ERROR [Device] Connect failed (error 36: GDBus.Error:org.bluez.Error.Failed: le-connection-abort-by-local)
2023-10-30 07:12:47:106 DEBUG [Main] (dis)connect failed (error 36: GDBus.Error:org.bluez.Error.Failed: le-connection-abort-by-local)
2023-10-30 07:13:00:387 DEBUG [Main] Device{name='nRF Uart 0', address='E7:FA:F2:24:BF:B6', address_type=random, rssi=-69, uuids=[6e400001-b5a3-f393-e0a9-e50e24dcca9e], manufacturer_data=[], service_data=[], paired=false, txpower=-255 path='/org/bluez/hci0/dev_E7_FA_F2_24_BF_B6' }

Google tells me that others who have experienced this issue (error 36) blame the BlueZ stack. I have BlueZ 5.70, but have tried to downgrade to 5.55 and 5.37 without any luck.

I'm a bit stuck. Any ideas?

weliem commented 10 months ago

I see you are not stopping the discovery. I would recommend doing that before connecting. Some adapters cannot connect properly while a discovery is in progress.

If that doesn't help, it is probably a Bluez issue....
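
The advice above can be sketched as a small guard for the scan callback: stop discovery once, request the connection once, and ignore later advertisements from the same device. This is only a minimal sketch, not code from bluez_inc. `CentralState`, the `ACT_*` flags, `on_target_seen` and `on_connect_failed` are hypothetical names. In a real program the returned flags would be mapped to `binc_adapter_stop_discovery()` and `binc_device_connect()`, both of which appear in the snippet posted earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical action flags; the caller maps them to library calls. */
enum { ACT_NONE = 0, ACT_STOP_DISCOVERY = 1, ACT_CONNECT = 2 };

/* Hypothetical bookkeeping kept by the application, not by the library. */
typedef struct {
    bool discovering;        /* true while a scan is running */
    bool connect_requested;  /* true once a connect has been issued */
} CentralState;

/* Call this when the scan callback sees the target device.
 * Returns a bitmask of ACT_* flags telling the caller what to do. */
int on_target_seen(CentralState *state) {
    assert(state != NULL);
    int actions = ACT_NONE;

    /* Repeated advertisements must not trigger a second connect. */
    if (state->connect_requested) {
        return actions;
    }
    /* Stop the scan first: some adapters cannot connect while scanning. */
    if (state->discovering) {
        actions |= ACT_STOP_DISCOVERY;
        state->discovering = false;
    }
    actions |= ACT_CONNECT;
    state->connect_requested = true;
    return actions;
}

/* Call this when the connect attempt fails, so a later scan result
 * is allowed to try again. */
void on_connect_failed(CentralState *state) {
    assert(state != NULL);
    state->connect_requested = false;
}
```

Inside `on_scan_result()`, the caller would check the returned flags and, assuming the usual bluez_inc setup, call `binc_adapter_stop_discovery()` for `ACT_STOP_DISCOVERY` before `binc_device_connect()` for `ACT_CONNECT`.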

krimp commented 10 months ago

I have tried both with and without stopping the discovery. No difference in behaviour.

Using sudo hcitool lescan I can see that my nRF Uart shows itself with two services:

E7:FA:F2:24:BF:B6 nrf Uart 0
E7:FA:F2:24:BF:B6 (unknown)

I guess that is because it also has the DFU service on board. How are multiple device addresses handled by the library? I can see that in adapter.c, in binc_set_discovery_filter(), DuplicateData is set to true. Is that going to have consequences for devices that have two different services?

    GVariantBuilder *arguments = g_variant_builder_new(G_VARIANT_TYPE_VARDICT);
    g_variant_builder_add(arguments, "{sv}", "Transport", g_variant_new_string("le"));
    g_variant_builder_add(arguments, "{sv}", DEVICE_PROPERTY_RSSI, g_variant_new_int16(rssi_threshold));
    g_variant_builder_add(arguments, "{sv}", "DuplicateData", g_variant_new_boolean(TRUE));
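
If the same address is reported more than once during a scan, the application can collapse the reports itself. The following is a minimal sketch under the assumption that both `hcitool lescan` lines come from the same physical device, which the thread does not confirm. `SeenList`, `SeenDevice`, `MAX_SEEN` and `seen_list_update` are hypothetical names and not part of bluez_inc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_SEEN 16  /* hypothetical upper bound on tracked devices */

typedef struct {
    char address[18];  /* "XX:XX:XX:XX:XX:XX" plus terminator */
    char name[32];     /* empty string until a name has been seen */
} SeenDevice;

typedef struct {
    SeenDevice items[MAX_SEEN];
    size_t count;
} SeenList;

/* Record one scan report. Returns 1 if the address is new,
 * 0 if it was seen before or the list is full. A later report may
 * fill in a name that an earlier report lacked. */
int seen_list_update(SeenList *list, const char *address, const char *name) {
    assert(list != NULL && address != NULL);

    for (size_t i = 0; i < list->count; i++) {
        SeenDevice *d = &list->items[i];
        if (strcmp(d->address, address) == 0) {
            if (name != NULL && d->name[0] == '\0') {
                snprintf(d->name, sizeof d->name, "%s", name);
            }
            return 0;
        }
    }
    if (list->count == MAX_SEEN) {
        return 0;
    }
    SeenDevice *d = &list->items[list->count++];
    snprintf(d->address, sizeof d->address, "%s", address);
    snprintf(d->name, sizeof d->name, "%s", name != NULL ? name : "");
    return 1;
}
```

With a zero-initialized `SeenList`, feeding in the two lines shown above (one with a name, one without) leaves a single entry that keeps the name.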

krimp commented 10 months ago

I tried to remove the DFU-service from the nRF Uart 0-device, and now it connects. Can it be that the library (or BlueZ) does not cope with two different services on the same address?

My hypothesis is that if the RPi tries to connect to the DFU, which is encrypted, the nRF Uart rejects the connection, and hence the error le-connection-abort-by-local?

It is a bit odd, since I assume the library should filter on the service-uuid?

weliem commented 10 months ago

It is totally normal that a peripheral supports multiple services. The devices I use to test all have that. So that's not it.

You can connect so that works. However, it seems that the service discovery is failing. Are you able to discover the services if you use 'bluetoothctl' on the command line? And when using applications like 'nRF Connect' on iOS/Android?

I suspect you get the same disconnect issue when using 'bluetoothctl'....if not, then it might be a library issue.

If you can't stay connected when using 'nRF Connect' on a mobile phone, there may be something wrong with how you defined your services on the peripheral.

krimp commented 10 months ago

A good friend of mine said "There is no such thing as ONE error".

Error 1: The reason for one of the peripheral disconnect cases was a sporadic HW failure on the peripheral. The crystal for the RTC occasionally suffered from a bad connection, which led to timing issues. Due to the timing issues, the peripheral initiated a disconnect, hence the le-connection-abort-by-local.

Error 2: It seems to me that something has changed on the RPi over the last few years with respect to BLE. My SW had been running without issues for ~1.5 years. However, somewhere in the latest Buster upgrades (and in Bullseye) something changed. I'm not skilled enough in the inner workings of BLE/BlueZ/DBus to spot what. Earlier on, I could connect to a peripheral without initiating a scan from the SW. To solve the issues I faced, I had to change this: first try to connect without a scan and, if that fails, do a scan and then connect. It seems to me that buffering has been introduced somewhere along the path on the RPi, where the BLE device info is kept in a buffer for a certain time and then forgotten. It also seems that the need for a scan differs depending on the DBus and BlueZ versions, as well as on whether the library is mostly DBus-dependent or not.
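
The workaround described above (try a direct connect first, fall back to scan-then-connect) can be written as a small state machine. This is a minimal sketch with hypothetical names (`Phase`, `Event`, `next_phase`, `run_events`). It is independent of BlueZ and of any particular library, so the events would have to be fed in from whatever connect and scan callbacks the application uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical phases of the connect procedure. */
typedef enum {
    PHASE_DIRECT_CONNECT,      /* first attempt, no scan */
    PHASE_SCANNING,            /* fallback: scanning for the device */
    PHASE_CONNECT_AFTER_SCAN,  /* second attempt, device was just seen */
    PHASE_CONNECTED,
    PHASE_FAILED
} Phase;

/* Hypothetical events reported by the application's callbacks. */
typedef enum {
    EV_CONNECT_OK,
    EV_CONNECT_FAILED,
    EV_DEVICE_FOUND,
    EV_SCAN_TIMEOUT
} Event;

/* Return the phase that follows `phase` when `event` arrives.
 * Events that do not apply to the current phase are ignored. */
Phase next_phase(Phase phase, Event event) {
    switch (phase) {
    case PHASE_DIRECT_CONNECT:
        if (event == EV_CONNECT_OK) return PHASE_CONNECTED;
        if (event == EV_CONNECT_FAILED) return PHASE_SCANNING;
        break;
    case PHASE_SCANNING:
        if (event == EV_DEVICE_FOUND) return PHASE_CONNECT_AFTER_SCAN;
        if (event == EV_SCAN_TIMEOUT) return PHASE_FAILED;
        break;
    case PHASE_CONNECT_AFTER_SCAN:
        if (event == EV_CONNECT_OK) return PHASE_CONNECTED;
        if (event == EV_CONNECT_FAILED) return PHASE_FAILED;
        break;
    case PHASE_CONNECTED:
    case PHASE_FAILED:
        break;  /* terminal phases */
    }
    return phase;
}

/* Feed a sequence of events through the state machine. */
Phase run_events(Phase start, const Event *events, size_t count) {
    assert(events != NULL || count == 0);
    Phase phase = start;
    for (size_t i = 0; i < count; i++) {
        phase = next_phase(phase, events[i]);
    }
    return phase;
}
```

Starting in `PHASE_DIRECT_CONNECT`, a failed first attempt moves to `PHASE_SCANNING`, and only a scan result for the target device triggers the second connect attempt.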

The "good" thing is that most of my problems were related to my lack of knowledge, and not the libraries used. Now everything seems to work as expected.

weliem commented 10 months ago

Good to hear you got it working and that it is not a library issue. I have been working with Bluez for a while now but I feel the quality of Bluez is a bit inconsistent. Some versions are more stable than others and newer isn't always better....