espressif / esp-nimble

A fork of NimBLE stack, for use with ESP32 and ESP-IDF
Apache License 2.0
76 stars 49 forks source link

ble_hs_resolv.c: ble_rpa_peer_dev_rec_clear_all() does not seem to remove all bonds #35

Closed DrSegatron closed 1 year ago

DrSegatron commented 2 years ago

Expected result: All bonds in memory and in NVS are removed when calling ble_hs_resolv_list_clear_all();

Actual result: Some devices still in memory. Some devices still in NVS after a reboot, getting repopulated into memory.

Not sure what the function is actually doing and why, but it exhibits some code smells:

  1. static void
  2. ble_rpa_peer_dev_rec_clear_all(void)
  3. {
  4. uint8_t i;
  5. /* As NVS record need to be deleted one by one, we loop through
    • peer_records */
  6. for (i = 0; i < ble_store_num_peer_dev_rec; i++) {
  7. ble_store_num_peer_dev_rec--;
  8. if ((i != ble_store_num_peer_dev_rec) && (ble_store_num_peer_dev_rec != 0)) {
  9. memmove(&peer_dev_rec[i], &peer_dev_rec[i + 1],
  10. (ble_store_num_peer_dev_rec - i + 1) * sizeof(struct ble_hs_dev_records ));
  11. }
  12. ble_store_persist_peer_records();
  13. }
  14. return;
  15. }

I suppose the entire peer_dev_rec[] array is to be iterated, but there is both an incremented index i at line 8 AND shrinking of the array on lines 9 and 12-13. It looks like some of the entries could be skipped if the condition on line 11 is true.

Workaround: I just clear the entire "nimble_bond" NVS namespace with nvs_erase_all(). But why the comment on like 6-7?

veneno-529 commented 1 year ago

Hi, I am facing similar issue. Not all bonds are getting cleared few are getting retained in the memory. Can you paste snippet of your code for clearing nimble_bond using nvs_erase_all()

DrSegatron commented 1 year ago

Hi, I am facing similar issue. Not all bonds are getting cleared few are getting retained in the memory. Can you paste snippet of your code for clearing nimble_bond using nvs_erase_all()

    nvs_handle_t nimble_handle;     //Workaround: ble_hs_resolv_list_clear_all() does not delete all bonds
    nvs_open("nimble_bond", NVS_READWRITE, &nimble_handle);
    nvs_erase_all(nimble_handle);
    nvs_commit(nimble_handle);
    nvs_close(nimble_handle);
rahult-github commented 1 year ago

Recently a commit was made to handle clearing of devices. Is above issue seen even after that change ?

veneno-529 commented 1 year ago

@rahult-github Yes, it is working now to remove all bonds. But another issue has raised because of it. Once all bonds are deleted then again if I try to bond the device it gets bonded. Problem occurs when i disconnect the device and again try to connect it. It doesn't get connected. Log before disconnecting E (402552) MoveUp: ble_rpa_get_num_peer_dev_records() returned 1 DEBUG MARK 657 E (402562) MoveUp: peer_dev_record index 0: rec_used = 1 E (402569) MoveUp: peer_dev_record index 0: rand_addr_type = 0 E (402576) MoveUp: peer_dev_record index 0: pseudo_addr = 68:73:ea:5d:24:eb E (402583) MoveUp: peer_dev_record index 0: rand_addr = 68:73:ea:5d:24:eb E (402591) MoveUp: peer_dev_record index 0: identity_addr = 2c:be:eb:27:93:6f E (402599) MoveUp: peer_dev_record index 0: peer_addr = 2c:be:eb:27:93:6f E (402606) MoveUp: peer_dev_record index 0: irk_present = 1 E (402612) MoveUp: ============================================================================= Log after trying to connect same device after disconnecting I (402628) NimBLE: GAP procedure initiated: advertise; I (402629) NimBLE: disc_mode=2 I (402631) NimBLE: adv_channel_map=0 own_addr_type=3 adv_filter_policy=0 adv_itvl_min=0 adv_itvl_max=0 I (402642) NimBLE: I (406160) GAP: BLE GAP EVENT CONNECT OK! I (406704) NimBLE: connection updated; status=0 I (406705) NimBLE: handle=0 our_ota_addr_type=1 our_ota_addr= I (406705) NimBLE: 79:b3:a5:fa:1f:df I (406709) NimBLE: our_id_addr_type=1 our_id_addr= I (406715) NimBLE: 6b:f9:89:45:12:2d I (406719) NimBLE: peer_ota_addr_type=1 peer_ota_addr= I (406725) NimBLE: 77:2a:b4:f8:b7:0b I (406729) NimBLE: peer_id_addr_type=1 peer_id_addr= I (406735) NimBLE: 77:2a:b4:f8:b7:0b I (406739) NimBLE: conn_itvl=6 conn_latency=0 supervision_timeout=500 encrypted=0 authenticated=0 bonded=0

I (406750) NimBLE:

lld_pdu_get_tx_flush_nb HCI packet count mismatch (1, 2) I (408455) GAP: BLE GAP EVENT DISCONNECT HOST TERMINATED! I (408456) NimBLE: RPA enabled enabled; status=1 I (408459) NimBLE: E (408466) MoveUp: ble_rpa_get_num_peer_dev_records() returned 0

The above issue occurs only if I ever call ble_rap_peer_dev_rec_clear_all() function. If I don't call this function ever during the course of runtime then no connection issue is after post bonding

rahult-github commented 1 year ago

Hi @veneno-529 ,

This looks to be abrubt disconnection :

lld_pdu_get_tx_flush_nb HCI packet count mismatch (1, 2) I (408455) GAP: BLE GAP EVENT DISCONNECT HOST TERMINATED!

May i know the chipset and idf version being used at your end ?

veneno-529 commented 1 year ago

Hi @rahult-github I'm using ESP-IDF v5.0 and chipset is ESP32-WROOM-32E. I'm able to regenerate issue repeatedly. If I erase flash of esp32 and upload software. Then android phone can be bonded, disconnected and reconnected again successfully even after resetting the esp32. If I disconnect the android phone (Android 13) and call function ble_hs_resolv_list_clear_all() all bonds are cleared. Now I connect and bond the same android device, it bonds and connects works fine until I disconnect and reconnect. When I reconnect the above log appears and connection fails.

veneno-529 commented 1 year ago

@rahult-github When I increased the Log level I found out that when connecting again the log says

D (183768) NimBLE: RPA resolvable but RL doesn't exist; remove peer rec at index = 0 D (183778) NimBLE: RPA: removed device at index = 0, no. of peer records = 0

Even though after disconnecting the log said : E (181788) NimBLE_BLE_PRPH: ble_rpa_get_num_peer_dev_records() returned 1 E (181798) NimBLE_BLE_PRPH: peer_dev_record index 0: rec_used = 1 E (181808) NimBLE_BLE_PRPH: peer_dev_record index 0: rand_addr_type = 0 E (181818) NimBLE_BLE_PRPH: peer_dev_record index 0: pseudo_addr = 42:f5:47:a2:ef:05 E (181818) NimBLE_BLE_PRPH: peer_dev_record index 0: rand_addr = 42:f5:47:a2:ef:05 E (181828) NimBLE_BLE_PRPH: peer_dev_record index 0: identity_addr = 2c:be:eb:27:93:6f E (181838) NimBLE_BLE_PRPH: peer_dev_record index 0: peer_addr = 2c:be:eb:27:93:6f E (181848) NimBLE_BLE_PRPH: peer_dev_record index 0: irk_present = 1 E (181858) NimBLE_BLE_PRPH: =============================================================================

I am still trying to dig deep why after disconnecting the records were still stored and when trying to reconnect the records are deleted.

@DrSegatron Unfortunately Your workaround is not working for me. It is not erasing bonds at all.

veneno-529 commented 1 year ago

Hi @rahult-github In the following code for rl list clear all g_ble_hs_resolv_data.rl_cnt is set tto 0. `void ble_hs_resolv_list_clear_all(void) { g_ble_hs_resolv_data.rl_cnt = 0; memset(g_ble_hs_resolv_list, 0, BLE_RESOLV_LIST_SIZE * sizeof(struct ble_hs_resolv_entry));

/* Now delete peer device records as well */
ble_rpa_peer_dev_rec_clear_all();

return;

}`

When the device tries to bond or the first time after RL is cleared it's resolving list count is set to 1 i.e (g_ble_hs_resolve_data.rl_cnt increments) Now when device disconnects and tries to reconnect the ble first checks whether device is on RL is or not using below function

`struct ble_hs_resolv_entry ble_hs_resolv_list_find(uint8_t addr) { int i; struct ble_hs_resolv_entry *rl = &g_ble_hs_resolv_list[1];

for (i = 1; i < g_ble_hs_resolv_data.rl_cnt; ++i) {
    if (memcmp(rl->rl_identity_addr, addr, BLE_DEV_ADDR_LEN) == 0) {
        return rl;
    }

    if (memcmp(rl->rl_pseudo_id, addr, BLE_DEV_ADDR_LEN) == 0) {
        return rl;
    }

    if (memcmp(rl->rl_peer_rpa, addr, BLE_DEV_ADDR_LEN) == 0) {
        return rl;
    }
    ++rl;
}
return NULL;

}`

In the above function if you notice for loop starts with i = 1 and checks condition i < g_ble_hs_resolve_data.rl_cnt and pre increments ++i; Here what is happening is since RL count was set to 0 previously by clear all function and when new device was added Rl count was increased to 1. Now when it enters for loop it fails to check the condition as i < 1 fails. Thus it returns NULL.

This return value is in turn giving the log from function static struct ble_hs_resolv_entry * ble_rpa_find_rl_from_peer_records(uint8_t *peer_addr, uint8_t *peer_addr_type)

Which is why it executes `/* Peer record exists, but RL does not

I guess I have found the cause of the problem. What can be the solution of it? please suggest

rahult-github commented 1 year ago

Hi @veneno-529 ,

does changing

"for (i = 1; i < g_ble_hs_resolv_data.rl_cnt; ++i) { "

to

"for (i = 1; i <= g_ble_hs_resolv_data.rl_cnt; ++i) {"

help ?

veneno-529 commented 1 year ago

Hello @rahult-github No changing it didn't help. I forgot to mention I have even tried making g_ble_hs_resolv_data.rl_cnt = 1; in clear all function. But still it is failing and saying RPA resolvable but RL doesn't exist.

SumeetSingh19 commented 1 year ago

Hi @veneno-529 , I tried to reproduce this issue on v5.0, however in my case the reconnection is happening after clearing the RL and bonding again. There still appears to be an issue though, that after reconnection, the bond is gone. I suspect that this may be because you are only clearing the RL entry and not the complete bond. If you want to clear the bond, I would recommend using ble_gap_unpair(), and pass in the peer address. This will disconnect from the peer, clear the RL entry, the CCCDs and the shared keys. Rebonding should work fine after doing this.

Please try this and share if the issue is resolved on your end.

veneno-529 commented 1 year ago

Hi @veneno-529 , I tried to reproduce this issue on v5.0, however in my case the reconnection is happening after clearing the RL and bonding again. There still appears to be an issue though, that after reconnection, the bond is gone. I suspect that this may be because you are only clearing the RL entry and not the complete bond. If you want to clear the bond, I would recommend using ble_gap_unpair(), and pass in the peer address. This will disconnect from the peer, clear the RL entry, the CCCDs and the shared keys. Rebonding should work fine after doing this.

Please try this and share if the issue is resolved on your end. I have tried it and it works. Thanks you!