zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.1k stars 6.2k forks source link

bluetooth: conn disconnection reporting does not work properly with bt_disable #72721

Closed MarekPieta closed 4 days ago

MarekPieta commented 1 month ago

Describe the bug If bt_disable is called from a cooperative context while BLE connection is maintained, the Bluetooth stack does not report disconnection using application’s connection state callback. A spurious disconnection callback is then observed when Bluetooth is reenabled (on subsequent bt_enable call).

To Reproduce Steps to reproduce the behavior: Use following branch: https://github.com/MarekPieta/zephyr/tree/bt_disable_dc_issue_replicate

  1. Build samples/bluetooth/peripheral_hr from the provided branch (I replicated problem on nrf52840dk, but any other board with Bluetooth should replicate it too). The sample periodically calls bt_enable/bt_disable from main context (Bluetooth is enabled for ~10 seconds, then it gets disabled for ~10 seconds). Main thread was switched to cooperative priority to trigger issue replication.
  2. Establish BLE connection while Bluetooth is enabled.
  3. Wait for Bluetooth disable call.
  4. See that disconnection log is missing.
  5. See disconnection log appearing on Bluetooth reenable.

Expected behavior All of the necessary cleanup needs to be done during bt_disable call. The disconnection also needs to be reported during the bt_disable call (it may be reported a bit later, but it should definitely not be delayed until the subsequent bt_enable call).

Impact The bug may cause problems for application logic for tracking connection state. Incomplete cleanup may also lead to edge cases in the Bluetooth stack.

Logs and console output

Booting Zephyr OS build v3.6.0-3858-ga00aa307845e [00:00:00.465,728] bt_hci_core: HW Platform: Nordic Semiconductor (0x0002) [00:00:00.465,759] bt_hci_core: HW Variant: nRF52x (0x0002) [00:00:00.465,789] bt_hci_core: Firmware: Standard Bluetooth controller (0x00) Version 3.6 Build 99 [00:00:00.466,735] bt_hci_core: Identity: E7:11:3E:DB:B4:A8 (random) [00:00:00.466,766] bt_hci_core: HCI: version 5.4 (0x0d) revision 0x0000, manufacturer 0x05f1 [00:00:00.466,796] bt_hci_core: LMP: version 5.4 (0x0d) subver 0xffff [00:00:00.466,796] main: Bluetooth initialized [00:00:00.466,796] main: Starting Legacy Advertising (connectable and scannable) [00:00:00.467,468] main: Advertising successfully started [00:00:00.467,468] main: Checking LED device... [00:00:00.467,468] main: done. [00:00:00.467,468] main: Configuring GPIO pin... [00:00:00.467,498] main: done. [00:00:00.467,529] main: Start blinking LED... [00:00:01.715,423] main: Connected [00:00:02.467,773] main: Stop blinking LED. [00:00:10.468,505] main: Disable Bluetooth [00:00:20.469,451] main: Enable Bluetooth [00:00:20.469,970] main: Disconnected (reason 0x00) [00:00:20.471,405] bt_hci_core: HW Platform: Nordic Semiconductor (0x0002) [00:00:20.471,435] bt_hci_core: HW Variant: nRF52x (0x0002) [00:00:20.471,466] bt_hci_core: Firmware: Standard Bluetooth controller (0x00) Version 3.6 Build 99 [00:00:20.471,923] bt_hci_core: Identity: E7:11:3E:DB:B4:A8 (random) [00:00:20.471,954] bt_hci_core: HCI: version 5.4 (0x0d) revision 0x0000, manufacturer 0x05f1 [00:00:20.471,984] bt_hci_core: LMP: version 5.4 (0x0d) subver 0xffff [00:00:20.472,015] main: Starting Legacy Advertising (connectable and scannable) [00:00:20.472,778] main: Start blinking LED...

Environment (please complete the following information):

Additional context On bt_disable call, bt_conn_cleanup_all updates state of each Bluetooth connection. It also seems to prepare for conn cleanup atomic_set_bit(conn->flags, BT_CONN_CLEANUP); + k_poll_signal_raise(&conn_change, 0); in bt_conn_set_state.

alwa-nordic commented 1 week ago

@sean-madigan Do you have an idea of what might be causing this issue?

sean-madigan commented 1 week ago

I presume we need to yield somewhere after cleaning up the connections

alwa-nordic commented 1 week ago

Reproduced on bsim: https://github.com/alwa-nordic/zephyr/tree/bsim-disable-disconnect-missing

alwa-nordic commented 4 days ago

This issue is not present in Zephyr main (Zephyr 3.7 rc).

jori-nordic commented 4 days ago

Closing as fixed