Closed MarekPieta closed 4 days ago
@sean-madigan Do you have an idea of what might be causing this issue?
I presume we need to yield somewhere after cleaning up the connections
Reproduced on bsim: https://github.com/alwa-nordic/zephyr/tree/bsim-disable-disconnect-missing
This issue is not present in Zephyr main (Zephyr 3.7 rc).
Closing as fixed
Describe the bug If
bt_disable
is called from a cooperative context while BLE connection is maintained, the Bluetooth stack does not report disconnection using application’s connection state callback. A spurious disconnection callback is then observed when Bluetooth is reenabled (on subsequentbt_enable
call).To Reproduce Steps to reproduce the behavior: Use following branch: https://github.com/MarekPieta/zephyr/tree/bt_disable_dc_issue_replicate
samples/bluetooth/peripheral_hr
from the provided branch (I replicated problem onnrf52840dk
, but any other board with Bluetooth should replicate it too). The sample periodically callsbt_enable
/bt_disable
from main context (Bluetooth is enabled for ~10 seconds, then it gets disabled for ~10 seconds). Main thread was switched to cooperative priority to trigger issue replication.bt_disable
from a preemptive context (setCONFIG_MAIN_THREAD_PRIORITY
to non-negative value to check it locally).Expected behavior All of the necessary cleanup needs to be done during
bt_disable
call. The disconnection also needs to be reported during thebt_disable
call (it may be reported a bit later, but it should definitely not be delayed until the subsequentbt_enable
call).Impact The bug may cause problems for application logic for tracking connection state. Incomplete cleanup may also lead to edge cases in the Bluetooth stack.
Logs and console output
Booting Zephyr OS build v3.6.0-3858-ga00aa307845e [00:00:00.465,728] bt_hci_core: HW Platform: Nordic Semiconductor (0x0002)
[00:00:00.465,759] bt_hci_core: HW Variant: nRF52x (0x0002)
[00:00:00.465,789] bt_hci_core: Firmware: Standard Bluetooth controller (0x00) Version 3.6 Build 99
[00:00:00.466,735] bt_hci_core: Identity: E7:11:3E:DB:B4:A8 (random)
[00:00:00.466,766] bt_hci_core: HCI: version 5.4 (0x0d) revision 0x0000, manufacturer 0x05f1
[00:00:00.466,796] bt_hci_core: LMP: version 5.4 (0x0d) subver 0xffff
[00:00:00.466,796] main: Bluetooth initialized
[00:00:00.466,796] main: Starting Legacy Advertising (connectable and scannable)
[00:00:00.467,468] main: Advertising successfully started
[00:00:00.467,468] main: Checking LED device...
[00:00:00.467,468] main: done.
[00:00:00.467,468] main: Configuring GPIO pin...
[00:00:00.467,498] main: done.
[00:00:00.467,529] main: Start blinking LED...
[00:00:01.715,423] main: Connected
[00:00:02.467,773] main: Stop blinking LED.
[00:00:10.468,505] main: Disable Bluetooth
[00:00:20.469,451] main: Enable Bluetooth
[00:00:20.469,970] main: Disconnected (reason 0x00)
[00:00:20.471,405] bt_hci_core: HW Platform: Nordic Semiconductor (0x0002)
[00:00:20.471,435] bt_hci_core: HW Variant: nRF52x (0x0002)
[00:00:20.471,466] bt_hci_core: Firmware: Standard Bluetooth controller (0x00) Version 3.6 Build 99
[00:00:20.471,923] bt_hci_core: Identity: E7:11:3E:DB:B4:A8 (random)
[00:00:20.471,954] bt_hci_core: HCI: version 5.4 (0x0d) revision 0x0000, manufacturer 0x05f1
[00:00:20.471,984] bt_hci_core: LMP: version 5.4 (0x0d) subver 0xffff
[00:00:20.472,015] main: Starting Legacy Advertising (connectable and scannable)
[00:00:20.472,778] main: Start blinking LED...
Environment (please complete the following information):
84b8e92445f1a613286b476ca113bf99e49d65a7
(Zephyr main)Additional context On
bt_disable
call,bt_conn_cleanup_all
updates state of each Bluetooth connection. It also seems to prepare for conn cleanupatomic_set_bit(conn->flags, BT_CONN_CLEANUP); + k_poll_signal_raise(&conn_change, 0);
inbt_conn_set_state
.bt_disable
call seems to blocktx_thread
from processing the signal (thread is aborted before it can process the signal). Thebt_conn_process_tx
->conn_cleanup
is not called for a connection, the deferred work responsible for finalizing the cleanup is not even scheduled.bt_disable
call allows Bluetooth contexts to preemptbt_disable
execution and perform needed operations. Because of this, disconnection is properly reported in this case.