zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.87k stars 6.62k forks source link

bt_le_adv_update_data() assertion fail #38101

Closed nunof closed 3 years ago

nunof commented 3 years ago

Describe the bug On a nRF52832, I get the following, below, crash when executing bt_le_adv_update_data(ad, ARRAY_SIZE(ad), sc, ARRAY_SIZE(sc)); Neither 'ad' os 'sc' have changed, just testing the ability to update. If I don't update advert data, the advertising is fine.

Got the following Kconfig (just BLE releated here), am I missing anything? CONFIG_BT=y CONFIG_BT_DEBUG_LOG=y CONFIG_BT_PERIPHERAL=y CONFIG_BT_DEVICE_NAME="Z"

Logs and console output On a previous version of 2.6 `ASSERTION FAIL [err == 0] @ ZEPHYR_BASE/subsys/bluetooth/host/hci_core.c:307

k_sem_take failed with err -11

[00:00:03.273,376] os: r0/a1: 0x00000003 r1/a2: 0x00000000 r2/a3: 0x20000c78 [00:00:03.273,406] os: r3/a4: 0x00000003 r12/ip: 0x0a31312d r14/lr: 0x000154b1 [00:00:03.273,406] os: xpsr: 0x41000021 [00:00:03.273,406] os: Faulting instruction address (r15/pc): 0x000154bc [00:00:03.273,406] os: >>> ZEPHYR FATAL ERROR 3: Kernel oops on CPU 0 [00:00:03.273,406] os: Fault during interrupt handling

[00:00:03.273,406] os: Current thread: 0x200013e8 (unknown) [00:00:03.538,543] os: Halting system`

On the most recent version of Zephyr (mentioned below) `** Booting Zephyr OS build zephyr-v2.6.0-2723-ge7436ea862b2 ***

Settings starting

Initializing Settings subsystem

settings subsys initialization: fail (err -45)

BLE

[00:00:00.435,180] bt_hci_core: Failed to set device name (-12) [00:00:00.436,553] bt_hci_core: HW Platform: Nordic Semiconductor (0x0002) [00:00:00.436,584] bt_hci_core: HW Variant: nRF52x (0x0002) [00:00:00.436,584] bt_hci_core: Firmware: Standard Bluetooth controller (0x00) Version 2.6 Build 99 [00:00:00.437,042] bt_hci_core: Identity: 04:03:02:01:6E:56 (public) [00:00:00.437,042] bt_hci_core: HCI: version 5.2 (0x0b) revision 0x0000, manufacturer 0x05f1 [00:00:00.437,042] bt_hci_core: LMP: version 5.2 (0x0b) subver 0xffff Bluetooth initialized

Advertising successfully started

bt_le_adv_update timer called 1

ASSERTION FAIL [err == 0] @ WEST_TOPDIR/zephyr/subsys/bluetooth/host/hci_core.c:305

0m 0m`

Environment (please complete the following information):

Additional context After updating Zephyr, the same code now (also) breaks in other part of code - settings related - but still shows issues with bluetooth - I get Settings subsys now cannot init (new beahviour). BT is failling to set device name (new behaviour) and now I got that ASSERTION_FAIL as a result of calling bt_le_adv_update_data(ad, ARRAY_SIZE(ad), sc, ARRAY_SIZE(sc));

Thanks

cvinayak commented 3 years ago

@nunof Please provide the details to reproduce (Seems you deleted the **To Reproduce** section in the bug report template), also the upstream main commit hash used to reproduce

nunof commented 3 years ago

Regarding how to reproduce, here is link bug tarball.

The upstream commit I have is commit e7436ea862b293927b189fb77de1808d45fb9e4b (HEAD, origin/main)

Just to recapture, the issues I have are:

With the source I'm sharing here. I get the following in the console.

` Booting Zephyr OS build zephyr-v2.6.0-2723-ge7436ea862b2

Initializing Settings subsystem

settings subsys initialization: fail (err -45)

[00:00:00.429,962] bt_hci_core: Failed to set device name (-12) [00:00:00.431,304] bt_hci_core: HW Platform: Nordic Semiconductor (0x0002) [00:00:00.431,304] bt_hci_core: HW Variant: nRF52x (0x0002) [00:00:00.431,335] bt_hci_core: Firmware: Standard Bluetooth controller (0x00) Version 2.6 Build 99 [00:00:00.431,793] bt_hci_core: Identity: 04:03:02:01:FB:FA (public) [00:00:00.431,793] bt_hci_core: HCI: version 5.2 (0x0b) revision 0x0000, manufacturer 0x05f1 [00:00:00.431,793] bt_hci_core: LMP: version 5.2 (0x0b) subver 0xffff Bluetooth initialized

Advertising successfully started

bt_le_adv_update timer called 1

ASSERTION FAIL [err == 0] @ WEST_TOPDIR/zephyr/subsys/bluetooth/host/hci_core.c:305

k_sem_take failed with err -11

[00:00:03.432,670m 0m`

Thank you for your support.

cvinayak commented 3 years ago

I am able to reproduce the observed ASSERTION FAIL [err == 0] @ WEST_TOPDIR/zephyr/subsys/bluetooth/host/hci_core.c:305 issue.

A. Erase flash using nrfjprog -e to ensure reset of setting storage, this will solve settings subsys initialization: fail (err -45)

B. Call settings_load() after bt_enable() is success, not before it. Refer to other Bluetooth Samples on how it is used.

C. Observations in the application used:

  1. Enable CONFIG_ASSERT=y to detect misuse of API in ISRs
  2. bt_le_adv_update_data is called from timer ISR context, which is not supported, use delayed work instead
  3. ble_adv_update_timer is declared and then again defined twice in the application, either use K_TIMER_DEFINE to initialize or explicitly use k_timer_init in main()

As it is application issue, setting the issue to low priority.

nunof commented 3 years ago

Thank you for very useful comments.

A. Ops, I was using platformio and switched to pure Zephyr to reproduce the issue. I think this is just some issue with west flash overwriting storage partition because usually I would want that to persist and not to have to be resetted. It's my bad, I'm sure. I'll look into that.

B. Indeed the app code I sent was crafted to reproduce the behavior I was seeing but is not the real app. Still, there was the equivalent change in the real and I didn't know settings_load() has to be executed before bt_enable(). I have some questions here, if you can help:

  1. This is so even if CONFIG_BT_SETTINGS is disabled, which it is in the sample app?
  2. How can I simultaneously use CONFIG_BT_SETTINGS (for BT subsys internal use) and CONFIG_SETTINGS (for my settings db)? Is it possible?

C1. I will, thanks. C2. This is my real issue that triggered it all. I was under wrong impression that it would run in the system workqueue. My bad. And many thanks. C3. This was just a bug in my sample bug app, but useful comment, many thanks.

cvinayak commented 3 years ago

settings_load() has to be executed before bt_enable()

settings_load() has to be executed after bt_enable(). In your sample it is called before bt_enable.

This is my real issue that triggered it all. I was under wrong impression that it would run in the system workqueue.

Closing this issue, please open new issue if you face anything else.