espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0

bt lib broken in release/v3.3.2 (IDFGH-3770) #5689

Open chegewara opened 3 years ago

chegewara commented 3 years ago

Hi, I'm having a strange crash that occurs while the BluFi app is connecting.

home/osboxes/esp/esp-idf/components/freertos/queue.c:1442 (xQueueGenericReceive)- assert failed!
abort() was called at PC 0x40091060 on core 0
0x40091060: xQueueGenericReceive at /home/osboxes/esp/esp-idf/components/freertos/queue.c:2038

ELF file SHA256: b636eb040990def4

Backtrace: 0x4008eb73:0x3ffdaa10 0x4008eeb8:0x3ffdaa40 0x40091060:0x3ffdaa70 0x40131b72:0x3ffdaab0 0x40146ef2:0x3ffdaae0 0x4013cea5:0x3ffdab10 0x4011a631:0x3ffdab50 0x4011a75a:0x3ffdab80 0x4011807a:0x3ffdabb0 0x40091553:0x3ffdabe0
0x4008eb73: invoke_abort at /home/osboxes/esp/esp-idf/components/esp32/panic.c:716

0x4008eeb8: abort at /home/osboxes/esp/esp-idf/components/esp32/panic.c:716

0x40091060: xQueueGenericReceive at /home/osboxes/esp/esp-idf/components/freertos/queue.c:2038

0x40131b72: osi_mutex_lock at /home/osboxes/esp/esp-idf/components/bt/common/osi/mutex.c:49

0x40146ef2: btc_config_lock at /home/osboxes/esp/esp-idf/components/bt/bluedroid/btc/core/btc_config.c:333

0x4013cea5: btc_storage_remove_bonded_device at /home/osboxes/esp/esp-idf/components/bt/bluedroid/btc/core/btc_storage.c:158

0x4011a631: btc_dm_link_up_evt at /home/osboxes/esp/esp-idf/components/bt/bluedroid/btc/core/btc_dm.c:574

0x4011a75a: btc_dm_sec_cb_handler at /home/osboxes/esp/esp-idf/components/bt/bluedroid/btc/core/btc_dm.c:689

0x4011807a: btc_task at /home/osboxes/esp/esp-idf/components/bt/common/btc/core/btc_task.c:141

0x40091553: vPortTaskWrapper at /home/osboxes/esp/esp-idf/components/freertos/port.c:403

The log shows the crash is in Bluedroid, triggered by the "(xQueueGenericReceive)- assert failed!" message, and I am sure there is enough free heap to create the queue. Free heap after Bluetooth init:

Coex register schm btdm cb faild
Heap summary for capabilities 0x00001000:
  At 0x3ffb2730 len 15448 free 20 allocated 15232 min_free 0
    largest_free_block 20 alloc_blocks 40 free_blocks 1 total_blocks 41
  At 0x3ffaff10 len 240 free 0 allocated 172 min_free 0
    largest_free_block 0 alloc_blocks 9 free_blocks 0 total_blocks 9
  At 0x3ffb6388 len 7288 free 0 allocated 7172 min_free 0
    largest_free_block 0 alloc_blocks 21 free_blocks 0 total_blocks 21
  At 0x3ffb9a20 len 16648 free 0 allocated 16324 min_free 0
    largest_free_block 0 alloc_blocks 73 free_blocks 0 total_blocks 73
  At 0x3ffd21b0 len 56912 free 0 allocated 56548 min_free 0
    largest_free_block 0 alloc_blocks 83 free_blocks 0 total_blocks 83
  At 0x3ffe0440 len 15072 free 6924 allocated 8052 min_free 5804
    largest_free_block 6876 alloc_blocks 13 free_blocks 3 total_blocks 16
  At 0x3ffe4350 len 113840 free 113804 allocated 0 min_free 113804
    largest_free_block 113804 alloc_blocks 0 free_blocks 1 total_blocks 1
  Totals:
    free 120748 allocated 103500 min_free 119608 largest_free_block 113804

ESP-IDF: v3.3.2-323-gbf0220609, commit bf022060964128556b3d3205b65c5d35df9beef6 (HEAD -> release/v3.3, origin/release/v3.3)
gcc version 5.2.0 (crosstool-NG crosstool-ng-1.22.0-80-g6c4433a). After updating the toolchain the problem is still the same.

All submodules are updated.
Build OS: Ubuntu 19.04 and Ubuntu 20.04, built with the old make on both OSs.

My client allows me to share the ELF file by PM. Thanks for the help.

@Campou can you help please

chegewara commented 3 years ago

Ok, we found it. My esp-idf version is older:

PS: this is the bt library commit that works: 63e7a37c6c6c5647ed09ff5196c0b76ebd98de16. 'components/bt/lib' checked out at '1f1002a2c4589d1873fa41c49cb616208082cdb9' is broken.

Alvin1Zhang commented 3 years ago

Thanks for reporting, we will look into it.

WCCWCC commented 3 years ago

Hi @chegewara,

Can you help track the behavior of the variable (static osi_mutex_t lock)?

First, rule out the case of no initialization, that is, that the function btc_config_init is never executed.

chegewara commented 3 years ago

@WCCWCC Sure, I can help. Is there a particular place you want me to check, or should I follow the backtrace and find it?

WCCWCC commented 3 years ago

Hi @chegewara,

This crash is related to this variable; it seems that it was either not initialized or was freed too early.

The purpose is to track how the variable (static osi_mutex_t lock) is used.

You can add the following statements in the file components/bt/bluedroid/btc/core/btc_config.c, one after each corresponding call:

osi_mutex_new(&lock); printf("%s %d\n", __func__, __LINE__);

osi_mutex_free(&lock); printf("%s %d\n", __func__, __LINE__);

osi_mutex_lock(...); printf("%s %d\n", __func__, __LINE__);

osi_mutex_unlock(...); printf("%s %d\n", __func__, __LINE__);
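
The tracing idea above can be tried on a host PC before touching the firmware. The following is only a minimal sketch under stated assumptions: demo_mutex_t and every demo_* function are hypothetical stand-ins, not the real osi_mutex API, and the TRACE() macro simply wraps the printf statement suggested above. It shows how the printed function names reveal whether the "new" call ever ran before the first "lock" call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for osi_mutex_t so the sketch builds on a host PC. */
typedef struct {
    bool created;   /* set by "new", cleared by "free" */
    int  depth;     /* how many times the mutex is currently held */
} demo_mutex_t;

static demo_mutex_t lock;   /* mirrors "static osi_mutex_t lock" */
static int trace_count;     /* number of trace lines printed so far */

/* Print the calling function and line number, then count the trace. */
#define TRACE() do { printf("%s %d\n", __func__, __LINE__); trace_count++; } while (0)

/* Stand-in for the module init: creates the mutex and traces the call. */
bool demo_config_init(void)
{
    lock.created = true;
    lock.depth = 0;
    TRACE();
    return lock.created;
}

/* Stand-in for the module cleanup: frees the mutex and traces the call. */
void demo_config_clean_up(void)
{
    lock.created = false;
    TRACE();
}

/* Stand-in for taking the lock: fails if the mutex was never created. */
bool demo_config_lock(void)
{
    TRACE();
    if (!lock.created) {
        return false;   /* the real code asserts here instead */
    }
    lock.depth++;
    return true;
}

/* Stand-in for releasing the lock. */
void demo_config_unlock(void)
{
    TRACE();
    if (lock.depth > 0) {
        lock.depth--;
    }
}
```

If the trace output shows demo_config_lock before any demo_config_init line, the mutex was used without being created, which is the case this debugging step is meant to rule out.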

slayerjojo commented 3 years ago

make menuconfig --> bluetooth --> bluedroid options --> include ble security module(SMP)

xiewenxiang commented 3 years ago

make menuconfig --> bluetooth --> bluedroid options --> include ble security module(SMP)

Is there any problem with enabling this configuration?

slayerjojo commented 3 years ago

I had the same problem as you, but I fixed it by turning on that SMP configuration option.

ChromaMaster commented 3 years ago

Hey, the same thing happened to me. I've tested what @WCCWCC suggested above, and it turns out that the BTC task tries to lock that config mutex, but the mutex is not initialized if SMP is not enabled. The btc_config_init() function, which initializes that mutex, is never called!
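
To make the reported failure mode concrete, here is a small host-side sketch. Everything in it is an assumption for illustration: the demo_* names are hypothetical and are not ESP-IDF functions, the SMP setting is modeled as a runtime flag rather than a build option, and the guard in demo_link_up() is just one possible defensive pattern, not the fix used in ESP-IDF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical host-side stand-ins; none of these names are ESP-IDF APIs. */
typedef struct { bool created; } demo_mutex_t;

enum { DEMO_OK = 0, DEMO_ERR_NOT_INIT = -1 };

static demo_mutex_t config_lock;   /* plays the role of the config mutex */

/* Plays the role of btc_config_init(): the only place the mutex is created. */
static void demo_config_init(void)
{
    config_lock.created = true;
}

/* Startup path: the config module is initialized only when SMP is enabled,
 * which is the behavior reported in the comment above. */
void demo_stack_init(bool smp_enabled)
{
    config_lock.created = false;
    if (smp_enabled) {
        demo_config_init();
    }
}

/* Link-up path: runs whether or not SMP is enabled and needs the mutex.
 * Assumption: returning an error instead of asserting is one possible guard. */
int demo_link_up(void)
{
    if (!config_lock.created) {
        printf("config mutex not initialized, skipping bonded-device cleanup\n");
        return DEMO_ERR_NOT_INIT;
    }
    /* Locking, updating the stored config, and unlocking would go here. */
    return DEMO_OK;
}
```

Calling demo_stack_init(false) followed by demo_link_up() returns DEMO_ERR_NOT_INIT, which corresponds to the path that hits the assert in the real backtrace. With demo_stack_init(true) the same call returns DEMO_OK, matching the observation that enabling SMP hides the crash.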

I've been trying to find out what introduced this regression, since this was working a couple of months ago (I had been using this version before). I haven't found anything...

phatpaul commented 3 years ago

I can also confirm this after upgrading my IDF from 3.3.2 to 3.3.3. I got a crash with the same output when trying to connect via BLE.

I plan to use BLE security in the future, so I'll just enable the SMP module and the problem is gone. Glad there is an easy workaround!

geza-pycom commented 3 years ago

Did you find any solution to this problem? Using esp-idf 4.1 I also see the "Coex register schm btdm cb faild" message during esp_bt_controller_init() and then when I try to connect to the other device it crashes with: "Guru Meditation Error: Core 0 panic'ed (LoadProhibited). Exception was unhandled."

Backtrace:
0x401b544e: list_node at components/bt/common/osi/list.c:235 (discriminator 1)
0x401cb451: bta_gattc_co_cache_find_src_addr at components/bt/host/bluedroid/bta/gatt/bta_gattc_co.c:629
0x401cb522: cacheOpen at components/bt/host/bluedroid/bta/gatt/bta_gattc_co.c:126
0x401cb59d: bta_gattc_co_cache_open at components/bt/host/bluedroid/bta/gatt/bta_gattc_co.c:239
0x401c7d21: bta_gattc_cache_load at components/bt/host/bluedroid/bta/gatt/bta_gattc_cache.c:2127
0x401c9ab1: bta_gattc_conn at components/bt/host/bluedroid/bta/gatt/bta_gattc_act.c:674
0x401c584e: bta_gattc_sm_execute at components/bt/host/bluedroid/bta/gatt/bta_gattc_main.c:292
0x401c5959: bta_gattc_hdl_event at components/bt/host/bluedroid/bta/gatt/bta_gattc_main.c:404
0x401cc48d: bta_sys_event at components/bt/host/bluedroid/bta/sys/bta_sys_main.c:499
0x401b6b17: osi_thread_run at components/bt/common/osi/thread.c:68

WCCWCC commented 3 years ago

@geza-pycom We can't see the cause directly from the log. Can you provide steps to reproduce it? If convenient, please also provide the code together with the Kconfig file.

jedi7 commented 3 years ago

Hi, the same problem here (IDF v3.3.3). I'm using BLE GATTC only, with ESP_BT_MODE_BLE (so no classic BT). It looks like the function "btc_config_lock" is meant for classic BT, which is not initialized (on purpose). Enabling SMP (without using it) really is a workaround for this issue.

geza-pycom commented 3 years ago

@WCCWCC: the problem I described was "solved" by setting CONFIG_BT_GATTC_CACHE_NVS_FLASH to False in the configuration. However, the "Coex register schm btdm cb faild" message is still printed, and I would like to understand exactly what it means. The SMP module is enabled and used.

xiewenxiang commented 3 years ago

@chegewara @geza-pycom @jedi7

Hi, the IDF version I tested was release/v3.3, commit bf02206096, and the example is blufi.

I changed some of the menuconfig configuration items:

But I could not reproduce the problem.

There must be some difference between my steps and yours. I would like to confirm the following information with you:

jedi7 commented 3 years ago

@xiewenxiang Not sure if I understand. The blufi example uses security, right? I'm using BLE without security (see config). This is my configuration, which does NOT work because of the issue: https://pastebin.com/iiLTcsQg
I will try an example that uses only BLE, to see if the problem can be reproduced.

chegewara commented 3 years ago

I'm no longer working on that project, so I can't confirm whether SMP is the problem.