espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0
13.1k stars 7.15k forks source link

Problem with I2C and BLE HLI (IDFGH-12577) #13579

Open fralmi1 opened 3 months ago

fralmi1 commented 3 months ago

Answers checklist.

IDF version.

v4.4.1

Espressif SoC revision.

ESP32-WROVER-E

Operating System used.

Windows

How did you build your project?

VS Code IDE

If you are using Windows, please specify command line type.

PowerShell

Development Kit.

ESP32-LyraT

Power Supply used.

External 5V

What is the expected behavior?

I expected to not get any I2C read errors.

What is the actual behavior?

Instead the read I2C values sometimes differ from the expected one. This occurs once every few minutes.

Steps to reproduce.

  1. Run modified Bluetooth throughput_server_demo (not client needed)
  2. After several minutes an I2C read error occurs

Debug Logs.

I (27) boot: ESP-IDF v4.4.1 2nd stage bootloader
I (27) boot: compile time 13:17:57
I (27) boot: chip revision: 3
I (30) boot_comm: chip revision: 3, min. bootloader chip revision: 0
I (37) boot.esp32: SPI Speed      : 40MHz
I (42) boot.esp32: SPI Mode       : DIO
I (46) boot.esp32: SPI Flash Size : 2MB
I (51) boot: Enabling RNG early entropy source...
I (56) boot: Partition Table:
I (60) boot: ## Label            Usage          Type ST Offset   Length
I (67) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (75) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (82) boot:  2 factory          factory app      00 00 00010000 00100000
I (90) boot: End of partition table
I (94) boot_comm: chip revision: 3, min. application chip revision: 0
I (101) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=1845ch ( 99420) map
I (145) esp_image: segment 1: paddr=00028484 vaddr=3ffbdb60 size=04750h ( 18256) load
I (153) esp_image: segment 2: paddr=0002cbdc vaddr=40080000 size=0343ch ( 13372) load
I (159) esp_image: segment 3: paddr=00030020 vaddr=400d0020 size=5e2d4h (385748) map
I (299) esp_image: segment 4: paddr=0008e2fc vaddr=4008343c size=14d14h ( 85268) load
I (335) esp_image: segment 5: paddr=000a3018 vaddr=50000000 size=00010h (    16) load
I (347) boot: Loaded app from partition at offset 0x10000
I (347) boot: Disabling RNG early entropy source...
I (358) cpu_start: Pro cpu up.
I (359) cpu_start: Starting app cpu, entry point is 0x400812dc
I (0) cpu_start: App cpu up.
I (373) cpu_start: Pro cpu start user code
I (373) cpu_start: cpu freq: 160000000
I (373) cpu_start: Application information:
I (377) cpu_start: Project name:     throughput_server_demo
I (383) cpu_start: App version:      1
I (388) cpu_start: Compile time:     Apr 10 2024 07:17:47
I (394) cpu_start: ELF file SHA256:  d5e3acbf0064a8ae...
I (400) cpu_start: ESP-IDF:          v4.4.1
I (405) heap_init: Initializing. RAM available for dynamic allocation:
I (412) heap_init: At 3FFAFF10 len 000000F0 (0 KiB): DRAM
I (418) heap_init: At 3FFB6388 len 00001C78 (7 KiB): DRAM
I (424) heap_init: At 3FFB9A20 len 00004108 (16 KiB): DRAM
I (430) heap_init: At 3FFC5F38 len 0001A0C8 (104 KiB): DRAM
I (437) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (443) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (449) heap_init: At 40098150 len 00007EB0 (31 KiB): IRAM
I (457) spi_flash: detected chip: gd
I (460) spi_flash: flash io: dio
W (464) spi_flash: Detected size(8192k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
I (478) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.
I (559) BTDM_INIT: BT controller compile version [5688ed5]
I (561) system_api: Base MAC address is not set
I (561) system_api: read default base MAC address from EFUSE
I (569) phy_init: phy_version 4670,719f9f6,Feb 18 2021,17:07:07
I (966) GATTS_DEMO: REGISTER_APP_EVT, status 0, app_id 0

W (968) BT_BTM: BTM_BleWriteAdvData, Partial data write into ADV
W (972) BT_BTM: BTM_BleWriteScanRsp, Partial data write into ADV
I (980) GATTS_DEMO: CREATE_SERVICE_EVT, status 0,  service_handle 40

I (1001) GATTS_DEMO: SERVICE_START_EVT, status 0, service_handle 40

I (1002) GATTS_DEMO: ADD_CHAR_EVT, status 0,  attr_handle 42, service_handle 40

I (1006) GATTS_DEMO: the gatts demo char length = 3

I (1012) GATTS_DEMO: prf_char[0] =11

I (1016) GATTS_DEMO: prf_char[1] =22

I (1021) GATTS_DEMO: prf_char[2] =33

I (1026) GATTS_DEMO: ADD_DESCR_EVT, status 0, attr_handle 43, service_handle 40

I (10318) I2C_TASK: error occured: 9f
I (117358) I2C_TASK: error occured: 9f
I (131489) I2C_TASK: error occured: 9f
I (303297) I2C_TASK: error occured: 9f
I (403248) I2C_TASK: error occured: 9f
I (427003) I2C_TASK: error occured: 9f
I (571707) I2C_TASK: error occured: 9f
I (584379) I2C_TASK: error occured: 9f
I (738762) I2C_TASK: error occured: 9f
I (762700) I2C_TASK: error occured: 9f
I (840181) I2C_TASK: error occured: 9f
I (1007911) I2C_TASK: error occured: 9f
I (1030511) I2C_TASK: error occured: 9f
I (1112987) I2C_TASK: error occured: 9f
I (1195874) I2C_TASK: error occured: 9f

More Information.

This problem does not occur with IDF v4.3.7 (tested over a long time). I therefore bisected the problem and it lead to one specific commit: d1db2df31670994a5ca60b5d2227f10bb77d9b16 In this commit there were various changes in the bluetooth component including switching to high level interrupts. After disabling CONFIG_BTDM_CTRL_HLI the error does not occur anymore. While this is a temporary fix, we do not know what other side effects this fix has in our firmware. I therefore think this should also be fixed with HLI, since it appears that this is the default setting.

I also found two issues which may be related: https://github.com/espressif/esp-idf/issues/7781 https://github.com/espressif/esp-idf/issues/12494

fralmi1 commented 3 months ago

example_ble_server_throughput.zip

Here are the modified files for the example_ble_server_throughput project.

redfast00 commented 2 months ago

Might also be related to #12860, where we also sometime have I2C data corruption every few minutes (even though the data on the I2C lines measured with a scope is correct). @fralmi1: does your I2C traffic/errors match the pattern described in https://github.com/espressif/esp-idf/issues/12860#issuecomment-1994040455 (so the corrupted read byte is the same as the byte before)?