espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0
13.19k stars 7.17k forks source link

i2c Core 0 panic'ed (Load access fault) issue (IDFGH-8722) #10163

Closed AxelLin closed 1 year ago

AxelLin commented 1 year ago

Answers checklist.

General issue report

Module or chip used: ESP32-C3-DevKitM-1 IDF version: v5.1-dev-1751-gc09322430a Build System: idf.py Compiler version: riscv32-esp-elf-gcc (crosstool-NG esp-2022r1-RC1) 11.2.0 Operating System: Linux Power Supply: USB

Hi, Something in i2c was broken in the master tree. I got below panic after pulling the esp-idf update from the master tree. (My application was working before pulling the esp-idf update today).

Guru Meditation Error: Core 0 panic'ed (Load access fault). Exception was unhandled.

Stack dump detected Core 0 register dump: MEPC : 0x42049330 RA : 0x42049326 SP : 0x3fca33a0 GP : 0x3fc95800 0x42049330: i2c_ll_disable_intr_mask at /home/axel/esp/esp-idf/components/hal/esp32c3/include/hal/i2c_ll.h:229 (inlined by) i2c_param_config at /home/axel/esp/esp-idf/components/driver/i2c.c:760

0x42049326: i2c_param_config at /home/axel/esp/esp-idf/components/driver/i2c.c:760

TP : 0x3fc69ad4 T0 : 0x00000000 T1 : 0x4038f5d0 T2 : 0xffffffff 0x4038f5d0: gpspi_flash_ll_set_miso_bitlen at /home/axel/esp/esp-idf/components/hal/esp32c3/include/hal/gpspi_flash_ll.h:259 (inlined by) spi_flash_hal_gpspi_configure_host_io_mode at /home/axel/esp/esp-idf/components/hal/spi_flash_hal_common.inc:142

S0/FP : 0x3fca33d4 S1 : 0x0000000b A0 : 0x00000001 A1 : 0x00000001 A2 : 0x00000007 A3 : 0xffff0000 A4 : 0x00000001 A5 : 0x00000000 A6 : 0xa0000000 A7 : 0x0000000a S2 : 0x00000000 S3 : 0x3fc958e0 S4 : 0x00000000 S5 : 0x3fc958e0 S6 : 0x00000000 S7 : 0x00000000 S8 : 0x00000000 S9 : 0x00000000 S10 : 0x00000000 S11 : 0x00000000 T3 : 0x00000000 T4 : 0x00000002 T5 : 0x00000800 T6 : 0x0000000e MSTATUS : 0x00001881 MTVEC : 0x40380001 MCAUSE : 0x00000005 MTVAL : 0x00000028 0x40380001: _vector_table at ??:?

MHARTID : 0x00000000

Backtrace:

0x42049330 in i2c_ll_disable_intr_mask (mask=65535, hw=0x0) at /home/axel/esp/esp-idf/components/hal/esp32c3/include/hal/i2c_ll.h:229 229 hw->int_ena.val &= (~mask);

0 0x42049330 in i2c_ll_disable_intr_mask (mask=65535, hw=0x0) at /home/axel/esp/esp-idf/components/hal/esp32c3/include/hal/i2c_ll.h:229

1 i2c_param_config (i2c_num=, i2c_conf=i2c_conf@entry=0x3fca33d4) at /home/axel/esp/esp-idf/components/driver/i2c.c:760

2 0x42018af2 in i2c_setup_port (dev=dev@entry=0x3fc99ca8 ) at /home/axel/esp/esp-idf-dev/esp-idf-lib/components/i2cdev/i2cdev.c:208

3 0x42018d7e in i2c_dev_probe (dev=dev@entry=0x3fc99ca8 , operation_type=operation_type@entry=I2C_DEV_WRITE) at /home/axel/esp/esp-idf-dev/esp-idf-lib/components/i2cdev/i2cdev.c:246

AxelLin commented 1 year ago

I bisect to below bad commit which causes the esp32c3 panic:

803fc3fbe0d659aeb1bdc6eea4fb6cfafe72c6e1 is the first bad commit commit 803fc3fbe0d659aeb1bdc6eea4fb6cfafe72c6e1 Author: Cao Sen Miao caosenmiao@espressif.com Date: Wed Nov 2 18:09:22 2022 +0800

I2C: Add i2c support for ESP32C6
AxelLin commented 1 year ago

The problem is if a i2c device is not available, it hit panic in current master tree now. It won't hit panic in previous esp-idf versions, so I can scan the known address to figure out which sensors are available.

mythbuster5 commented 1 year ago

@AxelLin Yes, I added hardware pointer to NULL if I2C controller deleted. So, basically, I think i2c_driver_install and i2c_driver_delete in a pair. For probe example, you can see here https://github.com/espressif/esp-idf/tree/master/examples/peripherals/i2c/i2c_tools . Anyway, please check where you call i2c_driver_delete where unnecessary.

AxelLin commented 1 year ago

@mythbuster5

In previous esp-idf versions, it's fine to init with below calls: i2c_param_config() i2c_driver_install()

But now it needs to call with below order (otherwise hit the panic). i2c_driver_install() i2c_param_config()

Such change breaks existing code (e.g. out-of-tree library: https://github.com/UncleRus/esp-idf-lib/blob/master/components/i2cdev/i2cdev.c#L207-L212 )

BTW, if the order of above calls does matter, please make it clear in your API documentation.

AxelLin commented 1 year ago

@AxelLin Yes, I added hardware pointer to NULL if I2C controller deleted.

@mythbuster5 This indeed causes this regression. Calling i2c_param_config() after i2c_driver_delete() will trigger this issue.

i.e. below call sequence will cause panic if a sensor is not available:

for_each_i2c_device() {
        i2c_param_config()
        i2c_driver_install()
        ...
        i2c_driver_delete()
}

For testing , I comment out i2c_hal_deinit(&i2c_context[i2c_num].hal); in i2c_driver_delete() then I no longer hit this panic.

AxelLin commented 1 year ago

@mythbuster5 Any update for this regression fix?

ryanminnig commented 1 year ago

I also have the same issue. I had though initially that it was my adding in of espcoredump component to capture unrelated panic issues to flash. Because we had used PSRAM in some of our task stacks, we temporarily cut stack size for LVGL and frame buffers for LVGL because the coredump to flash is not available if PSRAM is set to be used with the common heap. I had inadvertently thought that and IDF framework update was necessary (we were on the version from Aug '22) and never thought that I2C drivers would have changed that much. We also used the "Uncle Rus" I2Cdev component from github. At the same time, we had consolidated all of our separate I2C devices from separate tasks (because of the thread safe drivers from Uncle Rus) and consolidated them to one task for the purposes of interrupt/poll read timing.

We are on an ESP32S3 with 16MBflash and 8MB PSRAM.

I can confirm that "comment out i2c_hal_deinit(&i2c_context[i2c_num].hal); in i2c_driver_delete()" works

AxelLin commented 1 year ago

@mythbuster5

There are several issues with commit 803fc3f.

  1. A commit simply says "I2C: Add i2c support for ESP32C6" should not have impact with other SoCs. Apparently this commit is not just adding support for new SoC.
  2. You should not add change for "setting hardware pointer to NULL" without mentioned it in commit log. If it's a necessary change while adding new SoC support, you should mention why it's need. In additional, you should mention that what was fixed if it's a bug fix. (If it's not a bug fix, why do you change it?)
  3. The "setting hardware pointer to NULL" causes regression on esp32c3 (and maybe other SoCs, e.g. esp32s3). You got the issue report, but then there is no further response to fix it. Why? IMHO, Fixing regression should be always a high priority.

@igrr @Alvin1Zhang

mythbuster5 commented 1 year ago

@AxelLin Generally driver_install should be ahead of param_set. Maybe the previous writer or old driver is so confusing that some user use param set before driver install.

I'm sorry for busy on other high priority tasks these days. I will find time to get this one fixed.

AxelLin commented 1 year ago

@mythbuster5

So you even didn't read my comment https://github.com/espressif/esp-idf/issues/10163#issuecomment-1314579110 .

There could be multiple i2c devices on the bus: For each known address , we do probe device as below:

for_each_possible_i2c_device() {
        i2c_param_config()
        i2c_driver_install()
        ...
        i2c_driver_delete()
}

Assume the first device is not on the bus, in the 2nd iteration the i2c_param_config() is called after i2c_driver_delete(). Then it panic!!

AxelLin commented 1 year ago

@AxelLin Generally driver_install should be ahead of param_set. Maybe the previous writer or old driver is so confusing that some user use param set before driver install.

Actually, it's not about confusing at all. It was working in older esp-idf versions, that's why I called it as a regression. And honestly I feel it's strange you need to call driver_install before param_set. You can also check the usage in esp-idf, some of it calls param_set before driver_install.

mythbuster5 commented 1 year ago

@AxelLin Will get it fixed, thanks