Open sfzhi opened 3 weeks ago
@sfzhi I tried to reproduce with our i2c_eeprom_main example but failed. Could you please provide your sdkconfig?
IIRC, a power-on reset clears all state, so after a power-on reset everything should start from scratch. A panic right after a power-on reset looks very strange to me, even if GDB shows something. So I would like to: 1. see your configuration, and 2. see your reset reason.
ESP-ROM:esp32c3-api1-20210207
Build:Feb 7 2021
rst:0x1 (POWERON),boot:0xc (SPI_FAST_FLASH_BOOT)
SPIWP:0xee
mode:DIO, clock div:1
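For comparing many boot logs, the reset reason can be pulled out of the ROM banner mechanically. The following is only a minimal host-side sketch; `parse_rom_banner` is a hypothetical helper name, and the line format is assumed to match the banner shown above.

```c
#include <stdio.h>

/* Hypothetical helper (not part of ESP-IDF): extracts the numeric rst and
 * boot codes from a ROM banner line such as
 *   "rst:0x1 (POWERON),boot:0xc (SPI_FAST_FLASH_BOOT)"
 * Returns 0 on success, -1 if the line does not match that shape. */
static int parse_rom_banner(const char *line, unsigned *rst, unsigned *boot)
{
    /* %x accepts the "0x" prefix; %*[^)] skips the reason name in parentheses. */
    if (sscanf(line, "rst:%x (%*[^)]),boot:%x", rst, boot) == 2) {
        return 0;
    }
    return -1;
}
```

Feeding it the `rst:` line of each captured log gives a quick way to correlate the reset type with whether the panic occurred.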
Here you go:
ESP-ROM:esp32c3-api1-20210207
Build:Feb 7 2021
rst:0x1 (POWERON),boot:0xe (SPI_FAST_FLASH_BOOT)
SPIWP:0xee
mode:DIO, clock div:1
load:0x3fcd5820,len:0x1880
load:0x403cc710,len:0xd50
load:0x403ce710,len:0x2f50
entry 0x403cc71a
The sdkconfig file: sdkconfig.txt.
I have conducted a few more experiments and the results are even more puzzling than before. When the panic occurs, if I don't wait until the device is restarted automatically (after 3 seconds, as per the configuration), but press the reset button again immediately, the device boots fine and the problem does not occur. I can't think of any reasonable explanation of what's happening.
Even more experiments resulted in the following observations and conclusions:
The problem is somehow tied to a specific layout of the binary image. It is 100% reproducible with some images and not reproducible at all with others. The difference between a "good" and a "bad" image can be as small as one function being two bytes longer/shorter - even for a function that is completely unrelated to I2C and is not even called before the problem occurs.
The problem is somehow tied to the reset type, but not always in the same way. With one image it will occur only after using the reset button, with another it will occur only after a reset via USB, with yet another it will occur only after calling esp_restart(...). Combinations of these are also possible, but very rare.
The problem does not always manifest in exactly the same way. Most of the time, the panic occurs in the exact way as described above. However, in rare cases (meaning with some specific images) it manifests as an I2C transaction failure (without panic).
In one case (again, 100% reproducible with the given image), the panic occurred in the same spot in the code as described above, but on the second 32-bit chunk of data rather than the first one, with the destination pointer being NULL. That is a telling piece of evidence. It means the destination pointer magically turns into NULL between the first and the second 32-bit chunks. That suggests it may be some other (possibly unrelated) piece of code running at the same time (in another FreeRTOS task) that accidentally corrupts the internal state of the I2C driver.
By looking at what else might be running at the same time and moving those things one-by-one to a later time (after the I2C transaction in question) I managed to narrow it down to one piece that looks the most likely culprit:
wifi_init_config_t init_config = WIFI_INIT_CONFIG_DEFAULT();
ESP_ERROR_CHECK(esp_wifi_init(&init_config));
I have also noticed that the time the ESP32 takes to reach a certain point in the startup sequence depends on the reset type. The variation is not large (typically less than 0.1 seconds), but it's clearly measurable. That would explain the mysterious dependency on the reset type. Assuming the problem is indeed caused by WiFi initialization, a slight shift in when exactly the memory corruption occurs could make a big difference w.r.t. how it affects the I2C transaction.
Unfortunately, much of the functionality of esp_wifi_init(...) is hidden inside esp_wifi_init_internal(..), source code for which doesn't seem to be provided as a part of ESP-IDF, so I don't know how I can investigate this issue further.
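One generic way to narrow down this kind of suspected corruption, without access to the closed-source part, is to bracket the state in question with guard words and check them between chunks. The sketch below is a host-side illustration only: `guarded_xfer_t`, `GUARD_WORD` and the helper functions are hypothetical names, and it models application-owned state rather than the actual internals of the I2C driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUARD_WORD 0xA5C3F00Du /* arbitrary pattern, assumed unlikely to occur by accident */

/* Hypothetical stand-in for transfer state holding the read destination. */
typedef struct {
    uint32_t guard_head;
    uint8_t *dest;       /* where the next FIFO chunk should be written */
    size_t   remaining;  /* bytes still expected */
    uint32_t guard_tail;
} guarded_xfer_t;

static void guarded_xfer_init(guarded_xfer_t *x, uint8_t *dest, size_t len)
{
    x->guard_head = GUARD_WORD;
    x->dest = dest;
    x->remaining = len;
    x->guard_tail = GUARD_WORD;
}

/* Returns true while neither guard word has been overwritten and the
 * destination pointer has not been turned into NULL. */
static bool guarded_xfer_intact(const guarded_xfer_t *x)
{
    return x->guard_head == GUARD_WORD &&
           x->guard_tail == GUARD_WORD &&
           x->dest != NULL;
}
```

Calling `guarded_xfer_intact()` before each chunk is copied would turn a silent overwrite into a detectable failure at a known point. The guard words only catch writes that also hit the neighbouring words; a stray write that lands on `dest` alone is caught only by the NULL check.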
Answers checklist.
IDF version.
5.2.1
Espressif SoC revision.
ESP32-C3 revision 0.4
Operating System used.
Linux
How did you build your project?
Command line with idf.py
If you are using Windows, please specify command line type.
None
Development Kit.
Seeed XIAO ESP32C3
Power Supply used.
USB
What is the expected behavior?
It should be possible to read a longer sequence of bytes from an external EEPROM over I2C.
What is the actual behavior?
Steps to reproduce.
i2c_new_master_bus(...) - as usual
i2c_master_bus_add_device(...) - as usual
From my observations, for the problem to occur, these two conditions must be true at the same time:
esp_restart().
If both conditions are true, the problem is 100% reproducible.
Debug Logs.
More Information.
The relevant part of the code disassembled by GDB:
As can be seen from the register dump above, the exact spot where the problem occurs is 0x40382350 <+452>: sb a4,-1(a5), which is the assignment to ptr[i] inside i2c_ll_read_rxfifo(...). Apparently the destination address (stored in a5 and already incremented for the next loop iteration) is NULL.
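To make the faulting store easier to picture, here is a simplified host-side model of a byte-wise FIFO read loop of the kind described above. It is not the ESP-IDF source: `fake_fifo_t`, `fake_fifo_pop` and `read_rxfifo_checked` are hypothetical names, and the NULL check is an addition for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the RX FIFO register: each read pops one byte. */
typedef struct {
    const uint8_t *data;
    size_t pos;
} fake_fifo_t;

static uint8_t fake_fifo_pop(fake_fifo_t *f)
{
    return f->data[f->pos++];
}

/* Copies len bytes from the FIFO into ptr[0..len-1].
 * Returns the number of bytes copied; 0 if ptr is NULL, in which case
 * nothing is popped from the FIFO. */
static size_t read_rxfifo_checked(fake_fifo_t *f, uint8_t *ptr, size_t len)
{
    if (ptr == NULL) {
        return 0; /* without this check, the first ptr[i] store would fault */
    }
    for (size_t i = 0; i < len; i++) {
        ptr[i] = fake_fifo_pop(f); /* the store that sb a4,-1(a5) corresponds to */
    }
    return len;
}
```

A compiler may advance the pointer before the store and then write at offset -1, which would match a5 already pointing one byte past the element being written. A guard like this would only turn the panic into an error return; it would not explain or fix whatever overwrote the pointer.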