espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0
13.31k stars 7.2k forks source link

ESP32-C3 nvs_flash_init() hangs after deep sleep when encryption enabled (IDFGH-12847) #13813

Open funkBuild opened 3 months ago

funkBuild commented 3 months ago

Answers checklist.

IDF version.

v5.2.1

Espressif SoC revision.

ESP32-C3

Operating System used.

Linux

How did you build your project?

Command line with Make

If you are using Windows, please specify command line type.

None

Development Kit.

Custom board

Power Supply used.

External 3.3V

What is the expected behavior?

nvs_flash_init() returns in a timely manner

What is the actual behavior?

nvs_flash_init() hangs after the device has gone into deep sleep and woken up many times (around 12, but it doesn't appear repeatable). The RTC watchdog resets the device when this happens (set to 30 second timeout) but the device just immediately hangs again when calling nvs_flash_init(). If I reset the device using a programmer (no power cycle, just toggling the En pin) the device is able to correctly boot again but then repeats the failure after deep sleeping many times.

Steps to reproduce.

  1. Have Flash encryption and Secure boot enabled
  2. Call nvs_flash_init() on boot
  3. Call esp_deep_sleep()

Debug Logs.

ESP-ROM:esp32c3-api1-20210207

Build:Feb  7 2021

rst:0x5 (DSLEEP),boot:0xc (SPI_FAST_FLASH_BOOT)

SPIWP:0xee

mode:DIO, clock div:1

Valid secure boot key blocks: 0

secure boot verification succeeded

load:0x3fcd5988,len:0xc70

load:0x403cc710,len:0x860

load:0x403ce710,len:0x4f08

entry 0x403cc710

I (282) flash_encrypt: Flash encryption mode is RELEASE

I (289) pm: Frequency switching config: CPU_MAX: 80, APB_MAX: 80, APB_MIN: 40, Light sleep: DISABLED

I (289) main_task: Started on CPU0

I (289) main_task: Calling app_main()

I (289) pm: Frequency switching config: CPU_MAX: 80, APB_MAX: 80, APB_MIN: 10, Light sleep: DISABLED

I (299) main: Initializing NVS

I (309) nvs: NVS encr-keys are read from flash.

I (309) nvs: nvs_flash_secure_init_partition nvs

I (319) nvs: nvs_flash_secure_init_partition lock_result=0

I (319) nvs: nvs_flash_secure_init_partition lock acquired

I (429) nvs: nvs_flash_secure_init ret=0

I (429) nvs: NVS partition "nvs" is encrypted.

I (429) main: Initializing NVS ret=0

I (439) main: configuration_init

I (469) main: time_init

I (469) time: Updating time from RTC

I (469) time: The current date/time is: Sat May 18 17:55:00 2024

I (469) main: app_init

I (479) flash_log: Initializing flash log. Item size: 16 bytes

I (479) modbus: Modbus interface mocked for development

I (489) gpio: GPIO[7]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0 

I (499) gpio: GPIO[10]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0 

I (509) gpio: GPIO[4]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0 

I (519) gpio: GPIO[0]| InputEn: 1| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:3 

I (529) gpio: GPIO[1]| InputEn: 1| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:3 

I (529) gpio: GPIO[2]| InputEn: 1| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:3 

I (539) io: Digital 1: 1, Digital 2: 1, Hall: 1

I (549) gpio: GPIO[3]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0 

I (559) app_mgr: Run callback sample

I (559) app: Last meter reading timestamp: 1716018600

I (569) app: Last meter reading event: 0

E (569) app: Timestamp: 1716018900

I (579) events: Emitting event flow_start

I (579) events: Emitting event low_battery_clear

I (589) events: Emitting event modbus_timeout_clear

I (589) events: Emitting event modbus_crc_error_clear

I (599) app: Creating log entry due to flow change

I (609) flash_log: Wrote new log item. head=0x1d20 tail=0x0 timestamp=1716018900 percentage_full=0.07

I (609) app: Pending events: 0

I (619) app: Last meter reading timestamp: 1716018900

I (619) app: Last meter reading event: 0

I (629) app: Sleeping 300 seconds until next reading

I (629) app_mgr: Task sample sleeping for 300 seconds

I (639) app_mgr: Device sleep time 300 seconds

I (639) app_mgr: Wait for wan powerdown

I (649) app_mgr: Going into deep sleep for 300 seconds

ESP-ROM:esp32c3-api1-20210207

Build:Feb  7 2021

rst:0x5 (DSLEEP),boot:0xc (SPI_FAST_FLASH_BOOT)

SPIWP:0xee

mode:DIO, clock div:1

Valid secure boot key blocks: 0

secure boot verification succeeded

load:0x3fcd5988,len:0xc70

load:0x403cc710,len:0x860

load:0x403ce710,len:0x4f08

entry 0x403cc710

I (282) flash_encrypt: Flash encryption mode is RELEASE

I (289) pm: Frequency switching config: CPU_MAX: 80, APB_MAX: 80, APB_MIN: 40, Light sleep: DISABLED

I (289) main_task: Started on CPU0

I (289) main_task: Calling app_main()

I (289) pm: Frequency switching config: CPU_MAX: 80, APB_MAX: 80, APB_MIN: 10, Light sleep: DISABLED

I (299) main: Initializing NVS

I (309) nvs: NVS encr-keys are read from flash.

I (309) nvs: nvs_flash_secure_init_partition nvs

I (319) nvs: nvs_flash_secure_init_partition lock_result=0

I (319) nvs: nvs_flash_secure_init_partition lock acquired

ESP-ROM:esp32c3-api1-20210207

Build:Feb  7 2021

rst:0xc (RTC_SW_CPU_RST),boot:0xc (SPI_FAST_FLASH_BOOT)

Saved PC:0x4038068c

SPIWP:0xee

mode:DIO, clock div:1

Valid secure boot key blocks: 0

secure boot verification succeeded

load:0x3fcd5988,len:0xc70

load:0x403cc710,len:0x860

load:0x403ce710,len:0x4f08

entry 0x403cc710

I (286) flash_encrypt: Flash encryption mode is RELEASE

I (292) pm: Frequency switching config: CPU_MAX: 80, APB_MAX: 80, APB_MIN: 40, Light sleep: DISABLED

I (293) main_task: Started on CPU0

I (293) main_task: Calling app_main()

I (293) pm: Frequency switching config: CPU_MAX: 80, APB_MAX: 80, APB_MIN: 10, Light sleep: DISABLED

I (303) main: Initializing NVS

I (313) nvs: NVS encr-keys are read from flash.

I (313) nvs: nvs_flash_secure_init_partition nvs

I (323) nvs: nvs_flash_secure_init_partition lock_result=0

I (323) nvs: nvs_flash_secure_init_partition lock acquired

ESP-ROM:esp32c3-api1-20210207

Build:Feb  7 2021

rst:0xc (RTC_SW_CPU_RST),boot:0xc (SPI_FAST_FLASH_BOOT)

Saved PC:0x4038068c

SPIWP:0xee

mode:DIO, clock div:1

Valid secure boot key blocks: 0

secure boot verification succeeded

load:0x3fcd5988,len:0xc70

load:0x403cc710,len:0x860

load:0x403ce710,len:0x4f08

entry 0x403cc710

I (286) flash_encrypt: Flash encryption mode is RELEASE

I (292) pm: Frequency switching config: CPU_MAX: 80, APB_MAX: 80, APB_MIN: 40, Light sleep: DISABLED

I (293) main_task: Started on CPU0

I (293) main_task: Calling app_main()

I (293) pm: Frequency switching config: CPU_MAX: 80, APB_MAX: 80, APB_MIN: 10, Light sleep: DISABLED

I (303) main: Initializing NVS

I (313) nvs: NVS encr-keys are read from flash.

I (313) nvs: nvs_flash_secure_init_partition nvs

I (323) nvs: nvs_flash_secure_init_partition lock_result=0

I (323) nvs: nvs_flash_secure_init_partition lock acquired

More Information.

So far I've ruled out -

The only thing that's made a difference is moving back to v5.1.4. The board has run perfectly overnight where it normally fails within an hour or two. Issue has never occurred on development boards that don't use encryption.

esp-wzh commented 3 months ago

Hi, @funkBuild Do you know what the deepsleep wake-up source was when the problem occurred? If there are multiple wake-up sources, can you add some logs to obtain the wake-up source information before initializing nvs?

One known problem is that the SPI Flash will be powered off during deepsleep. If deepsleep uses an IO wake-up source, and the chip is woken up by IO when it just goes to sleep, the Flash will be powered on immediately. On some Flash chips, the power on timing will be violated, there is a small probability that the Flash chip will enter an error state.

funkBuild commented 3 months ago

Hi, @funkBuild Do you know what the deepsleep wake-up source was when the problem occurred? If there are multiple wake-up sources, can you add some logs to obtain the wake-up source information before initializing nvs?

In this case it's always the deep sleep timer i.e. calling esp_deep_sleep() and waiting for the device to naturally wake up. I do have two GPIO's setup to wake the device but they're pulled high and left disconnected from the buttons that would trigger a wakeup. The code also doesn't allow for short sleep durations (under 5 seconds it just stays awake).

Is there a way to check if the flash power on bug is occurring here? I recall reading there's an efuse for setting the delay. If it was the flash, I would have thought that it wouldn't execute my application at all i.e. it wouldn't be able to execute esp_pm_configure() and the ESP_LOGI statements before calling nvs_flash_init()

esp-wzh commented 3 months ago

Is there a way to check if the flash power on bug is occurring here?

There is no status on the hardware indicating this error, but if you can ensure that each deepsleep time is greater than 100ms, then you can rule out this possibility.

I recall reading there's an efuse for setting the delay.

I guess you are referring to EFUSE_FLASH_TPUW, but it is used to constrain the timing between Flash power-on and SPI CS first time pulled down, and has nothing to do with the suspected problem.

BTW, can you provide a minimal reproducible case?

funkBuild commented 3 months ago

After doing more testing on 5.1.4, we weren't able to see the error causing the nvs_init() lockup (after 1.5 weeks of runtime) but found a new problem with encrypted flash writes. Initialising the NVS keys partition now fails and OTA updates fail with a similar error message. Moving back to 5.1.3 fixes the error.

@esp-wzh Still working on getting reproduction code to you, just need to get the project working correctly and tested first.

I (314) cpu_start: Unicore app
I (314) cpu_start: Pro cpu up.
I (323) cpu_start: Pro cpu start user code
I (323) cpu_start: cpu freq: 80000000 Hz
I (323) cpu_start: Application information:
I (326) cpu_start: Project name:     test_project
I (331) cpu_start: App version:      1.0.0-gmw
I (336) cpu_start: Compile time:     May 30 2024 11:24:00
I (342) cpu_start: ELF file SHA256:  163032c532db0edc...
I (348) cpu_start: ESP-IDF:          v5.1.4-dirty
I (354) cpu_start: Min chip rev:     v0.3
I (358) cpu_start: Max chip rev:     v1.99 
I (363) cpu_start: Chip rev:         v0.3
I (368) heap_init: Initializing. RAM available for dynamic allocation:
I (375) heap_init: At 3FCA08E0 len 0001F720 (125 KiB): DRAM
I (382) heap_init: At 3FCC0000 len 0001C710 (113 KiB): DRAM/RETENTION
I (389) heap_init: At 3FCDC710 len 00002950 (10 KiB): DRAM/RETENTION/STACK
I (396) heap_init: At 500000F0 len 00001EE8 (7 KiB): RTCRAM
I (403) spi_flash: detected chip: winbond
I (407) spi_flash: flash io: qio
W (411) flash_encrypt: Flash encryption mode is DEVELOPMENT (not secure)
W (419) ADC: legacy driver is deprecated, please migrate to `esp_adc/adc_oneshot.h`
I (427) sleep: Configure to isolate all GPIO pins in sleep state
I (434) sleep: Enable automatic switching of GPIO sleep configuration
I (452) app_start: Starting scheduler on CPU0
I (00:00:00.129) main_task: Started on CPU0
I (00:00:00.130) main_task: Calling app_main()
I (00:00:00.133) pm: Frequency switching config: CPU_MAX: 80, APB_MAX: 80, APB_MIN: 40, Light sleep: DISABLED
I (00:00:00.144) nvs: NVS key partition empty, generating keys
Guru Meditation Error: Core  0 panic'ed (Illegal instruction). Exception was unhandled.

Core  0 register dump:
Stack dump detected
MEPC    : 0x42027bbe  RA      : 0x40384160  SP      : 0x3fca34b0  GP      : 0x3fc99800  
0x42027bbe: spi_flash_mmap at /home/matt/esp/esp-idf/components/spi_flash/flash_mmap.c:56
0x40384160: esp_flash_read_encrypted at /home/matt/esp/esp-idf/components/spi_flash/esp_flash_api.c:1152

TP      : 0x3fc23484  T0      : 0x4005890e  T1      : 0x00000005  T2      : 0xffffffff  
0x4005890e: memset in ROM

S0/FP   : 0x00000000  S1      : 0x00000040  A0      : 0x00450000  A1      : 0x00000050  
A2      : 0x00000000  A3      : 0x3fca34c8  A4      : 0x3fca34cc  A5      : 0x00450050  
A6      : 0xbfffffff  A7      : 0x00000000  S2      : 0x00000010  S3      : 0x3fca34f0  
S4      : 0x00000040  S5      : 0x00450040  S6      : 0x3fca3570  S7      : 0x3fc9a914  
S8      : 0x3fc9b000  S9      : 0x000235f0  S10     : 0x00060000  S11     : 0x0000001f  
T3      : 0xffff0000  T4      : 0x000000ff  T5      : 0xffffff99  T6      : 0x00000000  
MSTATUS : 0x00001881  MTVEC   : 0x40380001  MCAUSE  : 0x00000002  MTVAL   : 0x00000000  
0x40380001: _vector_table at ??:?

MHARTID : 0x00000000  

Backtrace:

spi_flash_mmap (src_addr=src_addr@entry=4521984, size=size@entry=80, memory=memory@entry=SPI_FLASH_MMAP_DATA, out_ptr=out_ptr@entry=0x3fca34c8, out_handle=out_handle@entry=0x3fca34cc) at /home/matt/esp/esp-idf/components/spi_flash/flash_mmap.c:59
59      void *ptr = NULL;
#0  spi_flash_mmap (src_addr=src_addr@entry=4521984, size=size@entry=80, memory=memory@entry=SPI_FLASH_MMAP_DATA, out_ptr=out_ptr@entry=0x3fca34c8, out_handle=out_handle@entry=0x3fca34cc) at /home/matt/esp/esp-idf/components/spi_flash/flash_mmap.c:59
#1  0x40384160 in esp_flash_read_encrypted (chip=<optimized out>, chip@entry=0x3fc9a914 <default_chip>, address=address@entry=4522048, out_buffer=out_buffer@entry=0x3fca34f0, length=length@entry=16) at /home/matt/esp/esp-idf/components/spi_flash/esp_flash_api.c:1151
#2  0x4038425a in s_verify_write (chip=0x3fc9a914 <default_chip>, verify_address=verify_address@entry=4522048, remain_verify_len=remain_verify_len@entry=16, expected_buf=expected_buf@entry=0x3fca3570, is_encrypted=is_encrypted@entry=true) at /home/matt/esp/esp-idf/components/spi_flash/esp_flash_api.c:975
#3  0x4038457e in esp_flash_write_encrypted (chip=<optimized out>, address=4522048, buffer=0x3fca35f0, length=16) at /home/matt/esp/esp-idf/components/spi_flash/esp_flash_api.c:1380
#4  0x4202aacc in nvs_flash_generate_keys (cfg=<optimized out>, partition=<optimized out>) at /home/matt/esp/esp-idf/components/nvs_flash/src/nvs_api.cpp:627
#5  nvs_flash_generate_keys (partition=0x3fca4358, cfg=0x3fca3620) at /home/matt/esp/esp-idf/components/nvs_flash/src/nvs_api.cpp:572
#6  0x4202ad42 in nvs_flash_init () at /home/matt/esp/esp-idf/components/nvs_flash/src/nvs_api.cpp:143
#7  0x42005728 in nvs_init () at /home/matt/Desktop/source/test_project_firmware/main/main.c:26
#8  app_main () at /home/matt/Desktop/source/test_project_firmware/main/main.c:48
#9  0x420cff54 in main_task (args=<optimized out>) at /home/matt/esp/esp-idf/components/freertos/app_startup.c:208
#10 0x00000000 in ?? ()
Backtrace stopped: frame did not save the PC
ELF file SHA256: 163032c532db0edc

Rebooting...
I (314) cpu_start: Unicore app
I (314) cpu_start: Pro cpu up.
I (323) cpu_start: Pro cpu start user code
I (323) cpu_start: cpu freq: 80000000 Hz
I (323) cpu_start: Application information:
I (326) cpu_start: Project name:     test_project
I (331) cpu_start: App version:      1.0.0-gmw
I (336) cpu_start: Compile time:     May 30 2024 11:24:00
I (342) cpu_start: ELF file SHA256:  163032c532db0edc...
I (348) cpu_start: ESP-IDF:          v5.1.4-dirty
I (354) cpu_start: Min chip rev:     v0.3
I (358) cpu_start: Max chip rev:     v1.99 
I (363) cpu_start: Chip rev:         v0.3
I (368) heap_init: Initializing. RAM available for dynamic allocation:
I (375) heap_init: At 3FCA08E0 len 0001F720 (125 KiB): DRAM
I (382) heap_init: At 3FCC0000 len 0001C710 (113 KiB): DRAM/RETENTION
I (389) heap_init: At 3FCDC710 len 00002950 (10 KiB): DRAM/RETENTION/STACK
I (396) heap_init: At 500000F0 len 00001EE8 (7 KiB): RTCRAM
I (403) spi_flash: detected chip: winbond
I (407) spi_flash: flash io: qio
W (411) flash_encrypt: Flash encryption mode is DEVELOPMENT (not secure)
W (419) ADC: legacy driver is deprecated, please migrate to `esp_adc/adc_oneshot.h`
I (427) sleep: Configure to isolate all GPIO pins in sleep state
I (434) sleep: Enable automatic switching of GPIO sleep configuration
I (452) app_start: Starting scheduler on CPU0
I (00:00:01.027) main_task: Started on CPU0
I (00:00:01.028) main_task: Calling app_main()
I (00:00:01.030) pm: Frequency switching config: CPU_MAX: 80, APB_MAX: 80, APB_MIN: 40, Light sleep: DISABLED
I (00:00:01.159) nvs: NVS partition "nvs" is encrypted.

Firmware update case

I (15:14:55.224) OTA: esp_ota_begin succeeded
Guru Meditation Error: Core  0 panic'ed (Illegal instruction). Exception was unhandled.

Stack dump detected
Core  0 register dump:
MEPC    : 0x420272ce  RA      : 0x40384192  SP      : 0x3fcc35b0  GP      : 0x3fc99600  
0x420272ce: spi_flash_mmap at /home/matt/esp/esp-idf/components/spi_flash/flash_mmap.c:56
0x40384192: esp_flash_read_encrypted at /home/matt/esp/esp-idf/components/spi_flash/esp_flash_api.c:1152

TP      : 0x3fc452fc  T0      : 0x40394adc  T1      : 0x00000005  T2      : 0xffffffff  
0x40394adc: calloc at /home/matt/esp/esp-idf/components/newlib/heap.c:30

S0/FP   : 0x00000000  S1      : 0x00000000  A0      : 0x00220000  A1      : 0x00000020  
A2      : 0x00000000  A3      : 0x3fcc35c8  A4      : 0x3fcc35cc  A5      : 0x00220020  
A6      : 0xbfffffff  A7      : 0x00000000  S2      : 0x00000020  S3      : 0x3fcc35f0  
S4      : 0x00000040  S5      : 0x00220000  S6      : 0x3fcc3670  S7      : 0x3fc9a5f4  
S8      : 0x3fc9a000  S9      : 0x00043a20  S10     : 0x00060000  S11     : 0x0000001f  
T3      : 0xffff0000  T4      : 0x000000ff  T5      : 0x00000000  T6      : 0x00000000  
MSTATUS : 0x00001881  MTVEC   : 0x40380001  MCAUSE  : 0x00000002  MTVAL   : 0x00000000  
0x40380001: _vector_table at ??:?

MHARTID : 0x00000000  

Backtrace:

spi_flash_mmap (src_addr=src_addr@entry=2228224, size=size@entry=32, memory=memory@entry=SPI_FLASH_MMAP_DATA, out_ptr=out_ptr@entry=0x3fcc35c8, out_handle=out_handle@entry=0x3fcc35cc) at /home/matt/esp/esp-idf/components/spi_flash/flash_mmap.c:59
59      void *ptr = NULL;
#0  spi_flash_mmap (src_addr=src_addr@entry=2228224, size=size@entry=32, memory=memory@entry=SPI_FLASH_MMAP_DATA, out_ptr=out_ptr@entry=0x3fcc35c8, out_handle=out_handle@entry=0x3fcc35cc) at /home/matt/esp/esp-idf/components/spi_flash/flash_mmap.c:59
#1  0x40384192 in esp_flash_read_encrypted (chip=<optimized out>, chip@entry=0x3fc9a5f4 <default_chip>, address=address@entry=2228224, out_buffer=out_buffer@entry=0x3fcc35f0, length=length@entry=32) at /home/matt/esp/esp-idf/components/spi_flash/esp_flash_api.c:1151
#2  0x4038428c in s_verify_write (chip=0x3fc9a5f4 <default_chip>, verify_address=verify_address@entry=2228224, remain_verify_len=remain_verify_len@entry=32, expected_buf=expected_buf@entry=0x3fcc3670, is_encrypted=is_encrypted@entry=true) at /home/matt/esp/esp-idf/components/spi_flash/esp_flash_api.c:975
#3  0x403845b0 in esp_flash_write_encrypted (chip=<optimized out>, address=2228224, buffer=0x3fcc3a20, length=1024) at /home/matt/esp/esp-idf/components/spi_flash/esp_flash_api.c:1380
#4  0x42025c0c in esp_partition_write (partition=<optimized out>, dst_offset=<optimized out>, src=src@entry=0x3fcc3a20, size=size@entry=1024) at /home/matt/esp/esp-idf/components/esp_partition/partition_target.c:82
#5  0x42026390 in esp_ota_write (handle=<optimized out>, data=data@entry=0x3fcc3a20, size=size@entry=1024) at /home/matt/esp/esp-idf/components/app_update/esp_ota_ops.c:259
#6  0x4200d84c in firmware_post_handler (req=0x3fcc3e7c) at /home/matt/Desktop/source/test_project_firmware/main/api/firmware_api.c:107
#7  0x00000000 in ?? ()
Backtrace stopped: frame did not save the PC
ELF file SHA256: d8b59a74f01a8b2e