Open cjstott94 opened 2 years ago
Reminds me of this. @igrr Thoughts? Also, in case this was implemented or is on a roadmap somewhere (hopefully), was/will it be backported to release/v3.3?
So far I've narrowed down where it's halting to esp_panic_handler() -> esp_core_dump_to_flash() -> esp_core_dump_write() -> esp_core_dump_write_elf() -> esp_core_dump_do_write_elf_pass() -> elf_write_core_dump_user_data() -> elf_add_segment()? Maybe even esp_core_dump_flash_write_data?
Though curiously, if I force a crash by inserting code like this just before the call to esp_core_dump_do_write_elf_pass:
*(int *)0 = 0;
A StoreProhibited exception is triggered and then the second panic handler correctly displays
Re-entered core dump! Exception happened during core dump!
(Though isn't a crash during the panic handler meant to trigger a DoubleException?)
The app still halts if that crashing code is inserted after the call to esp_core_dump_do_write_elf_pass.
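To make the re-entry behaviour described above concrete, here is a minimal host-side model of a re-entry guard. It is only a sketch and not the ESP-IDF implementation: `panic_state_t`, `panic_action_t` and `panic_on_entry()` are hypothetical names invented for illustration. The first fault is allowed to start a dump; any fault that arrives while the dump is still in progress is answered with an immediate reset instead of a second dump attempt.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; none of these are ESP-IDF APIs. */
typedef enum {
    PANIC_WRITE_DUMP, /* first entry: attempt the core dump */
    PANIC_RESET_NOW   /* re-entry: a fault happened during the dump */
} panic_action_t;

typedef struct {
    bool dump_in_progress; /* set on first entry, never cleared before reset */
} panic_state_t;

/* Decide what the panic path should do on each entry. */
panic_action_t panic_on_entry(panic_state_t *st)
{
    if (st->dump_in_progress) {
        /* Second fault while dumping: give up on the dump and reset. */
        return PANIC_RESET_NOW;
    }
    st->dump_in_progress = true;
    return PANIC_WRITE_DUMP;
}
```

Note that a guard like this only helps when the dump code faults again. If the dump code hangs without faulting, which matches the halt described above, nothing re-enters the handler, so a watchdog or similar timeout would still be needed.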
I also tried updating to ESP-IDF 4.3.1, and this fixed the problem. Though I can't tell whether it fixed the underlying problem during a core dump, as I can no longer trigger the initial exception.
I couldn't find any changes to the coredump panic handler in 4.3.1 here
If it helps, I'm also using encrypted flash and making a release build.
Hi @cjstott94
Just a note about the former exception. Have you correctly deinitialized the PPP component as indicated here:
after the OTA update?
Just assume that the receive callback could still be called normally (maybe after entering the esp_restart() function) before the restart is actually performed.
Yes, that was the problem: not de-initializing properly. I've fixed that issue already; I'm just concerned about whether a similar error could cause the processor to lock up again. If it locks up, then there's no chance we'll be able to do any OTA bug fixes.
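For anyone hitting the same thing, the idea of stopping the link before restarting can be sketched on the host roughly as follows. This is an assumption-laden illustration, not the PPP component's API: `link_start()`, `link_stop()`, `on_modem_rx()` and `bytes_delivered()` are hypothetical names, and the byte counter stands in for handing data to the network stack. The point is only the ordering: mark the link as down first, so that a receive callback arriving late (for example after the restart has been requested) is dropped rather than touching torn-down state.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical link state; an atomic flag because the receive callback
 * may run in a different task than the one requesting the restart. */
static atomic_bool s_link_up = false;
static size_t s_bytes_delivered = 0;

void link_start(void) { atomic_store(&s_link_up, true); }

/* Call this BEFORE requesting a restart or tearing down the stack. */
void link_stop(void) { atomic_store(&s_link_up, false); }

/* Receive callback: returns true if the data was passed on,
 * false if it was dropped because the link is already stopped. */
bool on_modem_rx(const unsigned char *data, size_t len)
{
    if (!atomic_load(&s_link_up) || data == NULL) {
        return false;
    }
    s_bytes_delivered += len; /* stand-in for feeding the PPP stack */
    return true;
}

size_t bytes_delivered(void) { return s_bytes_delivered; }
```

On target, `link_stop()` would be called just before the restart is requested; ESP-IDF's `esp_register_shutdown_handler()` may be a suitable hook for that, though that wiring is not shown here.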
Environment
Problem Description
Processor hang during crash
Expected Behavior
Any crashes should result in CPU restarting
Actual Behavior
Processor Halts
Steps to reproduce
I'm guessing that most likely something happens when saving a coredump in ELF format to flash, similar to issue #6519. The LWIP stack has 4K of memory allocated. Though I don't get an indication of a double exception over UART (we don't have an easy way of hooking up JTAG ATM).
I can probably fix this specific crash by ensuring the modem/PPP is cleaned up properly before a restart and/or by increasing the stack size to give extra headroom for the coredump, as suggested by @gerekon in https://github.com/espressif/esp-idf/issues/6519#issuecomment-779968116_
Though my main concern is if a problem is occurring during the coredump, why is it not doing some sort of hard reset? It's critical that the application reboot on any sort of exception.
Is there a way we could adjust the _DoubleExceptionVector so that it reboots straight away instead of invoking the panic handler?
Or is there any other known workaround? Maybe skipping core dump if there isn't enough stack space?
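To illustrate what I mean by skipping the core dump when stack space is short, here is a rough host-side sketch of the decision only. Everything in it is hypothetical: `COREDUMP_MIN_STACK_BYTES` is an assumed figure rather than a measured requirement, and `coredump_has_headroom()` and `choose_panic_action()` are made-up names, not ESP-IDF functions. How the free stack of the faulting task would be measured inside the panic handler is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed threshold for illustration only; the real requirement would
 * have to be measured for the coredump configuration in use. */
#define COREDUMP_MIN_STACK_BYTES 1024u

typedef enum {
    ON_PANIC_DUMP_THEN_RESET, /* enough headroom: write dump, then reset */
    ON_PANIC_RESET_ONLY       /* too little headroom: skip dump, reset */
} on_panic_t;

/* True when the remaining stack can cover the dump's estimated need. */
bool coredump_has_headroom(size_t stack_free_bytes, size_t required_bytes)
{
    return stack_free_bytes >= required_bytes;
}

/* Pick the panic action from the free stack of the faulting task. */
on_panic_t choose_panic_action(size_t stack_free_bytes)
{
    return coredump_has_headroom(stack_free_bytes, COREDUMP_MIN_STACK_BYTES)
               ? ON_PANIC_DUMP_THEN_RESET
               : ON_PANIC_RESET_ONLY;
}
```

Either branch ends in a reset, which is the property I care about: losing a core dump is acceptable, a device that never reboots is not.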