Open andy-danieal opened 4 months ago
@andy-danieal Can you provide the .elf file from when the issue happened?
I have sent you an email. Please check your email and let me know if you have any questions.
@andy-danieal I didn't receive any email that includes a .elf file.
We sent an email to Ma Li on Jul 1, 2024. Please check that email.
Also, we are having the same issue with the root mesh node. After being active for two days, it stops working. Every 10 seconds, 30 ESP devices send a command and receive a response.
I am Ma Li. I didn't receive your email. If you prefer not to upload the elf file to GitHub, you can send it directly to zhangyanjiao@espressif.com.
@zhangyanjiaoesp, I have sent an email. Please check it.
@andy-danieal That's weird. I didn't get your email either. You can upload your elf file here and share the link here.
And you can set the deletion conditions:
@zhangyanjiaoesp,
I have downloaded it.
@andy-danieal Is this elf file the one from when the crash occurred? Is it from the root or a child?
We have already shared the child log from where the crash occurred.
@andy-danieal Please use this wifi lib to test, thanks. wifi_lib_s2_0710.zip
wifi firmware version: fabad8c
@zhangyanjiaoesp, How do I add this wifi lib to an existing project? Can you please share the steps with us?
@andy-danieal Replace the wifi libs in idf/components/esp_wifi/lib/esp32s2 with the ones from the zip.
@zhangyanjiaoesp, We have tested the wifi lib, but we are still getting the same issue. We noticed that the issue appears after the device has been running for 6 to 7 hours.
Also, we have attached the logs we captured using PuTTY. Unfortunately, we were unable to capture the entire log due to disconnections. We will attempt to recreate the issue and capture the complete log.
OK, waiting for your logs. I have added some debug logs in the wifi lib; maybe they can help us find the root cause.
@zhangyanjiaoesp, We have attached a log file. This issue was found on the root node. gw-19-7.txt
@andy-danieal Are you using the wifi lib I provided? Can you turn on the wifi info-level logs? I can't get any useful information (including the debug information I added) from the log. You can also enable the following option, so that we can get some backtrace information when the crash happens.
@zhangyanjiaoesp, Thank you for your support. We have added a root node for a long-term test, and unfortunately, the original issue has not reappeared. However, we did encounter another issue.
Guru Meditation Error: Core 0 panic'ed (StoreProhibited). Exception was unhandled.
Core 0 register dump:
PC : 0x40032a62 PS : 0x00060d33 A0 : 0x80032400 A1 : 0x3ffe14e0
A2 : 0x3fff7ab4 A3 : 0x00000030 A4 : 0x3f03cac2 A5 : 0x00000006
A6 : 0x00000000 A7 : 0x3ffd5a1c A8 : 0x00000001 A9 : 0x3ffd5a1c
A10 : 0x00000002 A11 : 0x01000217 A12 : 0x00000024 A13 : 0x3ffd5a40
A14 : 0x00000001 A15 : 0x4002c0e4 SAR : 0x0000001f EXCCAUSE: 0x0000001d
EXCVADDR: 0x01000223 LBEG : 0x00000024 LEND : 0x3ffd5a40 LCOUNT : 0x40026b5c
Backtrace: 0x40032a5f:0x3ffe14e0 0x400323fd:0x3ffe1500 0x40025a05:0x3ffe1520 0x40025a60:0x3ffe1540 0x40025a95:0x3ffe1560 0x40034905:0x3ffe1580 0x400285c1:0x3ffe15a0 0x400e2d67:0x3ffe15c0 0x400e2dac:0x3ffe15e0 0x400e3872:0x3ffe1600 0x400e8259:0x3ffe17c0 0x400ea0c6:0x3ffe17e0 0x400e8467:0x3ffe1840 0x40037fa1:0x3ffe1860 0x400360a4:0x3ffe1880 0x4002e1c6:0x3ffe18b0
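For readers who want to resolve a backtrace like the one above themselves, here is a minimal host-side sketch that splits the `Backtrace:` line into PC:SP pairs and builds an `addr2line` command line. The helper names, the `build/app.elf` path, and the choice of `xtensa-esp32s2-elf-addr2line` with `-pfiaC` are assumptions for illustration, not something stated in this thread; the .elf must come from the exact build that crashed.

```python
import re

# One "PC:SP" pair as printed on a panic "Backtrace:" line.
_PAIR = re.compile(r"(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+)")


def parse_backtrace(line):
    """Return a list of (pc, sp) integer tuples found in a backtrace line."""
    return [(int(pc, 16), int(sp, 16)) for pc, sp in _PAIR.findall(line)]


def addr2line_args(line, elf_path):
    """Build an argument list for addr2line (tool name and flags are assumed)."""
    pcs = [hex(pc) for pc, _ in parse_backtrace(line)]
    return ["xtensa-esp32s2-elf-addr2line", "-pfiaC", "-e", elf_path] + pcs


if __name__ == "__main__":
    # Shortened sample taken from the first two entries of the log above.
    sample = "Backtrace: 0x40032a5f:0x3ffe14e0 0x400323fd:0x3ffe1500"
    # "build/app.elf" is a placeholder path, not a file from this issue.
    print(" ".join(addr2line_args(sample, "build/app.elf")))
```

The printed command can be pasted into a shell where the ESP-IDF toolchain is on the PATH; only the PC values are passed, since the SP values are not needed for symbol lookup.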
As per the core dump: core-dump-decode.txt
==================== CURRENT THREAD STACK =====================
#0 remove_free_block (sl=1, fl=2, block=0x3fff7ab4, control=0x3ffd5a1c) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/tlsf/tlsf.c:332
#1 block_locate_free (size=<optimized out>, control=<optimized out>) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/tlsf/tlsf.c:567
#2 tlsf_malloc (tlsf=0x3ffd5a1c, size=<optimized out>) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/tlsf/tlsf.c:1005
#3 0x40032400 in multi_heap_malloc_impl (heap=0x3ffd5a08, size=48) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/multi_heap.c:210
#4 0x40025a08 in heap_caps_malloc_base (size=48, caps=6144) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/heap_caps.c:179
#5 0x40025a63 in heap_caps_malloc (size=48, caps=6144) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/heap_caps.c:202
#6 0x40025a98 in heap_caps_malloc_default (size=48) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/heap_caps.c:228
#7 0x40034908 in malloc (size=48) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/newlib/heap.c:24
#8 0x400285c4 in wifi_malloc (size=48) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/esp_wifi/esp32s2/esp_adapter.c:65
#9 0x400e2d6a in mesh_malloc ()
#10 0x400e2daf in esp_mesh_create_context ()
#11 0x400e3875 in esp_mesh_wifi_recv_cb ()
#12 0x400e825c in hostap_deliver_data ()
#13 0x400ea0c9 in hostap_input ()
#14 0x400e846a in ap_rx_cb ()
#15 0x40037fa4 in ppRxPkt ()
#16 0x400360a7 in ppTask ()
#17 0x4002e1c9 in vPortTaskWrapper (pxCode=0x40035fb4 <ppTask>, pvParameters=0x0) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:134
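One way to cross-check the register dump against frame #0 is sketched below. It assumes the upstream TLSF block header layout (four 4-byte fields on a 32-bit target) and assumes that A11 held the neighbouring free-list block pointer at the time of the fault; neither assumption is confirmed in this thread. Under those assumptions, the distance from A11 to EXCVADDR lands exactly on the `prev_free` field, which would fit a store through a corrupted free-list pointer inside `remove_free_block`.

```python
import ctypes


class BlockHeader32(ctypes.Structure):
    """Assumed 32-bit layout of a TLSF block header (illustrative only)."""
    _fields_ = [
        ("prev_phys_block", ctypes.c_uint32),
        ("size", ctypes.c_uint32),
        ("next_free", ctypes.c_uint32),
        ("prev_free", ctypes.c_uint32),
    ]


# Values copied from the register dump above.
EXCVADDR = 0x01000223
A11 = 0x01000217


def faulting_field(base, fault_addr):
    """Return the header field whose offset equals fault_addr - base, if any."""
    offset = fault_addr - base
    for name, _ in BlockHeader32._fields_:
        if getattr(BlockHeader32, name).offset == offset:
            return name
    return None


if __name__ == "__main__":
    print(hex(EXCVADDR - A11), faulting_field(A11, EXCVADDR))
```

Note that 0x01000217 does not look like the other heap pointers in the dump (which are in the 0x3ff... range), so the value itself, rather than the code path, is the suspicious part.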
@andy-danieal Please set the Heap memory debugging option like the following and test again, thanks.
Thanks for reporting. We will close this due to lack of feedback; feel free to reopen with more updates. Thanks for using our Espressif product!
@zhangyanjiaoesp, We tested a fixed root node, and it hung after running for 10 to 20 days; it worked fine again after being powered off and on. The same issue occurred on child devices when the root node hung; some devices were affected, but not all.
Attached root UART dump and log file. log.txt log-uart-dump.txt
Any Update?
@andy-danieal The current crash issue appears to be different from the previous one. Are you using the version where the debug logs were added last time?
@andy-danieal Please use this wifi lib to test, thanks. wifi_lib_s2_0710.zip
wifi firmware version: fabad8c
@zhangyanjiaoesp, Both cases threw an exception on a Memory protection fault. After that, the device reset continuously due to a Cached memory region exception. We couldn't recover the device without cutting the power supply, which was a terrible situation on the client side.
We can't update that library on the client side, but we will run it on a demo setup and wait to reproduce the issue.
Also, we need a way to recover from the Cached memory region exception without cutting the power supply. What is the root cause of that error?
@andy-danieal Although each crash is caused by accessing an illegal address, the disassembly paths differ each time. This is likely due to memory corruption, which makes it difficult to identify the root cause without a reliable reproduction method. The Cached memory region exception issue arises from the first crash, so we need to pinpoint the root cause of the first crash. Could you provide a stable reproduction method?
I encountered an issue that was not resolved after RTC_SW_CPU_RST. The only thing that worked was a hard reset, which meant powering off the device.
ESP-Cache-Issue.txt