[BUG] Cannot access memory

lumapu commented 1 month ago

Describe the bug: ESP crashed; the core dump was read from the device.

Which platform, esp8266 or esp32? ESP32-S3

Do you use TLS or not? No TLS

Do you use an IDE (Arduino, Platformio...)? Platformio

Which version of the Arduino framework? 6.7.0

Please include any debug output and/or decoded stack trace if applicable.

Stack trace

``` =============================================================== ==================== ESP32 CORE DUMP START ==================== Crashed task handle: 0x3fcf6990, name: '', GDB name: 'process 1070557584' ================== CURRENT THREAD REGISTERS =================== exccause 0x1d (StoreProhibitedCause) excvaddr 0x0 epc1 0x42079209 epc2 0x0 epc3 0x0 epc4 0x0 epc5 0x0 epc6 0x0 eps2 0x0 eps3 0x0 eps4 0x0 eps5 0x0 eps6 0x0 [New process 1070557584] [New process 1070558984] [New process 1070544264] [New process 1070525568] [New process 1070313812] [New process 1070536528] [New process 1070551560] [New process 1070556032] [New process 1070345904] [New process 1070273552] [New process 1070534828] [New process 1070514224] [New process 1070299724] [New process 1070342604] [New process 1070550388] [Current thread is 1 (process 1070557584)] ==================== CURRENT THREAD STACK ===================== ======================== THREADS INFO ========================= pc 0x40377da5 0x40377da5 lbeg 0x40056f5c 1074098012 lend 0x40056f72 1074098034 lcount 0x0 0 sar 0x4 4 ps 0x60d21 396577 threadptr br scompare1 acclo acchi m0 m1 m2 m3 expstate f64r_lo f64r_hi f64s fcr fsr a0 0x8037d104 -2143825660 a1 0x3fc96a90 1070164624 a2 0x3fc96afa 1070164730 a3 0x3fc96b27 1070164775 a4 0xa 10 a5 0x32 50 a6 0x0 0 a7 0x3fc96a36 1070164534 a8 0x0 0 a9 0x1 1 a10 0x3fc96ade 1070164702 a11 0x3fc96ade 1070164702 a12 0xa 10 a13 0x0 0 a14 0x2c973d0 46756816 a15 0xffffff 16777215 Retrying reading threads information... Retrying reading threads information... 
TCB NAME PRIO C/B STACK USED/FREE ---------- ---------------- -------- ---------------- 0x3fcf6990 Corrupted TCB data 0x3fcf6f08 IDLE01070558352/1070557944 1488/80 0x3fcf3588 1070543072/1070539128 1070544320/80 0x3fceec80 1070524864/1070522480 1070525623/88 0x3fcbb154 1070313136/1070309700 1070313850/76 0x3fcf1750 1070535888/1070535488 1070536569/88 0x3fcf5208 1070562272/1070560408 1070551592/11344 0x3fcf6380 1070555376/1070551920 1070556068/88 0x3fcc2eb0 1070358272/1070354808 1070345941/12936 0x3fcb1410 1070272864/1070266880 1070273605/88 0x3fcf10ac 1070534192/1070533788 1070534861/84 0x3fcec030 1070513024/1070506016 1070514256/88 0x3fcb7a4c 1070350080/1070346672 1070299779/50972 0x3fcc21cc 1070341936/1070326204 1070342659/84 0x3fcf4d74 1070549728/1070545764 1070550434/76 ==================== THREAD 1 (TCB: 0x3fcf6990, name: '') ===================== ==================== THREAD 2 (TCB: 0x3fcf6f08, name: 'IDLE0') ===================== #0 0x40377da5 in panic_abort (details=0x3fc96afa "abort() was called at PC 0x42049d5c on core 0") at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_system/panic.c:408 #1 0x4037d104 in esp_system_abort (details=0x3fc96afa "abort() was called at PC 0x42049d5c on core 0") at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_system/esp_system.c:137 #2 0x40383c10 in abort () at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/newlib/abort.c:46 #3 0x42049d5f in task_wdt_isr (arg=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_system/task_wdt.c:176 #4 0x40379478 in _xt_lowint1 () at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/xtensa_vectors.S:1118 #5 0x420cb6da in cpu_ll_waiti () at 
/home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/hal/esp32s3/include/hal/cpu_ll.h:182 #6 esp_pm_impl_waiti () at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_pm/pm_impl.c:853 #7 0x4204a5d0 in esp_vApplicationIdleHook () at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_system/freertos_hooks.c:63 #8 0x4037e70b in prvIdleTask (pvParameters=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/tasks.c:4099 ==================== THREAD 3 (TCB: 0x3fcf3588, name: '') ===================== #0 0x420cb6da in esp_pm_impl_waiti () at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_pm/pm_impl.c:855 #1 0x4204a5d0 in esp_vApplicationIdleHook () at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_system/freertos_hooks.c:63 #2 0x4037e70b in prvIdleTask (pvParameters=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/tasks.c:4099 ==================== THREAD 4 (TCB: 0x3fceec80, name: '') ===================== #0 0x4037e1e4 in vPortEnterCritical (mux=0x3fcb928c) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/include/freertos/portmacro.h:578 #1 xQueueSemaphoreTake (xQueue=0x3fcb9240, xTicksToWait=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/queue.c:1563 #2 0x42067448 in sys_arch_sem_wait (sem=0x3fcf3a10, timeout=1000) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/lwip/port/esp32/freertos/sys_arch.c:188 #3 0x42058e55 in lwip_select (maxfdp1=49, readset=0x0, writeset=0x3fcf3340, exceptset=0x0, timeout=) at 
/home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/lwip/lwip/src/api/sockets.c:2153 #4 0x4204ca6c in esp_vfs_select (nfds=49, readfds=0x0, writefds=0x3fcf3340, errorfds=0x0, timeout=0x3fcf3348) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/vfs/vfs.c:1023 #5 0x42027a01 in WiFiClient::write (this=0x3fca6668 , buf=0x3fcf38dc , size=32) at /home/runner/.platformio/packages/framework-arduinoespressif32/libraries/WiFi/src/WiFiClient.cpp:411 #6 0x4202c7d4 in espMqttClientInternals::ClientSync::write (this=0x3fca6664 , buf=0x3fcf38dc , size=32) at .pio/libdeps/opendtufusion-de/espMqttClient/src/Transport/ClientSync.cpp:50 #7 0x4202aba3 in MqttClient::_sendPacket (this=0x3fca5efc ) at .pio/libdeps/opendtufusion-de/espMqttClient/src/MqttClient.cpp:362 #8 0x4202b881 in MqttClient::_checkOutbox (this=0x3fca5efc ) at .pio/libdeps/opendtufusion-de/espMqttClient/src/MqttClient.cpp:346 #9 0x4202bc27 in MqttClient::loop (this=0x3fca5efc ) at .pio/libdeps/opendtufusion-de/espMqttClient/src/MqttClient.cpp:262 #10 0x4202bca0 in MqttClient::_loop (c=0x3fca5efc ) at .pio/libdeps/opendtufusion-de/espMqttClient/src/MqttClient.cpp:326 ==================== THREAD 5 (TCB: 0x3fcbb154, name: '') ===================== #0 0x4037e082 in vPortEnterCritical (mux=0x3fcedfd8) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/include/freertos/portmacro.h:578 #1 xQueueReceive (xQueue=0x3fcedf8c, pvBuffer=0x3fceeafc, xTicksToWait=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/queue.c:1400 #2 0x4206753a in sys_arch_mbox_fetch (mbox=, msg=0x3fceeafc, timeout=77) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/lwip/port/esp32/freertos/sys_arch.c:330 #3 0x420591c2 in tcpip_timeouts_mbox_fetch (mbox=, msg=) at 
/home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/lwip/lwip/src/api/tcpip.c:110 #4 tcpip_thread (arg=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/lwip/lwip/src/api/tcpip.c:148 ==================== THREAD 6 (TCB: 0x3fcf1750, name: '') ===================== #0 0x400559e0 in ?? () #1 0x40380631 in vPortClearInterruptMaskFromISR (prev_level=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/include/freertos/portmacro.h:571 #2 vPortExitCritical (mux=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/port.c:332 #3 0x4037fd79 in ulTaskGenericNotifyTake (uxIndexToWait=, xClearCountOnExit=1, xTicksToWait=1000) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/tasks.c:5513 #4 0x4204f86b in emac_w5500_task (arg=0x3fcba064) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_eth/src/esp_eth_mac_w5500.c:654 ==================== THREAD 7 (TCB: 0x3fcf5208, name: '') ===================== #0 0x4037e1e2 in vPortEnterCritical (mux=0x3fcf1328) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/include/freertos/portmacro.h:578 #1 xQueueSemaphoreTake (xQueue=0x3fcf12dc, xTicksToWait=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/queue.c:1563 #2 0x40379a5f in ipc_task (arg=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_ipc/src/esp_ipc.c:54 ==================== THREAD 8 (TCB: 0x3fcf6380, name: '') ===================== #0 0x4037e082 in vPortEnterCritical (mux=0x3fcf721c) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/include/freertos/portmacro.h:578 #1 
xQueueReceive (xQueue=0x3fcf71d0, pvBuffer=0x3fcf7cf0, xTicksToWait=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/queue.c:1400 #2 0x420cf7d9 in esp_event_loop_run (event_loop=0x3fcf71a8, ticks_to_run=4294967295) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_event/esp_event.c:566 #3 0x420cf957 in esp_event_loop_run_task (args=0x3fcf71a8) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_event/esp_event.c:115 ==================== THREAD 9 (TCB: 0x3fcc2eb0, name: '') ===================== #0 0x4037e082 in vPortEnterCritical (mux=0x3fcf7110) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/include/freertos/portmacro.h:578 #1 xQueueReceive (xQueue=0x3fcf70c4, pvBuffer=0x3fcf61fc, xTicksToWait=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/queue.c:1400 #2 0x42029ea8 in _arduino_event_task (arg=) at /home/runner/.platformio/packages/framework-arduinoespressif32/libraries/WiFi/src/WiFiGeneric.cpp:305 ==================== THREAD 10 (TCB: 0x3fcb1410, name: '') ===================== #0 0x4037e082 in vPortEnterCritical (mux=0x3fcf3a8c) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/include/freertos/portmacro.h:578 #1 xQueueReceive (xQueue=0x3fcf3a40, pvBuffer=0x3fcc600c, xTicksToWait=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/queue.c:1400 #2 0x4203553d in _udp_task (pvParameters=) at /home/runner/.platformio/packages/framework-arduinoespressif32/libraries/AsyncUDP/src/AsyncUDP.cpp:132 ==================== THREAD 11 (TCB: 0x3fcf10ac, name: '') ===================== #0 0x4037e082 in vPortEnterCritical (mux=0x3fcaf344) at 
/home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/include/freertos/portmacro.h:578 #1 xQueueReceive (xQueue=0x3fcaf2f8, pvBuffer=0x3fcb1280, xTicksToWait=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/queue.c:1400 #2 0x42056f48 in queue_recv_wrapper (queue=0x3fcaf2f8, item=0x3fcb1280, block_time_tick=4294967295) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_wifi/esp32s3/esp_adapter.c:424 #3 0x420cd364 in ppTask () ==================== THREAD 12 (TCB: 0x3fcec030, name: '') ===================== #0 vPortEnterCritical (mux=0x3fcf0c84) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/include/freertos/portmacro.h:578 #1 xQueueSemaphoreTake (xQueue=0x3fcf0c38, xTicksToWait=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/queue.c:1563 #2 0x40379a5f in ipc_task (arg=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_ipc/src/esp_ipc.c:54 ==================== THREAD 13 (TCB: 0x3fcb7a4c, name: '') ===================== #0 0x4037e1e4 in vPortEnterCritical (mux=0x3fcf2160) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/include/freertos/portmacro.h:578 #1 xQueueSemaphoreTake (xQueue=0x3fcf2114, xTicksToWait=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/queue.c:1563 #2 0x4202b370 in MqttClient::publish (this=0x3fca5efc , topic=0x3fca6a70 "", qos=0 '\000', retain=false, payload=0x3fca679f "", length=3) at .pio/libdeps/opendtufusion-de/espMqttClient/src/MqttClient.cpp:151 #3 0x4202b46c in MqttClient::publish (this=0x3fca5efc , topic=0x3fca6a70 "", qos=0 '\000', retain=false, payload=0x3fca679f "") at 
.pio/libdeps/opendtufusion-de/espMqttClient/src/MqttClient.cpp:166 #4 0x4200d3df in PubMqtt > >::publish (this=0x3fca5efc , subTopic=, payload=0x3fca679f "", retained=false, addTopic=, qos=0 '\000') at publisher/pubMqtt.h:238 #5 0x4200d5a4 in PubMqtt > >::setup(IApp*, cfgMqtt_t*, char const*, char const*, HmSystem<(unsigned char)32, Inverter >*, unsigned int*, unsigned int*)::{lambda(char const*, char const*, bool, unsigned char)#1}::operator()(char const*, char const*, bool, unsigned char) const (qos=, retained=, payload=, subTopic=, this=) at publisher/pubMqtt.h:79 #6 std::_Function_handler > >::setup(IApp*, cfgMqtt_t*, char const*, char const*, HmSystem<(unsigned char)32, Inverter >*, unsigned int*, unsigned int*)::{lambda(char const*, char const*, bool, unsigned char)#1}>::_M_invoke(std::_Any_data const&, char const*&&, char const*&&, bool&&, unsigned char&&) (__functor=..., __args#0=@0x3fcebd4c: 0x3fca676e "", __args#1=@0x3fcebd48: 0x3fca679f "", __args#2=@0x3fcebd44: false, __args#3=@0x3fcebd40: 0 '\000') at /home/runner/.platformio/packages/toolchain-xtensa-esp32s3/xtensa-esp32s3-elf/include/c++/8.4.0/bits/std_function.h:297 #7 0x4201182e in std::function::operator()(char const*, char const*, bool, unsigned char) const (this=0x3fca6708 , __args#0=, __args#1=, __args#2=, __args#3=) at /home/runner/.platformio/packages/toolchain-xtensa-esp32s3/xtensa-esp32s3-elf/include/c++/8.4.0/bits/std_function.h:687 #8 0x42017cac in PubMqttIvData > >::stateSend (this=0x3fca66f8 ) at /home/runner/.platformio/packages/toolchain-xtensa-esp32s3/xtensa-esp32s3-elf/include/c++/8.4.0/array:234 #9 0x4201b308 in PubMqttIvData > >::loop (this=0x3fca66f8 ) at publisher/pubMqttIvData.h:46 #10 PubMqtt > >::loop (this=0x3fca5efc ) at publisher/pubMqtt.h:123 #11 0x4201b408 in app::loop (this=0x3fc9afb8 ) at app.cpp:145 #12 0x42022e61 in loop () at main.cpp:42 #13 0x4203a1ac in loopTask (pvParameters=) at 
/home/runner/.platformio/packages/framework-arduinoespressif32/cores/esp32/main.cpp:50 ==================== THREAD 14 (TCB: 0x3fcc21cc, name: '') ===================== #0 0x4037e082 in vPortEnterCritical (mux=0x3fcb8c38) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/include/freertos/portmacro.h:578 #1 xQueueReceive (xQueue=0x3fcb8bec, pvBuffer=0x3fcc4030, xTicksToWait=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/queue.c:1400 #2 0x4207038d in _mdns_service_task (pvParameters=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/mdns/mdns.c:4639 ==================== THREAD 15 (TCB: 0x3fcf4d74, name: '') ===================== #0 0x4037e082 in vPortEnterCritical (mux=0x3fcbe0a4) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port/xtensa/include/freertos/portmacro.h:578 #1 xQueueReceive (xQueue=0x3fcbe058, pvBuffer=0x3fcc2044, xTicksToWait=) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/queue.c:1400 #2 0x4202708a in _get_async_event (e=) at .pio/libdeps/opendtufusion-de/AsyncTCP/src/AsyncTCP.cpp:123 #3 _async_service_task (pvParameters=) at .pio/libdeps/opendtufusion-de/AsyncTCP/src/AsyncTCP.cpp:199 ======================= ALL MEMORY REGIONS ======================== Name Address Size Attrs .rtc.text 0x600fe000 0x0 RW .rtc.dummy 0x600fe000 0x0 RW .rtc.force_fast 0x600fe000 0x0 RW .rtc.force_slow 0x50000010 0x0 RW .iram0.vectors 0x40374000 0x403 R XA .iram0.text 0x40374404 0x1138f R XA .dram0.data 0x3fc957a0 0x57d8 RW A .noinit 0x3fc9af78 0x0 RW .flash.text 0x42000020 0xd0ddf R XA .flash.appdesc 0x3c0e0020 0x100 R A .flash.rodata 0x3c0e0120 0x48198 RW A .iram0.text_end 0x40385793 0x0 RW .iram0.bss 0x40385794 0x0 RW .dram0.heap_start 0x3fcae298 0x0 RW .coredump.tasks.data 0x3fcf6990 0x158 RW 
.coredump.tasks.data 0x3fcf6710 0x260 RW .coredump.tasks.data 0x3fcf6f08 0x158 RW .coredump.tasks.data 0x3fcf6c90 0x260 RW .coredump.tasks.data 0x3fcf3588 0x158 RW .coredump.tasks.data 0x3fcf30e0 0x490 RW .coredump.tasks.data 0x3fceec80 0x158 RW .coredump.tasks.data 0x3fcee9c0 0x2a0 RW .coredump.tasks.data 0x3fcbb154 0x158 RW .coredump.tasks.data 0x3fcbaeb0 0x290 RW .coredump.tasks.data 0x3fcf1750 0x158 RW .coredump.tasks.data 0x3fcf14d0 0x260 RW .coredump.tasks.data 0x3fcf5208 0x158 RW .coredump.tasks.data 0x3fcf7be0 0x2b0 RW .coredump.tasks.data 0x3fcf6380 0x158 RW .coredump.tasks.data 0x3fcf60f0 0x270 RW .coredump.tasks.data 0x3fcc2eb0 0x158 RW .coredump.tasks.data 0x3fcc5f00 0x270 RW .coredump.tasks.data 0x3fcb1410 0x158 RW .coredump.tasks.data 0x3fcb1160 0x290 RW .coredump.tasks.data 0x3fcf10ac 0x158 RW .coredump.tasks.data 0x3fcf0e30 0x260 RW .coredump.tasks.data 0x3fcec030 0x158 RW .coredump.tasks.data 0x3fcebb80 0x490 RW .coredump.tasks.data 0x3fcb7a4c 0x158 RW .coredump.tasks.data 0x3fcc3f00 0x2a0 RW .coredump.tasks.data 0x3fcc21cc 0x158 RW .coredump.tasks.data 0x3fcc1f30 0x280 RW .coredump.tasks.data 0x3fcf4d74 0x158 RW .coredump.tasks.data 0x3fcf4ae0 0x280 RW ===================== ESP32 CORE DUMP END ===================== =============================================================== ```

Expected behaviour: no crash.

To Reproduce: not that easy, I don't know how to do it.

Additional context: Can you determine where to search? Does it happen in the MQTT library or in my code? To me it feels as if the issue happens while the library publishes its internal queue.

bertmelis commented 1 month ago

Low memory issue I suspect. Perhaps drastically increase the minimum free memory requirement (https://github.com/bertmelis/espMqttClient/blob/main/src/Config.h#L32) or use the memory pool feature (https://github.com/bertmelis/espMqttClient/blob/main/src/Config.h#L65).

Is your application able to handle a failure to publish? (publish returns zero)

PS: Arduino is currently at v3.0.5, so I don't know what you mean by 6.7.0.
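On the question of handling a failed publish, here is a minimal sketch of what that could look like. It relies only on what the comment states, namely that `publish` returns zero on failure. The helper `publishWithRetry`, the callable `publishFn` and the attempt limit are hypothetical names for illustration and are not part of espMqttClient.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical helper (not part of espMqttClient): calls `publishFn` until it
// returns a non-zero packet id or `maxAttempts` is exhausted.
// `publishFn` stands in for the application's actual publish call.
inline uint16_t publishWithRetry(const std::function<uint16_t()>& publishFn,
                                 int maxAttempts) {
  for (int attempt = 0; attempt < maxAttempts; ++attempt) {
    const uint16_t packetId = publishFn();
    if (packetId != 0) {
      return packetId;  // the packet was accepted
    }
    // Zero means the packet was not queued (for example, low memory).
    // A real application would yield or drop stale data here rather than spin.
  }
  return 0;  // the caller must treat this as "message not sent"
}
```

The important part is the final `return 0`: the caller still has to decide what to do with a message that could not be queued, for example dropping it or keeping only the latest value.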

bertmelis commented 1 month ago

Still the same problem as https://github.com/bertmelis/espMqttClient/discussions/164.

bertmelis commented 1 month ago

Three hypotheses:

Espressif's implementation of lwip and the Arduino core fail to handle allocation failures. Maybe this is unlikely.
Some issue with PSRAM?
My implementation is wrong (but it only shows under heavy load/low memory).

lumapu commented 1 month ago

You're right, same issue as in #164. I will close this one. Thank you for your quick response. This time the ESP crashed really soon after booting, so I can't imagine that the ESP ran out of memory. It feels more as if the memory had somehow already been freed before the data was sent out.

bertmelis commented 1 month ago

Keep this open. The other one is converted to a discussion.

It is most strange. The library allocates the entire packet on heap memory and only releases the memory after it is completely sent. Sending in this case means passed to Arduino's WiFiClient::write. In my understanding, this creates a second copy of the data for the underlying lwip. I don't think it is a memory issue, although it appears as one.

I'm searching for a concurrency/deadlock issue.

bertmelis commented 1 month ago

Are you using builtin WiFi or an ethernet adapter (w5500)?

bertmelis commented 1 month ago

Other observations:

If you use platformio, you might want to consider upgrading to https://github.com/pioarduino/platform-espressif32

AsyncTCP and non-async code are mixed in your application. Is it possible to (as a test) disable the features that use AsyncTCP?

lumapu commented 1 month ago

> Are you using builtin WiFi or an ethernet adapter (w5500)?

In this scenario here: yes. For the issue in the discussion I don't remember, but I can try to find out.

> AsyncTCP

I'll give it a try. I never thought in this direction.

bertmelis commented 1 month ago

This is going to be trial and error bughunting.

Or somebody needs to have a divine intervention.

lumapu commented 1 month ago

> Is it possible to (as a test) disable the features that use AsyncTCP?

All MQTT and AsyncWebserver functionality is based on AsyncTCP, so I don't know if it really makes sense to disable it. The system would no longer be "useful". The issue does not occur on my system; it happens to one of the users, at seemingly random intervals of days.

bertmelis commented 1 month ago

To rule out memory exhaustion issues you could try with the memory pool enabled. The library then allocates its memory statically on initialization (underlying libraries not taken into account).

If you need some guidance with that, let me know.
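To illustrate the general idea of a statically allocated memory pool mentioned in this comment, here is a generic fixed-block pool. It is a sketch of the technique only and not the library's implementation; the class name `FixedPool` and its interface are assumptions made for this example.

```cpp
#include <array>
#include <cstddef>

// Generic fixed-block pool (an illustration, not espMqttClient's code).
// All storage lives inside the object, so nothing is taken from the heap
// after construction and exhaustion shows up as a nullptr, not a crash.
template <std::size_t BlockSize, std::size_t BlockCount>
class FixedPool {
 public:
  // Returns a free block, or nullptr when every block is in use.
  void* allocate() {
    for (std::size_t i = 0; i < BlockCount; ++i) {
      if (!_used[i]) {
        _used[i] = true;
        return _storage[i].data;
      }
    }
    return nullptr;
  }

  // Returns a block to the pool; pointers not owned by the pool are ignored.
  void release(void* ptr) {
    for (std::size_t i = 0; i < BlockCount; ++i) {
      if (ptr == _storage[i].data) {
        _used[i] = false;
        return;
      }
    }
  }

 private:
  struct Block {
    alignas(std::max_align_t) unsigned char data[BlockSize];
  };
  std::array<Block, BlockCount> _storage{};
  std::array<bool, BlockCount> _used{};
};
```

With such a pool the worst case is a refused allocation that the caller can detect, which is what makes it useful for ruling out heap exhaustion as a cause.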

lumapu commented 1 month ago

Today I had the same issue again. Because of that I read a bit about xSemaphoreTake in FreeRTOS. What I read says that it is necessary to check the return value even if you pass portMAX_DELAY as the timeout. I feel that the issue may be related to that, because MqttClient::loop() as well as MqttClient::publish() are guarded by a semaphore. Do you think it makes sense to check in these MqttClient functions that the return value is pdTRUE? I also read an older article (2006) where the solution was to yield() after xSemaphoreGive() to force a context switch.

I don't know too much about that, but on the other hand I think these extra checks will be an improvement.

In the meantime I will try to patch the library locally and test it for a few days. Hopefully it helps to address this problem.

https://www.freertos.org/Documentation/02-Kernel/04-API-references/10-Semaphore-and-Mutexes/12-xSemaphoreTake
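The idea raised in this comment, checking whether the semaphore was actually taken before touching shared state, can be sketched with a small RAII guard. This is an illustration under assumptions: `CheckedGuard`, `FakeLock` and `runLocked` are hypothetical names, and `FakeLock` is a host-side stand-in so the example runs without FreeRTOS. On an ESP32, `take()` and `give()` could wrap `xSemaphoreTake` and `xSemaphoreGive`.

```cpp
#include <utility>

// Hypothetical RAII guard (not part of espMqttClient or FreeRTOS).
// `Lock` must provide `bool take()` and `void give()`. On an ESP32 these could
// wrap xSemaphoreTake(handle, portMAX_DELAY) == pdTRUE and xSemaphoreGive(handle).
template <typename Lock>
class CheckedGuard {
 public:
  explicit CheckedGuard(Lock& lock) : _lock(lock), _held(lock.take()) {}
  ~CheckedGuard() {
    if (_held) _lock.give();  // give back only what was actually taken
  }
  CheckedGuard(const CheckedGuard&) = delete;
  CheckedGuard& operator=(const CheckedGuard&) = delete;
  bool held() const { return _held; }

 private:
  Lock& _lock;
  bool _held;
};

// Host-side stand-in used only to exercise the guard.
struct FakeLock {
  bool available = true;
  int takes = 0;
  int gives = 0;
  bool take() {
    if (!available) return false;
    available = false;
    ++takes;
    return true;
  }
  void give() {
    available = true;
    ++gives;
  }
};

// Runs `work` only while the lock is held; returns false if the take failed.
template <typename Lock, typename Fn>
bool runLocked(Lock& lock, Fn&& work) {
  CheckedGuard<Lock> guard(lock);
  if (!guard.held()) return false;
  std::forward<Fn>(work)();
  return true;
}
```

With this shape a failed take can never be followed by a give, and every early return releases the lock exactly once.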

lumapu commented 1 month ago

quick'n'dirty:

semaphore.patch

```patch
diff --git a/src/MqttClient.cpp b/src/MqttClient.cpp
index dc21f74..d4b35c4 100644
--- a/src/MqttClient.cpp
+++ b/src/MqttClient.cpp
@@ -1,7 +1,7 @@
 /*
 Copyright (c) 2022 Bert Melis. All rights reserved.
-This work is licensed under the terms of the MIT license.
+This work is licensed under the terms of the MIT license.
 For a copy, see or the LICENSE file.
 */
@@ -148,16 +148,19 @@ uint16_t MqttClient::publish(const char* topic, uint8_t qos, bool retain, const
   #endif
     return 0;
   }
-  EMC_SEMAPHORE_TAKE();
-  uint16_t packetId = (qos > 0) ? _getNextPacketId() : 1;
-  if (!_addPacket(packetId, topic, payload, length, qos, retain)) {
-    emc_log_e("Could not create PUBLISH packet");
+  uint16_t packetId = 0;
+  if(pdTRUE == EMC_SEMAPHORE_TAKE()) {
+    packetId = (qos > 0) ? _getNextPacketId() : 1;
+    if (!_addPacket(packetId, topic, payload, length, qos, retain)) {
+      emc_log_e("Could not create PUBLISH packet");
+      EMC_SEMAPHORE_GIVE();
+      _onError(packetId, Error::OUT_OF_MEMORY);
+      if(pdTRUE == EMC_SEMAPHORE_TAKE())
+        packetId = 0;
+    }
     EMC_SEMAPHORE_GIVE();
-    _onError(packetId, Error::OUT_OF_MEMORY);
-    EMC_SEMAPHORE_TAKE();
-    packetId = 0;
+    yield();
   }
-  EMC_SEMAPHORE_GIVE();
   return packetId;
 }
@@ -174,16 +177,19 @@ uint16_t MqttClient::publish(const char* topic, uint8_t qos, bool retain, espMqt
   #endif
     return 0;
   }
-  EMC_SEMAPHORE_TAKE();
-  uint16_t packetId = (qos > 0) ? _getNextPacketId() : 1;
-  if (!_addPacket(packetId, topic, callback, length, qos, retain)) {
-    emc_log_e("Could not create PUBLISH packet");
+  uint16_t packetId = 0;
+  if(pdTRUE == EMC_SEMAPHORE_TAKE()) {
+    packetId = (qos > 0) ? _getNextPacketId() : 1;
+    if (!_addPacket(packetId, topic, callback, length, qos, retain)) {
+      emc_log_e("Could not create PUBLISH packet");
+      EMC_SEMAPHORE_GIVE();
+      _onError(packetId, Error::OUT_OF_MEMORY);
+      if(pdTRUE == EMC_SEMAPHORE_TAKE())
+        packetId = 0;
+    }
     EMC_SEMAPHORE_GIVE();
-    _onError(packetId, Error::OUT_OF_MEMORY);
-    EMC_SEMAPHORE_TAKE();
-    packetId = 0;
+    yield();
   }
-  EMC_SEMAPHORE_GIVE();
   return packetId;
 }
@@ -237,11 +243,13 @@ void MqttClient::loop() {
     case State::connectingMqtt:
       #if EMC_WAIT_FOR_CONNACK
       if (_transport->connected()) {
-        EMC_SEMAPHORE_TAKE();
-        _sendPacket();
-        _checkIncoming();
-        _checkPing();
-        EMC_SEMAPHORE_GIVE();
+        if(pdTRUE == EMC_SEMAPHORE_TAKE()) {
+          _sendPacket();
+          _checkIncoming();
+          _checkPing();
+          EMC_SEMAPHORE_GIVE();
+          yield();
+        }
       } else {
         _setState(State::disconnectingTcp1);
         _disconnectReason = DisconnectReason::TCP_DISCONNECTED;
```

lumapu commented 1 month ago

espMqttClientSemaphore.zip

Patch which also is compatible with ESP8266

bertmelis commented 1 month ago

Regarding the return value of xSemaphoreTake: could you provide a link to the explanation? Not that I'm not willing to take it into account. I'm also here to learn.

Every iteration of loop has a yield. Yielding after API methods like publish is to be done by the user.

Another possibility would be to not use blocking semaphores in the library loop. After all, the operations can be executed in the next iteration whereas publishing might not be able to block.
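The non-blocking variant suggested here could look roughly like the sketch below: the loop tries to take the lock and, if it is busy, skips the work until the next iteration. The types `TryLock` and `LoopWorker` are hypothetical and use `std::atomic_flag` as a host-side stand-in; on FreeRTOS the equivalent would be a take with a zero timeout, such as `xSemaphoreTake(handle, 0)`.

```cpp
#include <atomic>

// Hypothetical try-lock (a stand-in for a zero-timeout semaphore take).
struct TryLock {
  std::atomic_flag busy = ATOMIC_FLAG_INIT;
  // Returns true if the lock was free and is now held by the caller.
  bool tryTake() { return !busy.test_and_set(std::memory_order_acquire); }
  void give() { busy.clear(std::memory_order_release); }
};

// Hypothetical loop body (not espMqttClient code): does its work only when the
// lock is free and otherwise defers to the next iteration without blocking.
struct LoopWorker {
  TryLock lock;
  int processed = 0;  // iterations that did the work
  int skipped = 0;    // iterations that found the lock busy

  void loopOnce() {
    if (!lock.tryTake()) {
      ++skipped;  // try again on the next iteration
      return;
    }
    ++processed;  // placeholder for send / receive / ping handling
    lock.give();
  }
};
```

The trade-off is that work can be delayed by one or more iterations while another task holds the lock, in exchange for the loop never stalling on it.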

lumapu commented 1 month ago

Sure, here are the links I visited yesterday:

https://www.freertos.org/FreeRTOS_Support_Forum_Archive/February_2006/freertos_xSemaphoreTake_fails_before_timeout_1441441.html

https://www.freertos.org/Documentation/02-Kernel/04-API-references/10-Semaphore-and-Mutexes/12-xSemaphoreTake

I also asked ChatGPT (in German), this was the important output:

What does 0xFFFFFFFF or portMAX_DELAY mean?

portMAX_DELAY (0xFFFFFFFF): If you use portMAX_DELAY as the timeout, the task waits indefinitely until it can take the mutex. The task therefore blocks until the mutex becomes free, and only then does it enter the critical section.

Why is the check still necessary?

Even with an infinite timeout (portMAX_DELAY), xSemaphoreTake() can fail under certain circumstances. For example:

Errors in the semaphore initialization: If the mutex or the semaphore itself was not initialized correctly, xSemaphoreTake() could fail.

System interrupts or exceptions: There are scenarios in which a task is prevented from taking the mutex by system interrupts, memory problems, or other system errors, even though it theoretically waits forever. In that case xSemaphoreTake() would also return pdFALSE.

Priority inversion or deadlocks: Even if the task waits forever, complex systems can run into deadlocks or priority inversion that prevent the mutex from being taken successfully.

bertmelis commented 3 weeks ago

Question: do you use tasks other than the Arduino task itself in your application? Which of the tasks use MQTT?

You might want to disable the separate MQTT task and just call loop() from your code so you will have less to worry about concurrency.

bertmelis / espMqttClient

[BUG] Cannot access memory #166
