SmingHub / Sming

Sming - powerful open source framework simplifying the creation of embedded C++ applications.
https://sming.readthedocs.io
GNU Lesser General Public License v3.0

OTA Fails because of software watchdog reset #1968

Closed cometurrata closed 4 years ago

cometurrata commented 4 years ago

I had the issue with Sming-3.8 + SDK 2.0.0. I also have the issue with Sming-4.0.0 + SDK 3.

I can easily reproduce it by running many OTA updates. It seems to depend on the Wi-Fi or something similar. I can reproduce it on many boards and on many Wi-Fi networks.

It can fail very often (sometimes 15 failures out of 15 attempts), and sometimes it never fails.

Does anyone have an idea?

Thanks for the great work, by the way.

0x40102788: spi_flash_write at ??:?
0x40280418: flashmem_write_internal at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/spi_flash/flashmem.c:243
0x40105d22: rcReachRetryLimit at ??:?
0x40105e87: rcReachRetryLimit at ??:?
0x40106376: wDev_ProcessFiq at ??:?
0x4028051c: flashmem_write at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/spi_flash/flashmem.c:243
0x4000050c: ?? ??:0
0x402370b7: esf_buf_alloc at ??:?
0x4000444e: ?? ??:0
0x40000000: ?? ??:0
0x4010246b: flash_gd25q32c_read_status at ??:?
0x4000410f: ?? ??:0
0x40004a3c: ?? ??:0
0x40102700: spi_flash_erase_sector at ??:?
0x40228da9: rboot_write_flash at /home/come/luko/hardware/Sming-4.0.0/Sming/Components/rboot/rboot/appcode/rboot-api.c:108
0x4010515e: pp_post at ??:?
0x40235637: pp_attach at ??:?
0x402672ee: RbootOutputStream::write(unsigned char const*, unsigned int) at /home/come/luko/hardware/Sming-4.0.0/Sming/Wiring/Print.h:39
0x40235686: pp_attach at ??:?
0x40235792: pp_attach at ??:?
0x4010515e: pp_post at ??:?
0x40234743: ppTxPkt at ??:?
0x4020afab: ieee80211_output_pbuf at ??:?
0x4010164d: ets_timer_disarm at ??:?
0x40262fff: HttpClientConnection::onBody(char const*, unsigned int) at /home/come/luko/hardware/Sming-4.0.0/Sming/Wiring/Countable.h:23
0x40281ced: HttpConnection::staticOnBody(http_parser*, char const*, unsigned int) at /home/come/luko/hardware/Sming-4.0.0/Sming/Core/Network/Http/HttpConnection.cpp:157
0x40223e9d: etharp_send_ip at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/esp-open-lwip/esp-open-lwip/lwip/netif/etharp.c:630
0x40223eb4: etharp_send_ip at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/esp-open-lwip/esp-open-lwip/lwip/netif/etharp.c:630
0x40273058: http_parser_execute at /home/come/luko/hardware/Sming-4.0.0/Sming/Components/http-parser/http_parser.c:1999 (discriminator 1)
0x40104003: lmacRecycleMPDU at ??:?
0x40224112: etharp_output_to_arp_index at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/esp-open-lwip/esp-open-lwip/lwip/netif/etharp.c:890
0x40103abf: lmacProcessTxSuccess at ??:?
0x401038b8: lmacProcessTXStartData at ??:?
0x40224441: etharp_output at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/esp-open-lwip/esp-open-lwip/lwip/netif/etharp.c:995
0x40223c65: ip_output_if_opt at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/esp-open-lwip/esp-open-lwip/lwip/core/ipv4/ip.c:780
0x4023acf0: chip_v6_unset_chanfreq at ??:?
0x40101a9e: pvPortCalloc at ??:?
0x4023c728: __stdio_exit_needed at ??:?
0x40223c9a: ip_output_if at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/esp-open-lwip/esp-open-lwip/lwip/core/ipv4/ip.c:631
0x4021f5e6: pbuf_alloc at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/esp-open-lwip/esp-open-lwip/lwip/core/pbuf.c:909 (discriminator 3)
0x40260498: HttpConnection::onTcpReceive(TcpClient&, char*, int) at /home/come/luko/hardware/Sming-4.0.0/Sming/Core/Network/Http/HttpRequest.h:74
0x40106376: wDev_ProcessFiq at ??:?
0x40281d53: std::_Function_handler<bool (TcpClient&, char*, int), Delegate<bool (TcpClient&, char*, int)>::Delegate<HttpConnection>(bool (HttpConnection::*)(TcpClient&, char*, int), HttpConnection*)::{lambda(TcpClient&, char*, int)#1}>::_M_invoke(std::_Any_data const&, TcpClient&, char*, int) at /opt/esp-open-sdk/xtensa-lx106-elf/xtensa-lx106-elf/include/c++/4.8.5/functional:2058
0x4021f48a: pbuf_free at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/esp-open-lwip/esp-open-lwip/lwip/core/pbuf.c:909 (discriminator 3)
0x4025f1f3: std::function<bool (TcpClient&, char*, int)>::operator()(TcpClient&, char*, int) const at /opt/esp-open-sdk/xtensa-lx106-elf/xtensa-lx106-elf/include/c++/4.8.5/functional:2471
 (inlined by) TcpClient::onReceive(pbuf*) at /home/come/luko/hardware/Sming-4.0.0/Sming/Core/Network/TcpClient.cpp:110
0x4025eb36: TcpConnection::internalOnReceive(pbuf*, signed char) at /home/come/luko/hardware/Sming-4.0.0/Sming/Core/Network/TcpConnection.cpp:574
0x40235637: pp_attach at ??:?
0x4025ebd8: TcpConnection::initialize(tcp_pcb*)::{lambda(void*, tcp_pcb*, pbuf*, signed char)#2}::_FUN(void*, tcp_pcb*, pbuf*, signed char) at /home/come/luko/hardware/Sming-4.0.0/Sming/Core/Network/TcpConnection.cpp:342
 (inlined by) _FUN at /home/come/luko/hardware/Sming-4.0.0/Sming/Core/Network/TcpConnection.cpp:344
0x40227d78: tcp_input at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/esp-open-lwip/esp-open-lwip/lwip/core/tcp_in.c:394 (discriminator 1)
0x4010515e: pp_post at ??:?
0x40234743: ppTxPkt at ??:?
0x402238dd: ip_input at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/esp-open-lwip/esp-open-lwip/lwip/core/ipv4/ip.c:559
0x40224521: ethernet_input at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/esp-open-lwip/esp-open-lwip/lwip/netif/etharp.c:1379
0x40104931: lmacTxFrame at ??:?
0x40104ceb: ppEnqueueRxq at ??:?
0x402184d4: ets_timer_handler_isr at ??:?
0x40104ab8: ppProcessTxQ at ??:?
0x40104af4: ppProcessTxQ at ??:?
0x4021cccf: ets_snprintf at ??:?
Stack dump:
To decode the stack dump call from command line:
   make decode-stacktrace
and copy & paste the text enclosed in '===='.

================================================================
3ffffb40:  00000100 3fff7d78 3fffc718 0018d300  
3ffffb50:  0000056c 00000004 40102788 0000056c  
3ffffb60:  3fff7d78 00000000 3fff77e4 00000000  
3ffffb70:  3fffc718 3fff7d78 00000000 00000000  
3ffffb80:  0018d22c 40280418 0018d22c 0000056c  
3ffffb90:  00000570 0000018d 3fff77e4 00000030  
3ffffba0:  40228d54 00000030 00000000 ffffffff  
3ffffbb0:  00000002 00000000 00000020 4010515e  
3ffffbc0:  3ffecad2 401049cb 3fff01f0 00000580  
3ffffbd0:  00000001 40103b86 3ffec688 3fff77e4  
3ffffbe0:  0000056c 3fff7d78 0018d22c 4028051c  
3ffffbf0:  000001a1 00000030 00000008 00000002  
3ffffc00:  40103abf 00000022 00000002 00040000  
3ffffc10:  00002200 00000000 3fff6670 00000030  
3ffffc20:  401063b0 00080000 00000010 014e99fb  
3ffffc30:  00000000 4000444e 3fff0560 3fffc278  
3ffffc40:  00000002 4010246b 00000001 60000200  
3ffffc50:  00000000 4000410f 00001001 00000205  
3ffffc60:  3fffc718 40004a3c 0000018d 00000570  
3ffffc70:  3fffc718 40102700 0000018d 000000c8  
3ffffc80:  3ffe9d53 3fff7d68 00000075 3fff77e4  
3ffffc90:  0000018d 0000056c 3fff7d78 40228da9  
3ffffca0:  3ffee6ca 3fff77f0 00000000 00000030  
3ffffcb0:  3ffed350 00000000 40235637 00000001  
3ffffcc0:  3fff77e0 0000056e 3fff77d0 402672ee  
3ffffcd0:  40235686 3fff0240 3fff6670 00000001  
3ffffce0:  40235792 3fff0240 3fff6670 3fff0240  
3ffffcf0:  00000005 00000005 00000008 3fff6da8  
3ffffd00:  3ffecad2 40234743 3fff0240 3fff6a7b  
3ffffd10:  00000000 4020afab 3fff1430 3ffec688  
3ffffd20:  00000000 00000002 00000000 00000000  
3ffffd30:  3ffeec38 0000056e 3fff85e0 40262fff  
3ffffd40:  3fff6a48 00000000 3fff8690 40281ced  
3ffffd50:  40223e9d 40223eb4 3fff6dcf 40273058  
3ffffd60:  3ffec688 40106247 3fff0ce8 3fff219c  
3ffffd70:  00000002 00000003 3fff6a48 40224112  
3ffffd80:  3ffecad5 2c9f0300 4000050c 3fffc278  
3ffffd90:  40106098 3fffc200 00000022 3fff6d88  
3ffffda0:  3fff6d88 3fff6a54 3fff6a48 40224441  
3ffffdb0:  4025ea65 3fff889c 0000056e 00000000  
3ffffdc0:  0000000a 0000000a 0000000a 40223c65  
3ffffdd0:  4023acf0 3fff6df8 0000013d 40101a9e  
3ffffde0:  0e010700 1c031502 39052b04 48074106  
3ffffdf0:  3ffee6ca 4023c728 00000000 3ffee6ca  
3ffffe00:  00000000 00000000 00000000 00000001  
3ffffe10:  0000056e 00000000 3fff1430 00000016  
3ffffe20:  3fff0d94 3ffffed0 00000001 00000002  
3ffffe30:  4020e120 3ffffed0 3fff12dc 0000056e  
3ffffe40:  00000000 00000001 3fff85e0 40260498  
3ffffe50:  3fff8690 66664f5f 00656369 0000104a  
3ffffe60:  3fff8894 e3190000 3fff8870 3fff8664  
3ffffe70:  3fff6328 3fff6328 3fff85e0 40281d53  
3ffffe80:  40101793 086bc420 3fff6688 4025f1f3  
3ffffe90:  00000000 600011f0 3fff8870 3fff8870  
3ffffea0:  00000000 3fff6328 3fff85e0 4025eb36  
3ffffeb0:  00000001 ffffffc6 40235637 00000001  
3ffffec0:  3fff2264 3fff8870 3fff6328 4025ebd8  
3ffffed0:  3fff2264 3fff2260 3fff8894 40227d78  
3ffffee0:  3fff88ac 00000000 00000010 00000010  
3ffffef0:  00000000 0000056e 00000020 4010515e  
3fffff00:  3ffe0000 40234743 3fff6028 3fff49f8  
3fffff10:  3ffee6a2 3fff4a00 3fff6328 402238dd  
3fffff20:  3ffec140 3fff6a48 3fff6a48 00000000  
3fffff30:  00000000 06a20579 40234137 3fff6ac8  
3fffff40:  3fff6328 3fff6a48 3ffee694 40224521  
3fffff50:  00000068 00000001 40104931 3ffec640  
3fffff60:  3fff6028 40104ceb 00000001 00000000  
3fffff70:  402184d4 40104ab8 00000000 00000000  
3fffff80:  40104af4 00000000 00000002 3fff6ac8  
3fffff90:  3fffdc80 00000000 3fff6328 4021cccf  
3fffffa0:  40000f49 3fffdab0 ffffff00 40000f49  

================================================================
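
For readers unfamiliar with how a raw dump like the one above turns into the list of symbols at the top of this comment, here is a minimal Python sketch of the first step: scanning the dump for words that look like code addresses. It is not Sming's own decoding tool. The function name `candidate_addresses` and the address ranges in `CODE_RANGES` are assumptions made for illustration, and a word falling inside a code range is only a candidate return address, not proof of a real call frame.

```python
import re

# Assumed ESP8266 code address ranges (illustrative, not taken from the thread).
CODE_RANGES = [
    (0x40000000, 0x40010000),  # boot ROM
    (0x40100000, 0x40108000),  # IRAM
    (0x40200000, 0x40300000),  # memory-mapped flash
]

# A dump line looks like: "3ffffb50:  0000056c 00000004 40102788 0000056c"
LINE_RE = re.compile(r"^\s*([0-9a-fA-F]{8}):\s+((?:[0-9a-fA-F]{8}\s*)+)$")


def candidate_addresses(dump_text):
    """Return the stack words that fall inside one of the assumed code ranges."""
    found = []
    for line in dump_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip separators, blank lines and other text
        for word in match.group(2).split():
            value = int(word, 16)
            if any(lo <= value < hi for lo, hi in CODE_RANGES):
                found.append(value)
    return found


if __name__ == "__main__":
    sample = (
        "3ffffb40:  00000100 3fff7d78 3fffc718 0018d300  \n"
        "3ffffb50:  0000056c 00000004 40102788 0000056c  \n"
    )
    for address in candidate_addresses(sample):
        print(hex(address))
```

Each candidate can then be resolved against the firmware ELF file with a symbol lookup tool such as the toolchain's `addr2line`, which is presumably what produced the function names and source locations listed above.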

mikee47 commented 4 years ago

Can you reproduce the fault running in the debugger? The backtrace is more reliable.

cometurrata commented 4 years ago

I have never used the debugger; I will give it a try.

cometurrata commented 4 years ago

I got:

Program received signal SIGPWR, Power fail/restart.
0x40257606 in debug_crash_callback (rst_info=rst_info@entry=0x3ffff990, stack=stack@entry=1073740608, stack_end=stack_end@entry=1073741744)
    at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/gdbstub/appcode/gdb_hooks.cpp:98
98              gdbstub_break_internal(DBGFLAG_RESTART);

bt:

#0  0x40257606 in debug_crash_callback (rst_info=rst_info@entry=0x3ffff990, stack=stack@entry=1073740608, stack_end=stack_end@entry=1073741744)
at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/gdbstub/appcode/gdb_hooks.cpp:98
#1  0x40241016 in __wrap_system_restart_local () at /home/come/luko/hardware/Sming-4.0.0/Sming/Arch/Esp8266/Components/esp8266/crash_handler.c:66
#2  0x40104d06 in pp_soft_wdt_feed_local ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

cometurrata commented 4 years ago

Not a Sming issue. I believe it is not a software issue at all: my LDO is too weak, so the voltage drops and causes the flash writes to fail.