philbowles / PangolinMQTT

PangolinMQTT - ArduinoIDE client library for ESP8266, ESP32 and STM32-NUCLEO
Other
71 stars 21 forks source link

Quickstart_P crashes: WDT soft reset with "long" plain text payload #14

Open luebbe opened 4 years ago

luebbe commented 4 years ago

Steps to reproduce:

The last messages are:

[0x3FFF0A56] 0x00000000: 53 61 76 65 20 74 68 65 20 50 61 6E 67 6F 6C 69   Save the Pangoli
[0x3FFF0A66] 0x00000010: 6E 21 00                                          n!.
----> SEND PUBCOMP 4 bytes on wire
TXQ=1 TCP ACK LENGTH=4
TXQ=0 TCP frag ACK LENGTH=4 acksize=69 amtleft=4
<---- FROM WIRE PUBLISH 3FFF083A len=536
<---- FROM WIRE PUBREL 3FFF0CC2 len=536
<---- FROM WIRE PUBREL 3FFF0CC2 len=536
<---- FROM WIRE PUBREL 3FFF0CC2 len=362
<---- FROM WIRE PUBREL 3FFF0CC2 len=536
<---- FROM WIRE PUBCOMP 3FFF0CC2 len=536
<---- FROM WIRE CONNACK 3FFF0CC2 len=536
<---- FROM WIRE CONNACK 3FFF0CC2 len=536
<---- FROM WIRE PUBCOMP 3FFF0CC2 len=536
<---- FROM WIRE PUBREL 3FFF0CC2 len=536
<---- FROM WIRE PUBCOMP 3FFF0CC2 len=536
<---- FROM WIRE PUBREL 3FFF0852 len=365
_handlePublish test id=0 @ QoS0 R=false DUP=0 PL@3FFF274D PLEN=6078
T=9418 Message test qos0 dup=0 retain=0 len=6078 elapsed=9137
Address: 0x3FFF274D len: 0x17BE (6078)
[0x3FFF274D] 0x00000000: 4C 6F 72 65 6D 20 69 70 73 75 6D 20 64 6F 6C 6F   Lorem ipsum dolo
[0x3FFF275D] 0x00000010: 72 20 73 69 74 20 61 6D 65 74 2C 20 63 6F 6E 73   r sit amet, cons
[0x3FFF276D] 0x00000020: 65 74 65 74 75 72 20 73 61 64 69 70 73 63 69 6E   etetur sadipscin

...

[0x3FFF306D] 0x00000920: 67 75 62 65 72 67 72 65 6E 2C 20 6E 6F 20 73 65   gubergren, no se
[0x3FFF307D] 0x00000930: 61 20 74 61 6B 69 6D 61 74 61 20 73 61 6E 63 74   a takimata sanct
[0x3FFF308D] 0x00000940: 75 73 20 65 73 74 20 4C 6F 72 65 6D 20 69 70 7
Soft WDT reset

>>>stack>>>

The decoded stack trace is:

0x4020828d in uart_write at ??:?
0x4020c9f8 in vsnprintf at /home/earle/src/esp-quick-toolchain/repo/newlib/newlib/libc/stdio/vsnprintf.c:42
0x40206774 in HardwareSerial::write(unsigned char const*, unsigned int) at ??:?
0x40206ae8 in Print::printf(char const*, ...) at ??:?
0x40206ae8 in Print::printf(char const*, ...) at ??:?
0x4020828d in uart_write at ??:?
0x40206774 in HardwareSerial::write(unsigned char const*, unsigned int) at ??:?
0x40201177 in PANGO::dumphex(unsigned char*, unsigned int, unsigned char) at ??:?
0x40204322 in onMqttMessage(char const*, unsigned char*, PANGO_PROPS, unsigned int, unsigned int, unsigned int) at ??:?
0x4020918e in std::_Function_handler<void (char const*, unsigned char*, PANGO_PROPS, unsigned int, unsigned int, unsigned int), void (*)(char const*, unsigned char*, PANGO_PROPS, unsigned int, unsigned int, unsigned int)>::_M_invoke(std::_Any_data const&, char const*, unsigned char*, PANGO_PROPS, unsigned int, unsigned int, unsigned int) at ??:?
0x402090a5 in PangolinMQTT::_hpDespatch(mb) at ??:?
0x402010dc in mb::mb(mb const&) at ??:?
0x40203adb in PangolinMQTT::_handlePublish(mb) at ??:?
0x401001d8 in ets_post at ??:?
0x402081c8 in uart_do_write_char at uart.cpp:?
0x40240550 in sleep_reset_analog_rtcreg_8266 at ??:?
0x4020fbd8 in std::string::_Rep::_S_create(unsigned int, unsigned int, std::allocator<char> const&) at /workdir/arena.x86_64/gcc/xtensa-lx106-elf/libstdc++-v3/include/bits/basic_string.tcc:621
0x401009ee in malloc at ??:?
0x4021edbf in ieee80211_output_pbuf at ??:?
0x40207260 in operator new(unsigned int) at ??:?
0x4020fbd8 in std::string::_Rep::_S_create(unsigned int, unsigned int, std::allocator<char> const&) at /workdir/arena.x86_64/gcc/xtensa-lx106-elf/libstdc++-v3/include/bits/basic_string.tcc:621
0x4020fca0 in std::string::_M_mutate(unsigned int, unsigned int, unsigned int) at /workdir/arena.x86_64/gcc/xtensa-lx106-elf/libstdc++-v3/include/bits/basic_string.h:299 (discriminator 1)
 (inlined by) std::string::_M_mutate(unsigned int, unsigned int, unsigned int) at /workdir/arena.x86_64/gcc/xtensa-lx106-elf/libstdc++-v3/include/bits/basic_string.tcc:490 (discriminator 1)
0x4020fe98 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) at /workdir/arena.x86_64/gcc/xtensa-lx106-elf/libstdc++-v3/include/bits/basic_string.tcc:176
0x402010dc in mb::mb(mb const&) at ??:?
0x40203f3f in PangolinMQTT::_handlePacket(mb) at ??:?
0x40204d5f in mb::mb(unsigned char*, bool) at ??:?
0x40204cf9 in mb::mb(unsigned char*, bool) at ??:?
0x402040a7 in PangolinMQTT::_packetReassembler(mb) at ??:?
0x401025af in wDev_ProcessFiq at ??:?
0x40219a00 in ip4_output_if_opt at /local/users/gauchard/arduino/arduino_esp8266/esp8266-lwip/tools/sdk/lwip2/builder/lwip2-src/src/core/ipv4/ip4.c:1577
0x402041d7 in PangolinMQTT::_onData(unsigned char*, unsigned int) at ??:?
0x40218e00 in ip4_input at /local/users/gauchard/arduino/arduino_esp8266/esp8266-lwip/tools/sdk/lwip2/builder/lwip2-src/src/core/ipv4/ip4.c:1214
0x40204204 in std::_Function_handler<void (void*, AsyncClient*, void*, unsigned int), PangolinMQTT::connect()::{lambda(void*, AsyncClient*, void*, unsigned int)#5}>::_M_invoke(std::_Any_data const&, void*, AsyncClient*, void*, unsigned int) at PangolinMQTT.cpp:?
0x40205f6e in AsyncClient::_recv(std::shared_ptr<ACErrorTracker>&, tcp_pcb*, pbuf*, long) at ??:?
0x402159a6 in tcp_output at /local/users/gauchard/arduino/arduino_esp8266/esp8266-lwip/tools/sdk/lwip2/builder/lwip2-src/src/core/tcp_out.c:1319
0x40206226 in AsyncClient::_s_recv(void*, tcp_pcb*, pbuf*, long) at ??:?
0x40213cd0 in tcp_input at /local/users/gauchard/arduino/arduino_esp8266/esp8266-lwip/tools/sdk/lwip2/builder/lwip2-src/src/core/tcp_in.c:501 (discriminator 1)
0x401009ee in malloc at ??:?
0x40219756 in ip4_input at /local/users/gauchard/arduino/arduino_esp8266/esp8266-lwip/tools/sdk/lwip2/builder/lwip2-src/src/core/ipv4/ip4.c:1467
0x401009b7 in free at ??:?
0x4022b988 in ppRecycleRxPkt at ??:?
0x4021093d in ethernet_input_LWIP2 at /local/users/gauchard/arduino/arduino_esp8266/esp8266-lwip/tools/sdk/lwip2/builder/lwip2-src/src/netif/ethernet.c:188
0x4021075c in esp2glue_ethernet_input at /local/users/gauchard/arduino/arduino_esp8266/esp8266-lwip/tools/sdk/lwip2/builder/glue-lwip/lwip-git.c:469
0x40235e2e in ethernet_input at /local/users/gauchard/arduino/arduino_esp8266/esp8266-lwip/tools/sdk/lwip2/builder/glue-esp/lwip-esp.c:365
0x40235e3f in ethernet_input at /local/users/gauchard/arduino/arduino_esp8266/esp8266-lwip/tools/sdk/lwip2/builder/glue-esp/lwip-esp.c:373
0x4022b486 in ppPeocessRxPktHdr at ??:?
0x40231647 in ets_snprintf at ??:?
0x4010483d in call_user_start_local at ??:?
0x40104843 in call_user_start_local at ??:?
0x4010000d in call_user_start at ??:?
0x40240000 in sleep_reset_analog_rtcreg_8266 at ??:?
0x4022c70f in pp_attach at ??:?
0x4022c75e in pp_attach at ??:?
0x401015df in ppCalFrameTimes at ??:?
0x4022383c in cnx_handshake_timeout at ??:?
0x4022b81b in ppTxPkt at ??:?
0x4021f7eb in ieee80211_send_probereq at ??:?
0x4022162b in scan_remove_probe_ssid at ??:?
0x402211fc in scan_start at ??:?
0x4022385c in cnx_start_handoff_cb at ??:?
0x40105121 in ets_timer_arm_new at ??:?
0x401001d8 in ets_post at ??:?
0x4022385c in cnx_start_handoff_cb at ??:?
0x401014a0 in pp_post at ??:?
0x40104757 in lmacRxDone at ??:?
0x401001d8 in ets_post at ??:?
0x4010203b in rcReachRetryLimit at ??:?
0x401001d8 in ets_post at ??:?
0x4010221c in rcReachRetryLimit at ??:?
0x401001d8 in ets_post at ??:?
0x401026de in wDev_ProcessFiq at ??:?
0x401026de in wDev_ProcessFiq at ??:?
0x40102400 in wDev_ProcessFiq at ??:?
0x401001d8 in ets_post at ??:?
0x40208a35 in run_scheduled_recurrent_functions() at ??:?
0x402074c8 in loop_end at ??:?
0x40206768 in HardwareSerial::write(unsigned char const*, unsigned int) at ??:?
0x40206774 in HardwareSerial::write(unsigned char const*, unsigned int) at ??:?
0x40206700 in EspClass::getChipId() at ??:?
0x4020426e in connectToWifi() at ??:?
0x40204818 in setup at ??:?
0x401001d8 in ets_post at ??:?
0x401001f9 in esp_schedule at ??:?
0x40207501 in loop_wrapper() at core_esp8266_main.cpp:?

Probably it's just the PANGO::dumphex(payload,len,16); call needing too much time, because when I comment it out, the WDT reset doesn't occur.

philbowles commented 4 years ago

There are two places where wdtFeed is used to avoid this problem. Your analysis is probably correct in that dumphex takes a long time to run and blocks WiFi causing a WDT reset - the "classic ESP8266 problem". It is precisley why I took the exceptional step of "feeding the watchdog"

I thought I had solved this with the wdtFeed, obviously I will need to look into it again. Thanks for your diligence and support.