Closed wilberforce closed 4 years ago
Glad you are enjoying our WebThing module. I'll be interested to learn why it is crashing for you. We run it on an ESP8266 on devices at our office for days (and weeks) and it is solid.
To use full gdb you need a JTAG board. That's a major undertaking. Fortunately, there's a small built-in gdb stub in the ESP IDF that is good enough to display stack traces and stack frames. To use that, (counterintuitively) run a release build with mcconfig:
mcconfig -m -p esp
If there's a native crash you should land in gdb.
> (counterintuitively) run a release build with mcconfig:
Thanks - yes - it is odd!
Of course, with the non-debug build it was solid as a rock.
I then discovered addr2line.
Then, after changing the folder to:
root@office-pc:/mnt/c/Users/rhys/Projects/moddable/build/tmp/esp32/debug/idf#
addr2line -e xs_esp32.elf 0x4008b1dc:0x3ffbd280 0x4008b377:0x3ffbd2a0 0x4008abcd:0x3ffbd2c0 0x400822de:0x3ffbd2e0 0x40082979:0x3ffbd300 0x4000bec7:0x3ffbd320 0x400f5fb8:0x3ffbd340 0x400f4f6e:0x3ffbd360 0x400f5744:0x3ffbd380 0x400d301b:0x3ffbd3a0 0x400d1fa1:0x3ffbd3d0
C:\Users\rhys\Projects\moddable\build\tmp\esp32\debug\idf\esp32/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/esp32/panic.c:676
C:\Users\rhys\Projects\moddable\build\tmp\esp32\debug\idf\esp32/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/esp32/panic.c:676
C:\Users\rhys\Projects\moddable\build\tmp\esp32\debug\idf\heap/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/heap/multi_heap.c:377
C:\Users\rhys\Projects\moddable\build\tmp\esp32\debug\idf\heap/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/heap/heap_caps.c:132
C:\Users\rhys\Projects\moddable\build\tmp\esp32\debug\idf\newlib/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/newlib/syscalls.c:42
??:0
C:\Users\rhys\source\repos\webthing-led/C:\Users\rhys\Projects\moddable\modules\network\socket\lwip/modSocket.c:1108
C:\Users\rhys\source\repos\webthing-led/C:\Users\rhys\Projects\moddable\modules\network\socket\lwip/modSocket.c:1108
C:\Users\rhys\source\repos\webthing-led/C:\Users\rhys\Projects\moddable\modules\network\socket\lwip/modSocket.c:1108
C:\Users\rhys\Projects\moddable\examples\drivers\onewire/C:\Users\rhys\Projects\moddable\xs\platforms\esp/xsHost.c:1072
C:\Users\rhys\Projects\moddable\build\tmp\esp32\debug\idf\main/C:/Users/rhys/Projects/moddable/build/tmp/esp32/debug/xsProj/main/main.c:200
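Each entry in the backtrace passed to addr2line above is a PC:SP pair, and only the program counter half maps to a source line. Here is a minimal sketch in C of pulling just the PCs out of such a line. `backtrace_pcs` is a hypothetical helper for illustration, not part of Moddable or the ESP IDF:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies the PC half of every "0xPC:0xSP" pair found
   in `line` into out[] (at most `max` entries) and returns how many were
   stored. Text without a "0x" prefix, such as a leading "Backtrace:" label,
   is skipped. */
size_t backtrace_pcs(const char *line, unsigned long *out, size_t max)
{
    size_t n = 0;
    const char *p = line;

    while (n < max && (p = strstr(p, "0x")) != NULL) {
        char *end;

        out[n++] = strtoul(p, &end, 16);   /* program counter */
        p = end;
        if (*p == ':') {                   /* skip the stack pointer half */
            strtoul(p + 1, &end, 16);
            p = end;
        }
    }
    return n;
}
```

The resulting addresses can be passed to `addr2line -e xs_esp32.elf`. Adding GNU addr2line's `-f` option also prints the function name for each address, which makes the output easier to read.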
\lwip/modSocket.c:1108
xsmcSetBoolean(xsResult, xss->suspended);
esp/xsHost.c:1072 last line here:
uint8_t ESP_isReadable() {
size_t s;
uart_get_buffered_data_len(USE_UART, &s);
return s > 0;
}
The stack trace does look a bit odd, so it might be a red herring?
I thought I'd try an instrumented but non-debug build:
mcconfig -i -m -p esp32 ssid=xxx password=1utyu5e
# cc xsHost.o (strings in flash)
C:\Users\rhys\Projects\moddable\xs\platforms\esp\xsHost.c: In function 'espDebugBreak':
C:\Users\rhys\Projects\moddable\xs\platforms\esp\xsHost.c:1167:6: error: 'txMachine {aka struct sxMachine}' has no member named 'DEBUG_LOOP'
the->DEBUG_LOOP = 1;
^
C:\Users\rhys\Projects\moddable\xs\platforms\esp\xsHost.c:1173:6: error: 'txMachine {aka struct sxMachine}' has no member named 'DEBUG_LOOP'
the->DEBUG_LOOP = 0;
^
Looks like `DEBUG_LOOP` needs to be inside an `#ifdef`?
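For illustration, the kind of guard meant here might look like the sketch below. It uses a stand-in struct rather than the real `txMachine`, and it assumes the debug-only macro is called `mxDebug`; that name is an assumption and should be checked against the XS headers. `DemoMachine` and `demoDebugBreak` are hypothetical names.

```c
#include <stdint.h>

/* Stand-in for txMachine; the real struct has many more fields. */
typedef struct {
    int id;
#ifdef mxDebug                 /* assumed name of the debug-build macro */
    uint8_t DEBUG_LOOP;        /* field exists only in debug builds */
#endif
} DemoMachine;

/* Touches DEBUG_LOOP only when the field is compiled in.
   Returns 1 if the flag was updated, 0 if the code was compiled out. */
int demoDebugBreak(DemoMachine *the, uint8_t stop)
{
#ifdef mxDebug
    the->DEBUG_LOOP = stop ? 1 : 0;
    return 1;
#else
    (void)the;                 /* unused in non-debug builds */
    (void)stop;
    return 0;
#endif
}
```

With a guard like this, an instrumented release build never references the missing member, so the compile error goes away.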
Yes, sure, we should fix the instrumented non-debug build. But I don't think enabling instrumentation on a release build is going to reveal much here.
Your stack trace doesn't reveal much -- addr2line is useful to a point, but applying it to stack dumps tends to also include a bunch of items that aren't really in the stack frame.
It does look like the heap is likely corrupt. In your test app, did you change any native code or just script?
> It does look like the heap is likely corrupt. In your test app, did you change any native code or just script?
It is script.
https://gist.github.com/wilberforce/dfbc330386fa46a763ac52172b2f9b8b
It is solid running standalone, but once the gateway is started it crashes after about 20 minutes.
I just tried your app and got the same result--it works fine standalone, but crashes at some point if the gateway is running. My stack trace is a little more informative:
0x4008df83: invoke_abort at /Users/lprader/esp32/esp-idf/components/esp32/panic.c:649
0x4008dfaf: abort at /Users/lprader/esp32/esp-idf/components/esp32/panic.c:649
0x4008d7cf: multi_heap_assert at /Users/lprader/esp32/esp-idf/components/heap/multi_heap.c:696
0x4008dbd9: multi_heap_free_impl at /Users/lprader/esp32/esp-idf/components/heap/multi_heap.c:696
0x40081fb5: heap_caps_free at /Users/lprader/esp32/esp-idf/components/heap/heap_caps.c:123
0x4008260d: _free_r at /Users/lprader/esp32/esp-idf/components/newlib/syscalls.c:42
0x400efa2a: xs_socket_destructor at /Users/lprader/moddable/modules/network/socket/lwip/modSocket.c:1108
0x400eea46: socketDownUseCount at /Users/lprader/moddable/modules/network/socket/lwip/modSocket.c:1108
0x400ef1f2: socketClearPending at /Users/lprader/moddable/modules/network/socket/lwip/modSocket.c:1108
0x400d2e6b: modMessageService at /Users/lprader/moddable/xs/platforms/esp/xsHost.c:1072
0x400d2051: loop_task at /Users/lprader/moddable/build/tmp/esp32/release/xsProj/main/main.c:200
We'll keep looking at this to see if we can figure out what's going wrong. I've had issues before where the gateway spams the device with HTTP requests (because it polls the device to keep track of its properties) and it ends up looking like a DoS attack. Peter did some work to avoid crashes when that happened, but it may need another look.
@lprader Thanks. If you scroll to the right of my trace, it shows pretty much the same thing, so it's good to know addr2line is working.
This sample might be useful as an example, as it requires less hardware (e.g. no screen). I could combine it with the onewire temp sensor driver I'm working on to make a temperature sensor with WebThing, if you are interested.
Whoops, yes I missed that.
And yes, definitely interested. It would be great to have a WebThing example or two without a screen.
Potentially this could be related to the length of some of the mDNS packets. I have two Apple TVs on the network as well as Apple phones. These devices push out a heap of mDNS packets, and some are quite long.
The trace @lprader posted suggests the problem is related to disposing TCP sockets. I found that an earlier fix had been lost, which caused lwip sockets disconnected by the remote host to be disposed twice. That fix has been put back, along with a few other changes. That appears to eliminate the crash @lprader noted above.
Excellent! I'll try this once I've got the onewire stuff sorted out.
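To illustrate the class of bug (this is not the actual modSocket.c fix, which isn't shown in this thread), here is a minimal sketch of a dispose routine that is safe to call twice. All names (`DemoSocket`, `demoSocketNew`, `demoSocketDispose`) are hypothetical:

```c
#include <stdlib.h>

/* Hypothetical socket record; the real one in modSocket.c differs. */
typedef struct {
    int   disposed;   /* set once resources have been released */
    char *buffer;     /* heap memory owned by the socket */
} DemoSocket;

DemoSocket *demoSocketNew(void)
{
    DemoSocket *s = calloc(1, sizeof(DemoSocket));
    if (!s)
        return NULL;
    s->buffer = malloc(64);
    if (!s->buffer) {
        free(s);
        return NULL;
    }
    return s;
}

/* Releases the buffer exactly once. Returns 1 if this call did the
   release, 0 if the socket was NULL or already disposed. */
int demoSocketDispose(DemoSocket *s)
{
    if (!s || s->disposed)
        return 0;
    s->disposed = 1;
    free(s->buffer);
    s->buffer = NULL;   /* a second free() of this pointer would corrupt the heap */
    return 1;
}
```

Without the `disposed` check, a second call (for example, one from a remote disconnect and one from the script closing the socket) would free the same block twice. That is the kind of error the heap assertion in the trace above catches.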
@phoddie
After updating to the new 3.2.2 (?) idf, I'm getting the same type of crash again.
root@office-pc:/mnt/c/Users/rhys/Projects/moddable/build/tmp/esp32/m5stick_c/debug/idf# addr2line -e xs_esp32.elf 0x4008c7a0:0x3ffbd060 0x4008c9a9:0x3ffbd080 0x4008c0a3:0x3ffbd0a0 0x400822c4:0x3ffbd0c0 0x400822f9:0x3ffbd0e0 0x40083095:0x3ffbd100 0x4000beaf:0x3ffbd120 0x40165afc:0x3ffbd140 0x4016580f:0x3ffbd160 0x4016587d:0x3ffbd180 0x4015d3ef:0x3ffbd1a0 0x400fc8ac:0x3ffbd1c0 0x400fe677:0x3ffbd1e0 0x400ea2e2:0x3ffbd210 0x400d4cfd:0x3ffbd270 0x400d4e93:0x3ffbd290 0x400fce5e:0x3ffbd2b0 0x400fd7f2:0x3ffbd350 0x400d2b77:0x3ffbd370 0x400d1be9:0x3ffbd3a0 0x4008a772:0x3ffbd3c0
C:\Users\rhys\Projects\moddable\build\tmp\esp32\m5stick_c\debug\idf\esp32/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/esp32/panic.c:707
C:\Users\rhys\Projects\moddable\build\tmp\esp32\m5stick_c\debug\idf\esp32/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/esp32/panic.c:707
C:\Users\rhys\Projects\moddable\build\tmp\esp32\m5stick_c\debug\idf\heap/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/heap/multi_heap.c:380
C:\Users\rhys\Projects\moddable\build\tmp\esp32\m5stick_c\debug\idf\heap/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/heap/heap_caps.c:131
C:\Users\rhys\Projects\moddable\build\tmp\esp32\m5stick_c\debug\idf\heap/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/heap/heap_caps.c:131
C:\Users\rhys\Projects\moddable\build\tmp\esp32\m5stick_c\debug\idf\newlib/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/newlib/syscalls.c:37
??:0
C:\Users\rhys\Projects\moddable\build\tmp\esp32\m5stick_c\debug\idf\lwip/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/lwip/lwip/src/core/mem.c:124
C:\Users\rhys\Projects\moddable\build\tmp\esp32\m5stick_c\debug\idf\lwip/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/lwip/lwip/src/core/memp.c:231
C:\Users\rhys\Projects\moddable\build\tmp\esp32\m5stick_c\debug\idf\lwip/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/lwip/lwip/src/core/memp.c:231
C:\Users\rhys\Projects\moddable\build\tmp\esp32\m5stick_c\debug\idf\lwip/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/lwip/lwip/src/api/tcpip.c:483
C:\Users\rhys\source\repos\www-m5-mash/C:\Users\rhys\Projects\moddable\modules\network\socket\lwip/modLwipSafe.c:86
C:\Users\rhys\source\repos\www-m5-mash/C:\Users\rhys\Projects\moddable\modules\network\socket\lwip/modSocket.c:798
C:\Users\rhys\Projects\moddable\examples\drivers\m5stickc-pedometer/C:\Users\rhys\Projects\moddable\xs\sources/xsRun.c:734 (discriminator 2)
C:\Users\rhys\Projects\moddable\examples\drivers\m5stickc-pedometer/C:\Users\rhys\Projects\moddable\xs\sources/xsAPI.c:428
C:\Users\rhys\Projects\moddable\examples\drivers\m5stickc-pedometer/C:\Users\rhys\Projects\moddable\xs\sources/xsAPI.c:428
C:\Users\rhys\source\repos\www-m5-mash/C:\Users\rhys\Projects\moddable\modules\network\socket\lwip/modSocket.c:798
C:\Users\rhys\source\repos\www-m5-mash/C:\Users\rhys\Projects\moddable\modules\network\socket\lwip/modSocket.c:798
C:\Users\rhys\Projects\moddable\examples\drivers\m5stickc-pedometer/C:\Users\rhys\Projects\moddable\xs\platforms\esp/xsHost.c:1078
C:\Users\rhys\Projects\moddable\build\tmp\esp32\m5stick_c\debug\idf\main/C:/Users/rhys/Projects/moddable/build/tmp/esp32/m5stick_c/debug/xsProj/main/main.c:200
C:\Users\rhys\Projects\moddable\build\tmp\esp32\m5stick_c\debug\idf\freertos/C:/Users/rhys/msys32/home/rhys/esp/esp-idf/components/freertos/port.c:403
Any ideas? Thanks
That's no good. I'm not sure it is the same crash -- the previous one was closing the socket, but this appears to be when freeing an lwip buffer.
I'm happy to take a look to try to sort out what is going on. It looks like you are running a modified version of the m5stickc-pedometer example, since the original doesn't have any networking. How can I reproduce this crash?
Hi - I've taken the WebThing sample and got it working without a display on the ESP32. I added an LED and, using the built-in switch, got it working with the Moz IoT gateway. This stuff is very cool - thanks for writing the library.
I have set it up so it toggles on the physical button as well as from the gateway.
However, in the mingw32 window I get a dump after a while:
How can I launch it so that a gdb session can debug where the crash is occurring?
Thanks!