Open a2800276 opened 2 years ago
Thanks for the workaround; the problem still appears to be there. Out of curiosity, what led to your statement "It seems that part of the problem concerns the Python interpreter changing the baud rate... maybe"? Also, I'm not sure how to process the above core panics. Are there symbols somewhere, or should one rebuild the firmware on one's own to get them?
The core panics were provided for the benefit of developers attempting to fix the bug; you won't need them if you are using the workaround. To work with them you would need the identical build of the firmware. The IDF tools provide scripts to generate more informative backtraces from the individual stack addresses.
The baud rate thing was an observation, but that was half a year ago and I no longer remember the specifics. Are you working on fixing the bug or just curious?
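For anyone who wants to try this by hand, here is a minimal sketch of pulling the code addresses out of a `Backtrace:` line and turning them into an addr2line command line. It assumes each frame is printed as `PC:SP` with eight hex digits per value, that `xtensa-esp32-elf-addr2line` is on your PATH, and that `build/badgepython.elf` is an ELF matching the flashed build; the helper names are made up for illustration. It only builds the command and does not run it.

```python
import re

# One frame is "PC:SP"; both values are assumed to be 0x plus exactly 8 hex digits.
# Fixed-width matching also copes with dumps where two frames are printed
# without a space between them.
FRAME_RE = re.compile(r"0x([0-9a-fA-F]{8}):0x[0-9a-fA-F]{8}")


def backtrace_pcs(line):
    """Return the program-counter addresses found in a 'Backtrace:' line."""
    return ["0x" + pc.lower() for pc in FRAME_RE.findall(line)]


def addr2line_command(line, elf_path, tool="xtensa-esp32-elf-addr2line"):
    """Build (but do not run) an addr2line invocation for the backtrace."""
    return [tool, "-f", "-e", elf_path, *backtrace_pcs(line)]


if __name__ == "__main__":
    example = (
        "Backtrace:0x40085172:0x3ffbe9700x40084b89:0x3ffbe990 "
        "0x400d7fdb:0x3ffb27e0 0x401d511b:0x3ffb2800"
    )
    print(" ".join(addr2line_command(example, "build/badgepython.elf")))
```

The resulting list can be passed to `subprocess.run` once the matching ELF file is available; without the identical build the reported source lines would not be trustworthy.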
I was just curious - my main goal was more around using Python this time around. But fixing the core issue around that fault is some yak I could follow for a while if not shave.
PS: I should say I observed similar panics but I didn't compare the instruction pointers to see if they were the same. Since there might have been a couple of badgepython releases since then it might have changed and still be the same bug.
PPS:
Guru Meditation Error: Core 0 panic'ed (Interrupt wdt timeout on CPU0).
Core 0 register dump:
PC : 0x40085175 PS : 0x00050035 A0 : 0x40084b8c A1 : 0x3ffbe970
A2 : 0x3ff40000 A3 : 0x00000001 A4 : 0x800d7fde A5 : 0x4009090c
A6 : 0x00000000 A7 : 0xa6000000 A8 : 0x00000001 A9 : 0x00000011
A10 : 0x00000050 A11 : 0x3ffc3718 A12 : 0x00000010 A13 : 0x00000000
A14 : 0x3ffb28a0 A15 : 0x80000001 SAR : 0x0000001f EXCCAUSE: 0x00000005
EXCVADDR: 0x00000000 LBEG : 0x4000c2e0 LEND : 0x4000c2f6 LCOUNT : 0xffffffff
Core 0 was running in ISR context:
EPC1 : 0x400d3f03 EPC2 : 0x400d7fde EPC3 : 0x00000000 EPC4 : 0x00000000
Backtrace:0x40085172:0x3ffbe9700x40084b89:0x3ffbe990 0x400d7fdb:0x3ffb27e0 0x401d511b:0x3ffb2800
Core 1 register dump:
PC : 0x401cc28e PS : 0x00060635 A0 : 0x800d3ae8 A1 : 0x3ffb3e40
A2 : 0x00000000 A3 : 0x00060023 A4 : 0x00060023 A5 : 0x3ffb02f0
A6 : 0x007befa8 A7 : 0x003fffff A8 : 0x801446ba A9 : 0x3ffb3e10
A10 : 0x00000000 A11 : 0x00000001 A12 : 0x80093e9d A13 : 0x3ffb02e0
A14 : 0x00000003 A15 : 0x00060023 SAR : 0x00000000 EXCCAUSE: 0x00000005
EXCVADDR: 0x00000000 LBEG : 0x00000000 LEND : 0x00000000 LCOUNT : 0x00000000
Backtrace:0x401cc28b:0x3ffb3e400x400d3ae5:0x3ffb3e60 0x40091d48:0x3ffb3e80
And
Core 0 register dump:
PC : 0x40084b36 PS : 0x00050035 A0 : 0x400d7fde A1 : 0x3ffbe990
A2 : 0x840e324c A3 : 0x00058040 A4 : 0x000637ff A5 : 0x3ffbe970
A6 : 0x3ff40000 A7 : 0x3ffbf074 A8 : 0x800d7fde A9 : 0x4009090c
A10 : 0x00000000 A11 : 0xa6000000 A12 : 0x00000000 A13 : 0x0000001c
A14 : 0x00000021 A15 : 0x3ffc3723 SAR : 0x0000001f EXCCAUSE: 0x00000005
EXCVADDR: 0x00000000 LBEG : 0x4000c2e0 LEND : 0x4000c2f6 LCOUNT : 0xffffffff
Core 0 was running in ISR context:
EPC1 : 0x400d3f03 EPC2 : 0x400d7fde EPC3 : 0x00000000 EPC4 : 0x00000000
Backtrace:0x40084b33:0x3ffbe9900x400d7fdb:0x3ffb27e0 0x401d511b:0x3ffb2800
Core 1 register dump:
PC : 0x401cc28e PS : 0x00060635 A0 : 0x800d3ae8 A1 : 0x3ffb3e40
A2 : 0x00000000 A3 : 0x00060023 A4 : 0x00060023 A5 : 0x3ffb02f0
A6 : 0x007befa8 A7 : 0x003fffff A8 : 0x801446ba A9 : 0x3ffb3e10
A10 : 0x00000000 A11 : 0x00000001 A12 : 0x80093e9d A13 : 0x3ffb02e0
A14 : 0x00000003 A15 : 0x00060023 SAR : 0x00000000 EXCCAUSE: 0x00000005
EXCVADDR: 0x00000000 LBEG : 0x00000000 LEND : 0x00000000 LCOUNT : 0x00000000
Backtrace:0x401cc28b:0x3ffb3e400x400d3ae5:0x3ffb3e60 0x40091d48:0x3ffb3e80
So the PC keeps changing (something something wrote at the wrong place in memory, I suppose, which is the starting point for some interesting stories), but the backtrace in your diagnostic data and in my two dumps is identical: the glitch is still here and quite simple to reproduce. Follow the wiki instructions: start badgepython, open a serial console, boom.
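A rough way to check how much two dumps have in common is to compare their backtraces from the outermost caller inwards. The sketch below does that for the Core 0 backtraces of the two dumps in this comment; it assumes frames are listed innermost first and printed as `PC:SP` with eight hex digits each, and the function names are hypothetical.

```python
import re

# Assumed frame format: "PC:SP", each value 0x plus exactly 8 hex digits.
FRAME_RE = re.compile(r"0x([0-9a-fA-F]{8}):0x[0-9a-fA-F]{8}")


def pcs(line):
    """Program counters from a 'Backtrace:' line, innermost frame first."""
    return [int(pc, 16) for pc in FRAME_RE.findall(line)]


def shared_outer_frames(a, b):
    """Longest common suffix of two PC lists (the outermost callers)."""
    shared = []
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        shared.append(x)
    return list(reversed(shared))


first = pcs("Backtrace:0x40085172:0x3ffbe9700x40084b89:0x3ffbe990 "
            "0x400d7fdb:0x3ffb27e0 0x401d511b:0x3ffb2800")
second = pcs("Backtrace:0x40084b33:0x3ffbe9900x400d7fdb:0x3ffb27e0 "
             "0x401d511b:0x3ffb2800")
print([hex(pc) for pc in shared_outer_frames(first, second)])
# prints ['0x400d7fdb', '0x401d511b']
```

Addresses are only comparable between dumps taken from the same firmware build, so this says nothing about dumps from a different release.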
Appears to be caused by garbage data being sent from the RP2040 to the ESP32 when a terminal program opens the USB CDC interface.
Although even when receiving garbage data BadgePython should not crash, so while the trigger is external there is more bug hunting to be done here.
Analyzing the backtrace I get when causing the crash using addr2line on a development build gives:
~/.espressif/tools/xtensa-esp32-elf/esp-2021r2-patch5-8.4.0/xtensa-esp32-elf/bin/xtensa-esp32-elf-addr2line -fe build/badgepython.elf 0x400852bf:0x3ffbe980 0x40084c85:0x3ffbe9a0 0x400d7edf:0x3ffb2370 0x401d5ff3:0x3ffb2390
uart_irq_handler
/home/renze/Documents/Badge.team/MCH2022/badgePython/components/micropython/micropython/ports/esp32/uart.c:52
_xt_lowint1
/home/renze/Documents/Badge.team/MCH2022/badgePython/esp-idf/components/freertos/port/xtensa/xtensa_vectors.S:1114
app_main
/home/renze/Documents/Badge.team/MCH2022/badgePython/main/main.c:95
main_task
/home/renze/Documents/Badge.team/MCH2022/badgePython/esp-idf/components/freertos/port/port_common.c:141
[badgepython..zip](https://github.com/badgeteam/badgePython/files/11242535/badgepython.zip)
We never saved the elf file for the released BadgePython executable. We should definitely do that next time because now we can't relate the backtraces supplied to us by a2800276 to the source code.
Guru Meditation Error: Core 0 panic'ed (Interrupt wdt timeout on CPU0).
Core 0 register dump:
PC : 0x400852c2 PS : 0x00050035 A0 : 0x40084c88 A1 : 0x3ffbe980
A2 : 0x3ff40000 A3 : 0x3ffbf084 A4 : 0x800d7ee2 A5 : 0x40091426
A6 : 0x00000000 A7 : 0xa6000000 A8 : 0x00000000 A9 : 0x00000011
A10 : 0x0000008e A11 : 0x3ffc37b8 A12 : 0x00000010 A13 : 0x00000000
A14 : 0x3ffb2430 A15 : 0x80000001 SAR : 0x0000001b EXCCAUSE: 0x00000005
EXCVADDR: 0x00000000 LBEG : 0x4000c2e0 LEND : 0x4000c2f6 LCOUNT : 0xffffffff
Core 0 was running in ISR context:
EPC1 : 0x4009626b EPC2 : 0x400d7ee2 EPC3 : 0x00000000 EPC4 : 0x00000000
Backtrace: 0x400852bf:0x3ffbe980 0x40084c85:0x3ffbe9a0 0x400d7edf:0x3ffb2370 0x401d5ff3:0x3ffb2390
Core 1 register dump:
PC : 0x401cce3a PS : 0x00060635 A0 : 0x800d3ad4 A1 : 0x3ffb3860
A2 : 0x00000000 A3 : 0x00060023 A4 : 0x00060023 A5 : 0x3ffafe40
A6 : 0x007befb8 A7 : 0x003fffff A8 : 0x8014424e A9 : 0x3ffb3830
A10 : 0x00000000 A11 : 0x80000001 A12 : 0x80094b39 A13 : 0x3ffb3760
A14 : 0x00000003 A15 : 0x00060023 SAR : 0x00000000 EXCCAUSE: 0x00000005
EXCVADDR: 0x00000000 LBEG : 0x00000000 LEND : 0x00000000 LCOUNT : 0x00000000
Backtrace: 0x401cce37:0x3ffb3860 0x400d3ad1:0x3ffb3880 0x400928b0:0x3ffb38a0
ELF file SHA256: 373683319beaa08b
CPU halted.
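Since the dump prints an `ELF file SHA256` value, a saved ELF can be checked against a panic before trusting any addr2line output. The sketch below assumes (my understanding of ESP-IDF, not something confirmed in this thread) that the printed value is the leading hex digits of the SHA-256 of the ELF file produced by the build; the function name and the path in the usage note are hypothetical.

```python
import hashlib


def elf_matches_panic(elf_path, reported_prefix):
    """True if the file's SHA-256 hex digest starts with the reported prefix."""
    digest = hashlib.sha256()
    with open(elf_path, "rb") as f:
        # Read in chunks so large ELF files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest().startswith(reported_prefix.lower())
```

For the dump above that would be something like `elf_matches_panic("build/badgepython.elf", "373683319beaa08b")`; a `False` result would suggest the ELF comes from a different build than the firmware that crashed.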
Shiny new micropython release this evening at https://github.com/micropython/micropython/releases/tag/v1.20.0
Would that warrant doing a new badgePython build with symbols available on the side or something? I'm down to test it to see if the serial plug still panics it.
Connecting to a running MicroPython instance crashes the interpreter in (unknown) circumstances. It seems that part of the problem concerns the Python interpreter changing the baud rate... maybe
Workaround for conference attendees: start the serial console before you start the Python app. You'll see the reboot messages as Python starts and, for whatever reason, the interpreter does not crash ...