Open mysticalzero opened 1 year ago
I'm looking at the issue with JTAG and the DebugServerConsole program. I got gdb-multiarch up and running and was able to connect to the debug server. The crash happens during start-up, so by the time I connect to the board with gdb, it is already in a crashed state:
(gdb) bt
#0 exception_handler_default (cause=<optimized out>, val=<optimized out>, regs=0x50232c20) at /home/ubuntu/bl808/M1s_BL808_SDK/components/platform/soc/bl808/bl808/evb/src/interrupt.c:65
#1 0x00000000501037ee in trap_c (cause=7, regs=0x50232c20) at /home/ubuntu/bl808/M1s_BL808_SDK/components/platform/soc/bl808/bl808/evb/src/interrupt.c:120
#2 0x0000000050100620 in exception_common () at /home/ubuntu/bl808/M1s_BL808_SDK/components/platform/soc/bl808/bl808/evb/src/boot/gcc/vectors.S:640
(gdb) info thread
Id Target Id Frame
* 1 Thread 1 (CPU#0) exception_handler_default (cause=<optimized out>, val=<optimized out>, regs=0x50232c20) at /home/ubuntu/bl808/M1s_BL808_SDK/components/platform/soc/bl808/bl808/evb/src/interrupt.c:65
So I set a breakpoint in bfl_main() [from M1s_BL808_SDK/components/sipeed/c906/m1s_start/src/start_main.c] in gdb, then accessed the console corresponding to the e907 and ran halt_cpu0 followed by release_cpu0. Looking at the console for the c906, I can see that the c906 restarted and crashed as before, but none of the breakpoints I had set were triggered. Is that because halt_cpu0 and release_cpu0 reset the hardware breakpoints? I tried numerous other breakpoints that should be hit based on my understanding of the SDK code, but to no avail.
What is the best way to debug this? I was trying to find the relevant documentation but couldn't seem to find any. If anyone has any comments on how best to proceed from here, that would be greatly appreciated.
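One approach sometimes used when a core crashes before the debugger can attach is to park it in a spin loop early in start-up and release it from gdb once the breakpoints are in place. The sketch below only illustrates that idea: g_debugger_attached and wait_for_debugger are hypothetical names that do not exist in the SDK, and calling it near the top of bfl_main() is a guess at a suitable spot, not something the SDK documents.

```c
#include <stdint.h>

/* Hypothetical flag, not part of the SDK. The debugger flips it with:
 *   (gdb) set var g_debugger_attached = 1
 * volatile forces the loop to re-read it from memory on every pass. */
volatile int g_debugger_attached = 0;

/* Spin until the flag is set, or until max_spins iterations have passed.
 * max_spins == 0 means wait indefinitely.
 * Returns the number of iterations spent waiting. */
uint32_t wait_for_debugger(uint32_t max_spins)
{
    uint32_t spins = 0;

    while (!g_debugger_attached) {
        if (max_spins != 0 && spins >= max_spins) {
            break; /* give up and continue booting */
        }
        spins++;
    }
    return spins;
}
```

With the core spinning in wait_for_debugger(0), the idea would be to attach gdb, set the breakpoints, run set var g_debugger_attached = 1 and then continue. Whether the breakpoints survive on this particular debug setup is still something to verify.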
I'm seeing this exact issue too.
@taorye This seems to happen in xram_ring_write: the memcpy call raises the exception when writing to 0x22022548.
The precompiled .bin seems to write to the same location without issue, so what could be different? Why would a write to the ring buffer when building from source cause a Store/AMO access fault?
It seems this issue is related to a change in newlib from May 2022, which is in recent versions of xuantie-gnu-toolchain.
I was building the most recent toolchain on macOS, and had to revert this commit to avoid the memcpy access fault writing to the XRAM ring buffer:
https://github.com/T-head-Semi/newlib/commit/ec0c0afa59993b3727958964b33753f62c410d39
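A quick way to check whether the libc memcpy is really the culprit, without rebuilding the toolchain, might be to temporarily swap in a plain byte-wise copy for the write into the ring buffer. This is a diagnostic sketch only: copy_bytes_volatile is a hypothetical helper, not SDK code, and it assumes the fault depends on how memcpy issues its stores, which has not been confirmed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical diagnostic helper, not part of the SDK.
 * Copies len bytes one at a time. The destination pointer is volatile so
 * the compiler emits one byte store per iteration and does not turn the
 * loop back into a call to memcpy. */
void *copy_bytes_volatile(void *dst, const void *src, size_t len)
{
    volatile uint8_t *d = (volatile uint8_t *)dst;
    const uint8_t *s = (const uint8_t *)src;

    for (size_t i = 0; i < len; i++) {
        d[i] = s[i];
    }
    return dst;
}
```

If the fault disappears with this in place of the memcpy call in xram_ring_write, that would point at the memcpy implementation rather than at the buffer address itself.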
To reproduce the issue on the m1sdock board (with camera module and lcd):
./build.sh blai_mnist_demo
Copy the mnist.blai file over to models/ on the flash (when connected to the OTG usb port)

Un-handled Exception on CPU 2: cause: 7, tval = 22022548, epc = 501027ae
x01 = 50128d04 x02 = 50232e30 x03 = a5a5a5a5a5a5a5a5 x04 = 404040404040404
x05 = 4 x06 = f x07 = 707070707070707 x08 = 4
x09 = 4 x10 = 22022548 x11 = 50232ebc x12 = 0
x13 = 22022548 x14 = 1 x15 = 0 x16 = 50141c90
x17 = 50141c96 x18 = 22022540 x19 = 50232eb8 x20 = 22020000
x21 = 22022548 x22 = 4 x23 = 2323232323232323 x24 = 2424242424242424
x25 = 2525252525252525 x26 = 2626262626262626 x27 = 2727272727272727 x28 = 19
x29 = 50141e5e x30 = 50141ef0 x31 = 4
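For anyone reading the dump: cause is the RISC-V exception code, and code 7 is the Store/AMO access fault mentioned earlier, with tval holding the faulting address. The small lookup below is just a reading aid based on the standard exception codes in the RISC-V privileged specification; riscv_exception_name is a hypothetical helper, not an SDK function.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the SDK.
 * Maps a synchronous exception code (mcause with the interrupt bit clear)
 * to its name in the RISC-V privileged specification. */
const char *riscv_exception_name(uint64_t cause)
{
    switch (cause) {
    case 0:  return "Instruction address misaligned";
    case 1:  return "Instruction access fault";
    case 2:  return "Illegal instruction";
    case 3:  return "Breakpoint";
    case 4:  return "Load address misaligned";
    case 5:  return "Load access fault";
    case 6:  return "Store/AMO address misaligned";
    case 7:  return "Store/AMO access fault";
    case 8:  return "Environment call from U-mode";
    case 9:  return "Environment call from S-mode";
    case 11: return "Environment call from M-mode";
    case 12: return "Instruction page fault";
    case 13: return "Load page fault";
    case 15: return "Store/AMO page fault";
    default: return "Reserved or custom";
    }
}
```

Calling riscv_exception_name(7) gives the string for the fault in the dump above. Note that it is an access fault (code 7), not a misaligned store (code 6).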