Closed: jbqubit closed this issue 6 years ago
When that happens, can you post the exact runtime.elf file that was flashed into the board, and the corresponding exact error message? @whitequark Can we get memory dumps around the illegal instruction PC?
@whitequark do you need anything that's not here? One of those pastes is the complete UART output for an illegal instruction error, and I dumped the entire build as well.
Just saw this again. Was running sines.py on 38b51282226f9 with SAWG, JESD204B=0.6 and HMC830. runtime.elf.zip
[ 11.219450s] INFO(runtime::session): startup kernel finished
[ 11.224668s] INFO(runtime::session): no connection, starting idle kernel
[ 11.231555s] INFO(runtime::session): no idle kernel found
[ 27.479679s] INFO(runtime::session): new connection from 192.168.1.68:54136
[ 27.522905s] INFO(runtime::kern_hwreq): resetting RTIO
panic at runtime/main.rs:305:14: exception IllegalInsn at PC 0x4003c9a8, EA 0x40153058
backtrace for software version 4.0.dev+1087.g38b51282:
0x40002f58
0x40042760
0x40002cd8
0x400010d0
restarting...
[ 0.000007s] INFO(runtime): ARTIQ runtime starting...
[ 0.003892s] INFO(runtime): software version 4.0.dev+1087.g38b51282
[ 0.010242s] INFO(runtime): gateware version 4.0.dev+1087.g38b51282
[ 0.016622s] INFO(runtime): log level set to INFO by default
[ 0.022327s] INFO(runtime): UART log level set to INFO by default
[ 0.028465s] INFO(board_artiq::serwb): waiting for AMC/RTM serwb bridge to be ready...
Not sure why upon restart it hangs waiting for RTM FPGA. AFAIR that chip's .bit doesn't get erased upon restart.
Just happened again. Was running sines.py and seeing sinusoidal output on scope. After about 2 minutes see panic and output on scope is garbage. Same .elf.
panic at runtime/main.rs:305:14: exception IllegalInsn at PC 0x4003c9c0, EA 0x40152bbc
backtrace for software version 4.0.dev+1087.g38b51282:
0x40002f58
0x40042760
0x40002cd8
0x400010d0
restarting...
hmmm...that's after the HMC7043/HMC830 are correctly configured, so it's unlikely that this is due to those chips.
@gkasprow can we add a PLL locked LED to the FP of Sayma for the next revision?
> After about 2 minutes see panic and output on scope is garbage.
Please disable restart-on-panic so that we know if the garbage signal is due to the crash or the restart.
> @whitequark Can we get memory dumps around the illegal instruction PC?
I'll add this.
And again... I continue posting as the hex codes are changing.
panic at runtime/main.rs:305:14: exception IllegalInsn at PC 0x4003c9a8, EA 0x40153058
backtrace for software version 4.0.dev+1087.g38b51282:
0x40002f58
0x40042760
0x40002cd8
0x400010d0
restarting...
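For anyone comparing several of these UART captures, here is a minimal Python sketch that pulls the exception kind, PC and EA out of the panic lines and lists the distinct combinations. The function names and the regular expression are illustrative assumptions based only on the panic lines pasted in this thread; they are not part of ARTIQ.

```python
import re

# Pattern inferred from the panic lines pasted in this thread (an assumption,
# not a documented log format).
PANIC = re.compile(
    r"exception (\w+) at PC (0x[0-9a-fA-F]+), EA (0x[0-9a-fA-F]+)"
)

# Panic lines copied from the captures above.
SAMPLE_LOG = """\
panic at runtime/main.rs:305:14: exception IllegalInsn at PC 0x4003c9a8, EA 0x40153058
panic at runtime/main.rs:305:14: exception IllegalInsn at PC 0x4003c9c0, EA 0x40152bbc
panic at runtime/main.rs:305:14: exception IllegalInsn at PC 0x4003c9a8, EA 0x40153058
"""


def panic_sites(log_text):
    """Return (exception, pc, ea) for every panic line found in log_text."""
    return [
        (m.group(1), int(m.group(2), 16), int(m.group(3), 16))
        for m in PANIC.finditer(log_text)
    ]


def distinct_sites(log_text):
    """Return the panic sites in first-seen order, without repeats."""
    seen = []
    for site in panic_sites(log_text):
        if site not in seen:
            seen.append(site)
    return seen


if __name__ == "__main__":
    for kind, pc, ea in distinct_sites(SAMPLE_LOG):
        print(f"{kind}: PC={pc:#010x} EA={ea:#010x}")
```

Run on the three panics above, this shows which PC/EA pairs repeat and which are new.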
Roger, I'll disable restart-on-panic.
Same runtime.elf for all those dumps?
Same runtime.elf.
panic at runtime/main.rs:305:14: exception IllegalInsn at PC 0x4003c9a8, EA 0x40153058
backtrace for software version 4.0.dev+1087.g38b51282:
0x40002f58
0x40042760
0x40002cd8
0x400010d0
halting.
use `artiq_coreconfig write -s panic_reset 1` to restart instead
Done. On test crash:
@ 0x40002af4
+0000: 1c000000 0000001c 1860400a a8830328
+0010: 9dc2ffe0 a86e0000 04000377 18a00002
+0020: 0400632a a86e0000 19600000 85c2fff4
+0030: 9c220000 8521fffc 44004800 8441fff8
@ 0x40154fd8
+0000: 40154ffc 00000000 400a00c4 00020000
+0010: 00001126 00000003 00000010 00000000
+0020: 4000009c d92f2400 deaddead 002aaff4
+0030: 00000000 00c0a801 3200f903 5f67ca86
panic at runtime/main.rs:323:13: exception IllegalInsn at PC 0x40002af4, EA 0x40154fd8
backtrace for software version 4.0.dev+1105.g985fd737:
0x400032b0
0x4001073c
0x40002cf8
0x400010d0
0x40002aec
restarting...
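To make a dump like the one above easier to line up against a disassembly, here is a rough Python sketch that turns it into an address-to-word map. It assumes what the paste suggests: each `@` line gives a base address, each `+NNNN:` row gives a hex byte offset, and the words are consecutive 32-bit values. The function name `parse_dumps` is hypothetical.

```python
import re

HEADER = re.compile(r"^@ (0x[0-9a-fA-F]+)$")
ROW = re.compile(r"^\+([0-9a-fA-F]{4}): (.+)$")

# First dump block copied from the test crash above.
SAMPLE_DUMP = """\
@ 0x40002af4
+0000: 1c000000 0000001c 1860400a a8830328
+0010: 9dc2ffe0 a86e0000 04000377 18a00002
+0020: 0400632a a86e0000 19600000 85c2fff4
+0030: 9c220000 8521fffc 44004800 8441fff8
"""


def parse_dumps(text):
    """Map absolute address -> 32-bit word for every dump block in text.

    Assumes 4-byte words stored back to back, as the +0010 step after
    four words in the pasted dump suggests.
    """
    words = {}
    base = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            base = int(header.group(1), 16)
            continue
        row = ROW.match(line)
        if row and base is not None:
            offset = int(row.group(1), 16)
            for i, word in enumerate(row.group(2).split()):
                words[base + offset + 4 * i] = int(word, 16)
    return words


if __name__ == "__main__":
    for addr, word in sorted(parse_dumps(SAMPLE_DUMP).items()):
        print(f"{addr:#010x}: {word:08x}")
```

The word at the reported PC can then be looked up directly and compared with the same address in runtime.elf.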
@whitequark thanks for adding that. Do you want me to post a new UART trace with the memory dump?
Yes, we need the memory dump, the rest of the crash message, and the corresponding runtime.elf.
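The reason the exact runtime.elf matters is that the backtrace addresses only make sense against the binary that produced them. A hedged Python sketch of resolving them with GNU addr2line follows; the `-e`, `-f` and `-C` options are standard addr2line options, but the tool name `or1k-linux-addr2line` is an assumption about the local toolchain prefix, and the helper names are hypothetical.

```python
import subprocess


def addr2line_command(elf_path, addresses, tool="or1k-linux-addr2line"):
    """Build the addr2line command line.

    The default tool name is an assumption; substitute whatever addr2line
    ships with the toolchain used to build the runtime.
    """
    # -e: ELF file to read, -f: print function names, -C: demangle symbols
    return [tool, "-e", elf_path, "-f", "-C", *addresses]


def resolve(elf_path, addresses, tool="or1k-linux-addr2line"):
    """Run addr2line and return its raw text output."""
    result = subprocess.run(
        addr2line_command(elf_path, addresses, tool),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Backtrace addresses copied from the first panic in this thread.
    backtrace = ["0x40002f58", "0x40042760", "0x40002cd8", "0x400010d0"]
    print(" ".join(addr2line_command("runtime.elf", backtrace)))
```

If the ELF does not match the flashed build, the resolved symbols will be misleading, which is why each crash report needs its own runtime.elf.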
Using latest from master 20180604 with SAWG vivado 2018.1 07d4145a35c739. Meets timing. I've run 25 scripts involving SAWG via Ethernet. No panics.
So you fixed Ethernet?
I think we can close this now.
Sounds good. I've not seen it repeat.
Running 38b51282226f9 built with JESD204B=0.6 with SAWG and HMC830. Running SAWG sines.py example. It runs for several minutes then panics. I've seen this twice. Usually I don't see the panic. For some reason an hmc7043 hang consistently happens after the post-panic restart.