xobs opened this issue 4 years ago (status: Open)
Ahh sure, that's right. Hmm, so when a DebugPlugin halt happens, the WFI should be woken up. I think that should be enough to fix the issue. I will fix that tonight; can you test again after that?
Yes, that makes sense.
By the way, this was discovered when we were porting Linux to the Hackaday conference badge. For some reason, an atomic instruction was taking two cycles to execute (i.e. stepi halted twice with the same pc), didn't call the trap handler, and was effectively a noop. This led the kernel startup to think it was CPU core #30000000 and to wait to be woken up.
Are atomics implemented, or is there possibly another interaction between the DebugPlugin and invalid instructions? Why would it execute the same pc twice?
Weird, what was the CPU configuration? Did it happen when you were doing step-by-step?
Same configuration as last time. It's using linuxFull() as the CSR config.
So, about "an atomic instruction was taking two cycles to execute":
Basically, when the D$ gets a miss, it refills itself and reschedules the instruction, but in debug that rescheduling isn't possible. So when you do a step on a memory operation that loads something, you have no guarantee that the step will be successful. This is expected behaviour, not a bug/issue. The first dummy step behaves as a line refill + NOP; the second step does the real thing.
If you have the MMU enabled, it can be even worse: MMU miss + cache miss => 2 NOPs before a successful LD.
As far as I know, when using GDB and doing a step, if the PC didn't move, it redoes the step automatically.
Anyway, step-by-step on atomic instructions was never tested; the best would probably be to run a simulation and capture the waveform to figure out what is going wrong.
GDB will read $pc in order to know where it is. That way it can do things like trace the interrupt handler when an interrupt occurs. With Vex, it seems like the stepIt bit will simply not advance $pc when it stalls like that, which makes sense.
Does Vex currently support atomic instructions? Particularly amoadd.w a3, a2, (a3)? Or is that supposed to be patched and handled by trap_handler?
Does Vex currently support atomic instructions?
Yes, the D$ supports all of them (https://github.com/xobs/VexRiscv-verilog/blob/master/src/main/scala/vexriscv/GenHaD.scala#L93). What you can do to get a better diagnostic is to reduce the interaction between the debugger and the hardware execution by using breakpoints instead of stepping, as breakpoints will use the normal pipeline behaviour.
FYI -- @pgielda @mwelling @pdp7
I updated the VexRiscv dev branch to wake up the pipeline when the debugger asks to halt the CPU: https://github.com/SpinalHDL/VexRiscv/commit/7ae218704e0e1849395e9bd64f58f34ce391b88d
@xobs Can you give it a test?
Just tested the WFI; that all seems fine now.
Now I'm checking the atomics.
@xobs
Tested the atomic stuff. I got the double-step execution, but as said before, that's to be expected. The execution (all in step-by-step) was fine.
_start:
nop
nop
nop
nop
nop
nop
nop
nop
nop
la a3, memA
li a2, 1
amoadd.w a3, a2, (a3)   # a3 = old value at memA; memA += 1
nop
nop
nop
la a3, memA
li a2, 1
amoadd.w a3, a2, (a3)
nop
nop
nop
nop
memA:
.word 0x10
memB:
.word 0x20
Sorry for the noise -- the other issue was completely unrelated and appears to be a bug in Linux where it assumes memory is cleared before it clears the BSS.
Specifically:
https://github.com/sifive/riscv-linux/blob/master/arch/riscv/kernel/head.S#L41-L45
This uses atomics to increment a value. If the value is 0, then that core considers itself to have "won the lottery" and boots as normal. If it's nonzero, then it assumes that some other core won the lottery and issues a WFI.
But if memory isn't zeroed out prior to running that code, then unless you have ~4 billion cores it's not guaranteed that any core will ever read a 0.
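For reference, here is a minimal, self-contained sketch of that lottery pattern. It is a paraphrase of the linked head.S lines rather than a copy; the boot/park labels and the placement of the hart_lottery word are illustrative only.

.global _start
_start:
la a3, hart_lottery
li a2, 1
amoadd.w a3, a2, (a3)   # a3 = old value of hart_lottery; memory becomes old + 1
bnez a3, park           # nonzero old value: another hart already won the lottery
boot:
j boot                  # the winning hart would continue the normal boot path here
park:
wfi                     # every losing hart waits here to be woken up
j park
.balign 4
hart_lottery:
.word 0

If the word does not start at 0, every hart reads a nonzero old value and parks in the WFI loop, which matches the hang described above.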
@xobs I'm guessing the clear should be sent upstream to the RISC-V Linux?
Testing by having block RAM at address 0x10000000 and loading the following binary. Commands reproduced here so I can remember them in the future:
wfitest.S:
.global _start
_start:
csrw sie, zero    # clear the supervisor interrupt-enable bits
ebreak            # halt here so the debugger takes control
li t0, 0
addi t0, t0, 1
addi t0, t0, 1
addi t0, t0, 1
wfi               # the instruction under test
addi t0, t0, 1
addi t0, t0, 1
addi t0, t0, 1
ebreak            # reached only if execution continues past the wfi
iloop:
j iloop
$ riscv64-unknown-elf-gcc -ggdb3 "-Wl,--section-start=.text=0x10000000" -march=rv32ima -mabi=ilp32 wfitest.S -o wfitest.elf -nostdlib -nostdinc; riscv64-unknown-elf-objcopy -O binary wfitest.elf wfitest.bin
$
Without 7ae218704e0e1849395e9bd64f58f34ce391b88d the wfi locks up the CPU as described in the bug report. With 7ae218704e0e1849395e9bd64f58f34ce391b88d it behaves like a nop, which is as expected.
@xobs I'm guessing the clear should be sent upstream to the RISC-V Linux?
I'm not sure how you'd do that. They make an assumption that the memory is zeroed, which is how the whole lottery works. The right way might be to have hart_lottery placed at a known offset so the bootloader knows to zero it (a rough sketch of that idea follows below). Or to require that memory is zeroed. Or to use some other method for determining which core you're on.
A bug should definitely be logged, I'm just not sure how to do that, and I haven't had good experience getting things upstream into Linux.
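To illustrate the known-offset option, here is a minimal sketch of what a bootloader could do. KERNEL_BASE and HART_LOTTERY_OFFSET are hypothetical values that the bootloader and kernel would have to agree on; neither exists today, this is only the shape of the idea.

.equ KERNEL_BASE, 0x80000000        # hypothetical kernel load address
.equ HART_LOTTERY_OFFSET, 0x100     # hypothetical, agreed-upon offset

.global clear_lottery_and_boot
clear_lottery_and_boot:
li t0, KERNEL_BASE + HART_LOTTERY_OFFSET
sw zero, 0(t0)                      # zero the lottery word before handing over
li t1, KERNEL_BASE
jr t1                               # jump to the kernel entry point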
If you're using the DebugPlugin to debug a core, and it issues a WFI with no interrupts enabled, then that core will freeze and be unrecoverable with the DebugPlugin. The only solution is to power-cycle the system.