Closed piotro888 closed 5 months ago
The cause of the problem is Forwarder
inside FetchResume
Unifier
key. It can illegally store resume call to next fetcher stall.
I'm still not sure how it happened that call with resume_from_exception=0
was stored in forwarder instead of resume_from_exception=1
that should be always be later (called from Retirement
), and what is the best way to fix it.
EDIT: Unifer key collects only entries form FUs (with resume from exception=0) and Retirement calls fetch directly, so stored value makes sense, but still Retirement call coming first to block the forwader is weird
I am not sure if I see where exactly the problem is. I looked at the execution of TestCoreInterrupt_0_interrupt_asm
and it seems that the core works fine, at least from the perspective of the instruction stream it is given. I didn't dig into what it exactly executes and the test itself
Can you provide cycle at which there is a bug? And what is the expected behavior there?
@xThaid Ok, I found it. The problem:
resume_from_exception=0
that normally should be ignored, but Fetch is still doing something and resume
method is not yet ready, so this call waits inside Unifier
in Forwarder
.Retirement
tries to call resume
with resume_from_excpetion=1
, but the method blocks because fetcher is still not readyresume
call from Retirement
wins with Forwarder
.resume
method is not ready againForwader
in Unifier
still holds the old call value!!! When resume
becomes ready, fetch is resumed to the old PC.Setting priority between Retirement.CORE_RESUME
and Unifier
, to Unifier.Forwarder
should fix the issue, so all calls with resume_form_excpetion=0
arrive before call from the Retirement (with resume_form_excpetion=1
). I need to find a way how to declare it cleanly
Time ~707200ns
I think that best solution would be to split resume
into resume_from_exception
and resume_from_unsafe
with conflict priority explicitly defined in Fetch
OK, thanks, now I can see it.
I wonder though, why do we even allow for speculatively resuming the fetch unit? If we are precommiting a CSR instruction without side effects, (in my opinion) it must not call fetch resume - this is clearly a side effect.
This fixes the failing tests:
diff --git a/coreblocks/func_blocks/csr/csr.py b/coreblocks/func_blocks/csr/csr.py
index 3e203caf..5191ea37 100644
--- a/coreblocks/func_blocks/csr/csr.py
+++ b/coreblocks/func_blocks/csr/csr.py
@@ -103,6 +103,7 @@ class CSRUnit(FuncBlock, Elaboratable):
done = Signal()
accepted = Signal()
exception = Signal()
+ can_resume = Signal()
precommitting = Signal()
current_result = Signal(self.gen_params.isa.xlen)
@@ -205,7 +206,7 @@ class CSRUnit(FuncBlock, Elaboratable):
@def_method(m, self.get_result, done)
def _():
- m.d.comb += accepted.eq(1)
+ m.d.comb += accepted.eq(can_resume)
m.d.sync += reserved.eq(0)
m.d.sync += instr.valid.eq(0)
m.d.sync += done.eq(0)
@@ -249,6 +250,7 @@ class CSRUnit(FuncBlock, Elaboratable):
@def_method(m, self.precommit)
def _(rob_id: Value, side_fx: Value):
with m.If(instr.rob_id == rob_id):
+ m.d.sync += can_resume.eq(side_fx)
m.d.comb += precommitting.eq(1)
m.d.comb += exe_side_fx.eq(side_fx)
Current setup seems to be broken anyway. csr unit sets the ready signal for method fetch_resume
whenever get_result
is called. However, if the unifier forwarder is not ready (another func unit inserted something there), fetch_resume
won't run and its ready signal will be gone.
edit: it seems that the priv unit also tries to resume the fetch unit even when the instruction was precommited without side effects.
In both cases besides side_fx, it also needs to check if exception is not going to be reported on that instruction, otherwise the same conflict could happen if resume is blocked.
Current setup seems to be broken anyway. csr unit sets the ready signal for method
It would work, because there is at most one unsafe instruction at time in the core (that call resume via Forwarder)
But it would still break when we would introduce interrupt delay enforcing via insertion of extra instruction at end of the pipeline. Unsafe instruction on precommit stage with exception clear, could still conflict with that interrupt. So I think that solution with separate resume
methods would fit better in the terms of cleaner logic later.
Should be finally ready to merge
EDIT: wfi test fail, I will fix it tomorrow, rest should be ready to review
Solid work! Just a small question: why are custom_report_level and custom_report_edge separated as two signals? As far as I understand it, their bits are mutually exclusive anyway.
merged to one signal, thanks!
should be ready to merge now
Closes #599
Implements priv spec core internal interrupt controller operating on CSRs. Details in docstrings.
Depends on #652 , needs rebase to reflect changes and clean-up from it after it is approved.