lowRISC / ibex

Ibex is a small 32 bit RISC-V CPU core, previously known as zero-riscy.
https://www.lowrisc.org
Apache License 2.0

Verifying instruction memory faults with the ICache #1451

Open GregAC opened 3 years ago

GregAC commented 3 years ago

Our current DV randomly produces errors on instruction fetches. The trap handler just retries the failing instruction when it returns, so an eventual successful fetch is required for forward progress.

The co-simulation system assumes that when an instruction fetch has seen an error at the ibex_core boundary, the next time we see a PC that uses that address it will see an instruction fault (unless there's been another successful fetch in the meantime). The instruction cache breaks this assumption.
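To make that assumption concrete, here is a minimal Python sketch of it. This is not the actual co-simulation code: the class and method names are hypothetical, and it assumes errors are tracked per 32-bit aligned word and cleared by a successful fetch of the same word.

```python
class FetchErrorPredictor:
    """Toy model of the co-simulation assumption about fetch errors."""

    def __init__(self):
        # Word-aligned addresses whose most recent fetch returned an error.
        self.errored = set()

    @staticmethod
    def _align(addr):
        # Assumption: fetches are tracked at 32-bit word granularity.
        return addr & ~0x3

    def observe_fetch(self, addr, error):
        """Record a fetch response seen at the core boundary."""
        if error:
            self.errored.add(self._align(addr))
        else:
            # A later successful fetch clears the pending error.
            self.errored.discard(self._align(addr))

    def expect_fault(self, pc):
        """Predict whether executing at pc should raise a fetch fault."""
        return self._align(pc) in self.errored
```

With a prefetch buffer this prediction holds, because every instruction the pipeline sees comes from the most recent boundary response for its address.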

The instruction cache prefetches lines without first checking if they're in the cache. This means an address X can be cached, then fetched again, and if we see an error accessing X the second time, X remains cached (the icache just discards the error response), so the next instruction using X happily executes without seeing a fetch error.
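The sequence can be reproduced with a deliberately simplified Python model. It is only a sketch of the behaviour described here, not the real icache: `ToyICache` and its methods are hypothetical, and tags, line sizes and replacement are all left out.

```python
class ToyICache:
    """Toy cache that issues prefetches without looking up the cache first."""

    def __init__(self, memory):
        self.memory = dict(memory)   # address -> instruction word
        self.faulty = set()          # addresses whose bus accesses currently error
        self.lines = {}              # cached address -> instruction word

    def prefetch(self, addr):
        """Issue a bus request without a cache lookup; return True on bus error."""
        if addr in self.faulty:
            # The error response is discarded; an existing cached copy survives.
            return True
        self.lines[addr] = self.memory[addr]
        return False

    def fetch(self, addr):
        """Return (instruction, error) as the IF stage would see it."""
        if addr in self.lines:
            return self.lines[addr], False
        if self.prefetch(addr):
            return None, True
        return self.lines[addr], False


cache = ToyICache({0x80: 0x00000013})
cache.prefetch(0x80)                # X is cached by a clean first access
cache.faulty.add(0x80)              # later accesses to X now error on the bus
bus_error = cache.prefetch(0x80)    # the core boundary sees an error
insn, if_error = cache.fetch(0x80)  # but the IF stage gets the cached copy
```

In this model `bus_error` is true while `if_error` is false, which is exactly the mismatch between what is seen at the core boundary and what the pipeline executes.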

This isn't unreasonable behaviour from the icache as we don't make any coherency guarantees, nor do I think we need to here.

We could model the icache to help correctly predict when iside errors turn into actual instruction fetch errors, but I don't think that's a good idea. We could also probe the icache -> IF stage interface to see when errors appear there. This fails to catch cases where the icache has totally failed to pass on an error, but at least checks the processor pipeline is correctly dealing with them when the icache is working. As the icache has separate verification, this isn't unreasonable.

We can also test using purely static instruction fetch errors, where a particular address range always produces an error. This will require some RISC-DV changes.
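With static errors the outcome of a fetch depends only on the address, so cache state cannot hide it. A rough Python sketch of such a memory model follows; the function name and the half-open `(start, end)` range format are assumptions for illustration, not anything RISC-DV provides today.

```python
def make_static_fault_checker(fault_ranges):
    """Return a predicate that is True for addresses that always error.

    fault_ranges is an iterable of half-open (start, end) address pairs.
    """
    ranges = [(start, end) for start, end in fault_ranges]

    def fetch_errors(addr):
        # The result depends only on the address, never on access history.
        return any(start <= addr < end for start, end in ranges)

    return fetch_errors


is_fault = make_static_fault_checker([(0x1000, 0x1100)])
```

Because `is_fault` gives the same answer on every access, both the memory model and the checker can use it directly without tracking earlier fetches.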

For now I think I'll alter the co-simulation system to probe for instruction fetch errors on the icache -> IF stage interface (maybe a configurable option so verification with prefetch buffer remains the same) and we'll add the static instruction fetch errors as a new feature at a later point.
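As a sketch of what the configurable option could look like (all names here are hypothetical, and the real co-simulation environment is not written in Python), the checker would simply pick which error signal it trusts:

```python
from dataclasses import dataclass


@dataclass
class FetchSample:
    pc: int
    bus_error: bool        # error seen at the ibex_core boundary
    if_stage_error: bool   # error seen on the icache -> IF stage interface


def observed_fetch_error(sample, probe_if_stage):
    """Select the error source the checker should use for this sample."""
    # probe_if_stage=True suits icache builds; False keeps the existing
    # boundary-based behaviour for prefetch buffer builds.
    return sample.if_stage_error if probe_if_stage else sample.bus_error
```

Keeping the selection in one place means the rest of the checking logic stays identical for both configurations.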

@tomroberts-lowrisc can you confirm I've correctly described the icache behaviour here? Any thoughts on the verification strategy (@rswarbrick too)?

tomeroberts commented 3 years ago

Yep that looks correct, and kind of what I was getting at with the comment here

When I've done this before, we did the static approach (a particular address will always result in an error within some bounds) but I don't know how difficult that would be in riscv-dv. The proposed approach sounds sensible to me as a starting point.

GregAC commented 2 years ago

Probing for faults at the icache -> IF stage interface has been added to the co-simulation system. I will leave this issue open to track the need for RISC-DV changes so we can do some testing of static instruction faults as well.