bluespec / Flute

RISC-V CPU, simple 5-stage in-order pipeline, for low-end applications needing MMUs and some performance
Apache License 2.0

Provide an optional AXI4-Lite slave for coherent DMA #25

Closed (jrtc27 closed this 4 years ago)

jrtc27 commented 4 years ago

This introduces an AXI4-Lite slave interface at the top level that feeds its requests through the same L1 D-cache as the CPU, which will be used to provide coherent DMA in the Connectal-based AWS F1 SoC variant.

These incoming requests are fulfilled by passing them through a new arbiter that sits between the core and the cache. The arbiter is instantiated only when the AXI4-Lite slave is enabled. Even when present, it adds no latency to requests except under contention: all ISA tests take the same number of cycles in simulation with or without the arbiter.
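To make the "no added latency except under contention" claim concrete, here is a minimal behavioural sketch in Python, not the Bluespec code from this PR. The class name `CachePortArbiter`, the queue names, and the fixed CPU-first priority are all assumptions made for illustration. The description above does not state which arbitration policy is used.

```python
from collections import deque


class CachePortArbiter:
    """Hypothetical model: two requesters share one cache port, one grant per cycle."""

    def __init__(self):
        self.cpu_q = deque()   # pending requests from the core
        self.dma_q = deque()   # pending requests from the AXI4-Lite slave
        self.cycle = 0
        self.granted = []      # (cycle, source, request), in grant order

    def step(self):
        """Advance one cycle, granting at most one pending request."""
        if self.cpu_q:
            # Assumed policy: the CPU wins when both sides have a request.
            self.granted.append((self.cycle, "cpu", self.cpu_q.popleft()))
        elif self.dma_q:
            self.granted.append((self.cycle, "dma", self.dma_q.popleft()))
        self.cycle += 1


arb = CachePortArbiter()

# Cycle 0: only the CPU has a request, so it is granted immediately.
arb.cpu_q.append("load A")
arb.step()

# Cycle 1: both sides have a request, so the DMA request waits one cycle.
arb.cpu_q.append("load B")
arb.dma_q.append("dma write C")
arb.step()
arb.step()

print(arb.granted)
```

In this model a request is delayed only when the other requester also has one pending in the same cycle, which is the behaviour the description attributes to the real arbiter.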

This also adds a small fix to CreditCounter to ensure we never overflow the counter, an issue which long DMA bursts would be more likely to trigger.
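As an illustration of the kind of overflow guard involved, the following Python sketch models a fixed-width credit counter that refuses to increment past its maximum. It is not the actual CreditCounter implementation. The method names (`can_incr`, `incr`, `decr`) and the stall-when-full behaviour are assumptions, since the description does not say how the fix works.

```python
class CreditCounter:
    """Hypothetical model of a fixed-width counter of outstanding requests."""

    def __init__(self, width_bits):
        self.max_count = (1 << width_bits) - 1  # largest representable value
        self.count = 0

    def can_incr(self):
        # Guard: incrementing at max_count would wrap around to 0 in hardware.
        return self.count < self.max_count

    def incr(self):
        if not self.can_incr():
            raise RuntimeError("counter full; caller must stall")
        self.count += 1

    def decr(self):
        if self.count == 0:
            raise RuntimeError("counter underflow")
        self.count -= 1


# A burst of 5 requests against a 2-bit counter (max 3 outstanding).
cc = CreditCounter(width_bits=2)
accepted = 0
stalled = 0
for _ in range(5):
    if cc.can_incr():
        cc.incr()
        accepted += 1
    else:
        stalled += 1  # request must wait until a response frees a credit

print(accepted, stalled, cc.count)
```

Without the guard, the fourth increment on a 2-bit counter would wrap to zero and the counter would under-report outstanding requests, which is the failure mode a long burst makes more likely.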

rsnikhil commented 4 years ago

Note for the future: in addition to direct coherent access from a host, this feature could also be used for the Debug Module's 'System Bus Access', eliminating the 'flush' that is currently done on entry to and exit from debug mode for coherence reasons. Further, the Debug Module currently uses a 2x3 AXI4 interconnect inside the core to access memory, PLIC, and Near_Mem_IO (CLINT). That could become a 1x3 interconnect if the Debug Module no longer uses it.

jrtc27 commented 4 years ago

You will still need to use the FENCE.I mechanism, but yes.