tampler closed this issue 6 years ago
After poking around the Rocket documentation and Chisel code, I ended up using the TL-UH protocol, which is already implemented in the LazyRoCC module. For those who may be interested in more details, please refer to Issue #1611, where I'll elaborate on my experience with TileLink.
Hi guys
For a 2-core Rocket-based Linux system, I need to implement a non-cached, non-paged Scatter-Gather (SG) DMA as part of the RoCC module. My DMA should have high-throughput, low-latency, word-aligned access to a multibank DRAM. The max burst size is 8 kB in SG mode, which maps to 1024 beats (8 bytes per beat) and has to be split across multiple AXI4 transactions because the AXI4 protocol limits a burst to 256 beats.
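To make the burst arithmetic concrete, here is a minimal Python sketch of how one 8 kB SG transfer could be split into AXI4 bursts. It is only a software model, not Chisel and not anything taken from Rocket. The names `Burst` and `split_transfer` are hypothetical, and the 8-byte beat is an assumption derived from 8 kB / 1024 beats. Besides the 256-beat limit, the sketch also respects the AXI rule that a burst must not cross a 4 KiB address boundary.

```python
from dataclasses import dataclass
from typing import List

AXI4_MAX_BEATS = 256   # AXI4 INCR burst limit (8-bit AxLEN)
AXI4_BOUNDARY = 4096   # an AXI burst must not cross a 4 KiB boundary


@dataclass
class Burst:
    """One AXI4 burst (hypothetical helper type for this sketch)."""
    addr: int   # byte address of the first beat
    beats: int  # number of beats in the burst


def split_transfer(addr: int, length: int, beat_bytes: int = 8) -> List[Burst]:
    """Split a beat-aligned transfer into legal AXI4 INCR bursts.

    beat_bytes=8 is an assumption (64-bit data bus).
    """
    if addr % beat_bytes or length % beat_bytes:
        raise ValueError("address and length must be beat-aligned")
    bursts = []
    end = addr + length
    while addr < end:
        # Bytes left before the next 4 KiB boundary.
        to_boundary = AXI4_BOUNDARY - (addr % AXI4_BOUNDARY)
        # Take the largest chunk allowed by all three limits.
        chunk = min(end - addr, to_boundary, AXI4_MAX_BEATS * beat_bytes)
        bursts.append(Burst(addr, chunk // beat_bytes))
        addr += chunk
    return bursts


if __name__ == "__main__":
    for b in split_transfer(0x8000_0000, 8 * 1024):
        print(hex(b.addr), b.beats)
```

With a 4 KiB-aligned start address, the 8 kB transfer becomes four 256-beat bursts. With an unaligned start, the first burst is shortened so that the following ones begin on a boundary, which costs one extra transaction.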
The RoCC-computed result should be shared among multiple Rocket cores. The RoCC output buffer is also 8 kB max.
What is the best architectural way to implement such a system? There are going to be several long-latency operations (thousands of cycles), and they will stall the issuing core as per the RoCC implementation.