Open clare7 opened 6 years ago
Dear Clare,
The interface to the DDR on LowRISC is 16-bits (if you are talking about the Nexys4-DDR implementation). The DDR interface IP from Xilinx
certainly supports a wider word but I should imagine this is more appropriate to high-end boards that have a PC style DDR SIM module, rather
than an individual mobile DDR. If you want higher throughput you could experiment with degrees of freedom in the clock converter ratio and the
DLLs that control the DDR, always bearing in mind that the FPU is a limiting factor in the clock rate to the Rocket subsystem.
I'm ashamed to say I don't know how to alter the Chisel/Scala part of the design. Maybe someone else can jump in here.
A comprehensive improvement would separate out the FPU pipeline from the L2-cache line-fill/write-back function which would appear to be
the bottleneck in this case. Undoubtedly the design could benefit from a performance improvement, but widening the DDR interface is only part
of the solution.
On 11/12/17 07:54, clare7 wrote:
Would it be possible to change the bus width size of NastiIO fron 64 bits to 128bits? That would apply to nasti.r.bits.data and nasti.w.bits.data specifically. The purpose is being to perform read and write to DDR in 128bits instead of 64. Any advice?
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/lowRISC/lowrisc-chip/issues/81, or mute the thread https://github.com/notifications/unsubscribe-auth/AAgF13aC3H_cSy-m21p8n--w-dTtw1INks5s_N-wgaJpZM4Q8_VX.
In theory, yes, you can change the data width. In reality, it is complicated.
Would you please clarify your intention?
Yes I want to have a 128-bit bus between LL2 and to DDR controller and the DDR controller has the support for 128-bit bus.
Right now I am using v0.4
OK, so here is what you might need to do if you really want a 128-bit LLC/DDR interconnect.
src/main/scala/Configs.scala
, you need to change dataBeats
from 8 to 4: case TLKey("L2toMem") =>
TileLinkParameters(
coherencePolicy = new MICoherence(new NullRepresentation(site(NBanks))),
nManagers = 1,
nCachingClients = site(NBanks),
nCachelessClients = 0,
maxClientXacts = site(NAcquireTransactors) + 2,
maxClientsPerPort = site(NAcquireTransactors) + 2,
maxManagerXacts = 1,
dataBits = site(CacheBlockBytes)*8,
dataBeats = 8
)
You might get a better chance if you disable tagged memory for this. Make sure the LLC and the tagged memory (if enabled) do generate the right circuit with this change of parameter. An easy way to know, is to compile it, run some bare-metal tests. Fix bugs if there are any.
Hopefully the generated ROCKET_MEM_DAT_WIDTH
is set to 128 automatically. If not, something is wrong with the Chisel code, you need to fix that as well.
On the Verilog side, I think there are a couple of component assume the memory bus to be 64-bit by default (not really follow the aforementioned macro). You need to fix them as well (pay special attention to the AXI crossbars on the memory path).
Good luck.
I have made necessary changes and the compilation seems alright. But I faced the followed error during testing:
/lowrisc-chip/src/test/cxx/common/dpi_ram_behav.cpp:180: bool Memory32::write(uint32_t, const uint32_t&, const uint32_t&): Assertion
(addr & 0x3) == 0' failed.
`
what are the changes I need to make on the test file in order to accommodate the data width changes?
Did this error jump out at the very beginning of the simulation? It might be a reset error (the behavioural bram model received a random request before the circuit is actually reset).
Can you specify which simulation were you running with this error? Preferably with the full command line history and log file (more information will help me to specify the error).
Yes the error jumpout at the beginning of the simulation. This happen only after I changed the bandwidth
This is what I run on the command line:
TestMem-sim-debug +vcd +vcd_name=test.vcd +load=./mem_ddr.riscv
and then I get the printout from the printf:
memory32 write [4000a720] = 00000000
memory32 write [4000a724] = 00000000
memory32 write [4000a728] = 00000000
memory32 write [4000a72c] = 00000000
memory32 write [4000a730] = 00000000
memory32 write [4000a734] = 00000000
memory32 write [4000a738] = 00000000
memory32 write [4000a73c] = 00000000
memory32 write [4000a740] = 00000000
Glip TCP DPI listening on port 23000 and 23001
Client connected
Write Mem: [Enc]_: 00000093305290736d82b28300007297,
TestMem-sim-debug : /home/mmwong/Documents/lowrisc-chip/src/test/cxx/common/dpi_ram_behav.cpp:180: bool Memory32::write(uint32_t, const uint32_t&, const uint32_t&): Assertion `(addr & 0x3) == 0' failed.
Aborted (core dumped)
I put in some printf
in the dpi_ram_behv.cpp
of which I get to see the series of memory32write
as shown above.
The Write Mem: [Enc]_: 00000093305290736d82b28300007297
was the printf
of the nasti.io.r.bits.data
from the /uncore/src/main/converter.scala
file.
So this does not look like a reset issue any more. Clearly the error occurs after the reset is withdrawn.
If this error occurs only when the bus width is doubled to 128-bit, I am afraid there is something wrong with the design that must be found and fixed. I do not think I can tell you what is wrong as I have not done this before. You probably need to try debug yourself.
Something for your consideration:
Oh, to simulate your design, probably you need to make the behavioural ram model working with the 128-bit setting anyway.
After adding moreprintf
I get the following, I observed that the address involved was
addr[4000000f]
and that is why the offset is nonzero hence violated the assertion checking.
And from the dev.h file
, I get the address for memory as:
#define DEV_MAP__mem__BASE 0x40000000llu
Now I suspect the error is resulted from the elfLoader
in from dpi_ram_beha.cpp
where the full function is written inloadelf.cpp
// load initial memory
void MemoryController::load_mem(const string& filename) {
using namespace std::placeholders;
std::function<void(uint32_t, uint32_t, const uint8_t*)> f =
std::bind(&Memory32::write_block, &mem, _1, _2, _3);
elfLoader loader = elfLoader(f);
loader(filename);
}
and how do i make the behavioural ram model working with 128-bit setting?? any guideline to begin with?
It just so happens that the block ram in the FPGA build is 128 bits.
So it should be very easy to interface this to your new design. It will no longer be
behavioural, but will probably not be the performance limiting factor.
You can launch this simulation in the board specific directory (fpga/board/nexys4ddr).
Just remove the width conversion and alter the block ram generator in Vivado for your new width.
You can also disable peripherals such as DDR/Minions/VGA/PS2 which are not relevant to your purpose.
You can use iSim (part of Vivado) or VCSMX if you have it from your institute or company.
Needless to say the DDR controller, if included in your simulation, needs to have its width settings modified, with the eventual aim of measuring the benefit. This is for the reason that the PHY runs at 200MHz, but there is a 4:1 speed conversion between that and the controller, and a 2:1 conversion between the controller and the L2-cache. Each of those conversions will be a bottleneck for performance on the slow side of the converter.
If you really want to boost performance, you should figure out how to make the L2-cache run at 50MHz, so that it matches the speed of the DDR controller. This would give an equivalent or better gain, but without the implications for real FPGA occupancy that a 128-bit bus implies. This perhaps should wait until experience has been gained of more straightforward tasks.
On 12/13/2017 03:38 PM, clare7 wrote:
After adding more|printf|I get the following, I observed that the address involved was |addr[4000000f]| and that is why the offset is nonzero hence violated the assertion checking. And from the |dev.h file|, I get the address for memory as: |#define DEV_MAPmemBASE 0x40000000llu|
Now I suspect the error is resulted from the |elfLoader| in from |dpi_ram_beha.cpp|where the full function is written in|loadelf.cpp|
|// load initial memory void MemoryController::load_mem(const string& filename) { using namespace std::placeholders; std::function<void(uint32_t, uint32_t, const uint8_t*)> f = std::bind(&Memory32::write_block, &mem, _1, _2, _3); elfLoader loader = elfLoader(f); loader(filename); } |
and how do i make the behavioural ram model working with 128-bit setting?? any guideline to begin with?
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/lowRISC/lowrisc-chip/issues/81#issuecomment-351428355, or mute the thread https://github.com/notifications/unsubscribe-auth/AAgF1zzPh_YWirys6lR-1KJxkB4OCJkaks5s_-99gaJpZM4Q8_VX.
The address 0x4000000f
actually means the address is indeed unaligned, which means the assertion is indeed catching an error. So the question then is why the behavioural ram receives such a memory access?
You probably need to trace it back to see who exactly initiate such unaligned request. Generating a VCD waveform may help you finding it.
Nothing helpful from the VCD so far.At the moment I could only be sure error occurred right at the point when the first line of the executable file been written to the memory. Seems like the offset somehow is set to F as such the first read is jumped to 0x4000000f instead to be aligned accordingly.
By the way, could it be anything to do with the following line from Config.scala
file?
val memAlign = BigInt(1L << 30)
what does that line do to the memory?
Would it be possible to change the bus width size of NastiIO fron 64 bits to 128bits? That would apply to nasti.r.bits.data and nasti.w.bits.data specifically. The purpose is being to perform read and write to DDR in 128bits instead of 64. Any advice?