gthparch / macsim

A heterogeneous architecture timing model simulator.
http://comparch.gatech.edu/hparch/macsim.html

DRAM interface for WRITE/READ? #49

Open uwuser opened 4 years ago

uwuser commented 4 years ago

Hello,

I am looking at the provided DRAM interfaces for both DRAMsim and Ramulator, and one thing is not clear to me. In dram_dramsim.cc, line 142, we see: if (m_dramsim->addTransaction(req->m_type == MRT_WB,static_cast(req->m_addr))) .. and in dram_ramulator.cc, line 145, we have: if (req->m_type != MRT_WB) { ...

Here it seems that every request generated by the cores is classified either as MRT_WB, which is treated as a WRITE request, or as anything else, which is treated as a READ request to DRAM.
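The classification described above can be sketched as a single predicate. This is only an illustration: the enum below is a hypothetical stand-in containing just the three request types named in this thread, not MacSim's actual definition, and `is_dram_write` is a made-up helper name.

```cpp
// Hypothetical stand-in for the request types mentioned in this thread.
// MacSim's real enum has more members and may use different values.
enum Mem_Req_Type { MRT_DFETCH, MRT_DSTORE, MRT_WB };

// Mirrors the checks quoted from dram_dramsim.cc and dram_ramulator.cc:
// only a write-back is sent to DRAM as a WRITE; everything else is a READ.
inline bool is_dram_write(Mem_Req_Type type) {
  return type == MRT_WB;
}
```

Under this rule a core store (MRT_DSTORE) reaches DRAM as a READ, which is the behavior the question is about.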

Testing with two different traces from IsolBench (Bandwidth Read and Bandwidth Write), I can confirm that most of the requests from the Bandwidth Read application are of type MRT_DFETCH and most of the requests from the Bandwidth Write application are of type MRT_DSTORE (not MRT_WB). Therefore, a simulation with Ramulator/DRAMsim always receives READ requests, even when the core generates MRT_DSTORE.

Can you perhaps elaborate on this and let me know why all requests are classified as either MRT_WB or not? I believe MRT_DSTORE should not be treated as a READ request! (Or am I wrong?)

Changing the condition to treat m_type == MRT_WB || m_type == MRT_DSTORE as a WRITE request gives ASSERT FAILED for both Ramulator and DRAMsim.

Any feedback would be appreciated.

hyesoon commented 4 years ago

Even for a write request from a core, the block first needs to be brought in from memory, so it becomes a DRAM READ. A dirty block eviction from the cache becomes a DRAM write operation.
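A minimal sketch of that behavior, assuming a direct-mapped, write-allocate, write-back cache. `ToyCache`, `DramTraffic`, and `stream_stores` are hypothetical names for illustration and are not taken from MacSim's memory system.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy counters for the traffic that reaches DRAM.
struct DramTraffic {
  uint64_t reads = 0;   // line fills (load misses and store misses)
  uint64_t writes = 0;  // write-backs of dirty victims
};

// Hypothetical direct-mapped, write-allocate, write-back cache.
// num_lines must be greater than zero.
class ToyCache {
 public:
  explicit ToyCache(size_t num_lines) : lines_(num_lines) {}

  // Access one cache line; a store marks the line dirty.
  void access(uint64_t line_addr, bool is_store, DramTraffic& dram) {
    Line& line = lines_[line_addr % lines_.size()];
    if (!line.valid || line.tag != line_addr) {
      if (line.valid && line.dirty) ++dram.writes;  // dirty eviction -> DRAM WRITE
      ++dram.reads;  // fill from memory, even when the miss is a store
      line.valid = true;
      line.tag = line_addr;
      line.dirty = false;
    }
    if (is_store) line.dirty = true;
  }

 private:
  struct Line {
    bool valid = false;
    bool dirty = false;
    uint64_t tag = 0;
  };
  std::vector<Line> lines_;
};

// Store once to each of `num_lines` consecutive line addresses.
DramTraffic stream_stores(size_t cache_lines, uint64_t num_lines) {
  ToyCache cache(cache_lines);
  DramTraffic dram;
  for (uint64_t a = 0; a < num_lines; ++a) cache.access(a, true, dram);
  return dram;
}
```

In this toy model, a store stream that fits in the cache produces DRAM reads but no DRAM writes, because nothing is evicted. A stream larger than the cache produces one write-back per fill once the cache is full.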

uwuser commented 4 years ago

Thanks for the explanation, and yes, that is correct. However, I do not see any write-backs during the execution (I think I should see some at some point), but that could be due to the cache configuration or the working set size of the benchmark.

uwuser commented 4 years ago

As an update, I am still not clear on how write-backs are handled. I don't see any evicted dirty blocks, either with the provided mergesort trace or with my own traces. For instance, a streaming bandwidth benchmark that only writes to memory causes only MRT_DSTORE requests (each of which is a READ to DRAM), and no write-back to DRAM is initiated in the simulation (there is a small number of WBs from L1 to L2 and from L2 to LLC).

I have tried all cache configurations and the issue persists. The other thing is that bypassing the caches does not seem to have any effect on performance (IPC, number of DRAM requests)! Does that mean the caches are not being used at all? Sorry, but I am really confused.
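One way to reason about the bypass observation is to compare DRAM request counts with and without a cache on a small hand-made trace. The sketch below is a toy model, not MacSim's cache or bypass mechanism. `count_dram_requests` is a hypothetical helper, treating `cache_lines == 0` as "bypass" is an assumed convention, and write-backs are ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Count DRAM line requests for a trace of line addresses.
// cache_lines == 0 models bypassing the cache (assumed convention);
// otherwise a direct-mapped cache filters the trace and only misses reach DRAM.
uint64_t count_dram_requests(const std::vector<uint64_t>& trace,
                             size_t cache_lines) {
  if (cache_lines == 0) return trace.size();  // every access goes to DRAM
  std::vector<uint64_t> tags(cache_lines, 0);
  std::vector<bool> valid(cache_lines, false);
  uint64_t misses = 0;
  for (uint64_t addr : trace) {
    size_t set = addr % cache_lines;
    if (!valid[set] || tags[set] != addr) {
      ++misses;  // miss: fetch the line from DRAM
      valid[set] = true;
      tags[set] = addr;
    }
  }
  return misses;
}
```

In this model, a trace that touches each line exactly once generates the same number of DRAM requests with or without the cache, so a purely streaming trace may not reveal whether the caches are in use. A trace with reuse, such as alternating between two lines, does show a difference. Whether the traces in question touch each line only once is an assumption that would need checking.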

hyesoon commented 4 years ago

We have debugged the memory system several times in the past and confirmed the write-back activity (but that doesn't mean it hasn't changed since). I'll try to take a look at it in a few days. Can you share more detailed info about your config and trace file so we can replicate your experiment? And how did you bypass the caches?

uwuser commented 4 years ago

Thank you for getting back on this. Here is the detailed info of my trace and configs:

I really appreciate your time on this.