s5z / zsim

A fast and scalable x86-64 multicore simulator
GNU General Public License v2.0

NUCA cache design on ZSim #218

Open jiaotong666 opened 6 years ago

jiaotong666 commented 6 years ago

Can someone tell me how ZSim implements NUCA cache design?

jasonzzzzzzz commented 4 years ago

Intuitively, I guess ZSim already has some version of a NUCA design, possibly even a D-NUCA design; it is shown briefly in their tutorial slides (http://zsim.csail.mit.edu/tutorial/slides/memory.pdf).

Besides the challenge of implementing D-NUCA, I would also like to propose a potential implementation that considers the network and the NUCA design together.

Simply put, in my understanding, it is not easy in ZSim to make memory controllers, L3 slices, and other components share the same physical network. This follows from the abstraction that ZSim relies on, which has both pros and cons.

ZSim relies on a code structure in which the CC class is split into a Top Interface and a Bottom Interface (in coherence_ctrls.h), handling invalidations and accesses, respectively. This structure makes ZSim easy to abstract and to scale "vertically", e.g. by adding or removing a cache level in the memory hierarchy (like adding an L4 cache as a memory buffer or a victim cache to model Skylake or Haswell). Further, the interconnect between any two adjacent cache levels can follow the same top-bottom interface structure to connect those levels and model contention in some way. However, it might not be easy to implement "horizontal" communication with this abstraction.
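To make the "vertical" structure concrete, here is a minimal sketch. It is not ZSim's real code: the `Level` class, its methods, and the latencies are all made up for illustration, and every access is assumed to miss at every level. It only shows that accesses travel upward, invalidations travel downward, and an extra level (an L4 here) can be spliced in by relinking.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for one level of a vertical hierarchy.
// It is NOT zsim's real coherence controller; all names are assumptions.
class Level {
public:
    Level(std::string name, uint32_t latency)
        : name_(std::move(name)), latency_(latency) {}

    // Attach `child` directly below `parent`.
    static void link(Level& parent, Level& child) {
        assert(child.parent_ == nullptr);  // each level has a single parent
        child.parent_ = &parent;
        parent.children_.push_back(&child);
    }

    // "Bottom" side: an access only travels upward. For simplicity every
    // access misses at every level, so this returns the cold-miss latency.
    uint64_t access(uint64_t cycle) const {
        uint64_t done = cycle + latency_;
        return parent_ ? parent_->access(done) : done;
    }

    // "Top" side: an invalidation only travels downward. Returns how many
    // levels (including this one) the invalidation reached.
    uint32_t invalidate() const {
        uint32_t reached = 1;
        for (const Level* c : children_) reached += c->invalidate();
        return reached;
    }

    const std::string& name() const { return name_; }

private:
    std::string name_;
    uint32_t latency_;
    Level* parent_ = nullptr;
    std::vector<Level*> children_;
};

// Cold-miss completion cycle seen by an L2, with or without an extra L4.
// The latencies (10, 30, 50, 100 cycles) are arbitrary example values.
uint64_t coldMissLatency(bool withL4) {
    Level mem("mem", 100), l4("l4", 50), l3("l3", 30), l2("l2", 10);
    if (withL4) {
        Level::link(mem, l4);
        Level::link(l4, l3);
    } else {
        Level::link(mem, l3);
    }
    Level::link(l3, l2);
    return l2.access(0);
}
```

Adding the L4 only changes how the levels are linked, which is why vertical scaling is cheap in this style. There is no way in this sketch for two siblings to talk to each other, which is the limitation discussed next.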

"Horizontal" communication means that L2<->L3 slice packets and L3 slice<->memory controller (MC) packets all share the same network, so that all network contention is modeled faithfully, instead of using a vertical structure where L3<->MC traffic uses an upper-level interconnect while L2<->L3 traffic uses a separate lower-level interconnect. Making the MCs children of the L3 would violate ZSim's semantics, in which accesses only go from a lower level to an upper level and invalidations only go from an upper level to a lower level. Even if we keep ZSim's policy of prioritizing invalidations to avoid deadlocks between two adjacent cache levels, new deadlocks might still arise, because we would be communicating horizontally from one child to another.

I guess a potential way to solve this, without violating ZSim's current abstraction and structure, is to add a network interface module. It would virtualize message types so that the existing requirements are still met, while underneath it packages and forwards accesses and invalidations over the shared network, which enables a network-NUCA co-design.
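A rough sketch of what I mean by a network interface module is below. None of these types exist in ZSim; `Packet`, `SharedNetwork`, and `NetworkInterface` are hypothetical names, the nodes are assumed to sit on a 1D line, and contention and deadlock avoidance are deliberately left out. The point is only that L2->L3 slice traffic and L3 slice->MC traffic both end up as packets on one network object.

```cpp
#include <cassert>
#include <cstdint>

// All types below are hypothetical; none of them exist in zsim.
enum class MsgType { Access, Invalidate };

struct Packet {
    MsgType type;
    uint32_t src;
    uint32_t dst;
    uint64_t lineAddr;
};

// One physical network shared by L2s, L3 slices and MCs. Nodes are assumed
// to sit on a 1D line, so hop count is |src - dst|. No contention modeled.
class SharedNetwork {
public:
    explicit SharedNetwork(uint32_t hopLatency) : hopLatency_(hopLatency) {
        assert(hopLatency > 0);
    }

    // Returns the cycle at which the packet arrives at its destination.
    uint64_t send(const Packet& p, uint64_t cycle) {
        ++packets_;
        uint32_t hops = p.src > p.dst ? p.src - p.dst : p.dst - p.src;
        return cycle + static_cast<uint64_t>(hops) * hopLatency_;
    }

    uint64_t packets() const { return packets_; }

private:
    uint32_t hopLatency_;
    uint64_t packets_ = 0;
};

// Per-node adapter: the component above it still issues "accesses" and
// "invalidations"; the interface packages them as packets on the network.
class NetworkInterface {
public:
    NetworkInterface(SharedNetwork& net, uint32_t nodeId)
        : net_(net), nodeId_(nodeId) {}

    uint64_t forwardAccess(uint32_t dst, uint64_t lineAddr, uint64_t cycle) {
        return net_.send(Packet{MsgType::Access, nodeId_, dst, lineAddr}, cycle);
    }

    uint64_t forwardInvalidate(uint32_t dst, uint64_t lineAddr, uint64_t cycle) {
        return net_.send(Packet{MsgType::Invalidate, nodeId_, dst, lineAddr}, cycle);
    }

private:
    SharedNetwork& net_;
    uint32_t nodeId_;
};
```

For example, with an L2 at node 0, an L3 slice at node 3, and an MC at node 5, an L2 miss that also misses in the L3 slice would produce two packets on the same `SharedNetwork`, so a contention model attached to it would see both flows.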

BTW, on how to implement a network in ZSim: #14 gives a way to implement a particular topology without modeling contention; #117 seems to provide a network design with a dynamic contention model (M/D/1).
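For reference, the textbook mean waiting time of an M/D/1 queue (Poisson arrivals, fixed service time) is W_q = rho * S / (2 * (1 - rho)), where S is the service time and rho = lambda * S is the utilization. Below is a minimal sketch of that formula; I have not checked whether #117 uses exactly this form, and the function name and units (cycles) are my own assumptions.

```cpp
#include <cassert>
#include <cmath>

// Mean queueing delay (excluding service time) of an M/D/1 queue, in cycles.
// arrivalsPerCycle is the Poisson arrival rate (lambda); serviceCycles is
// the fixed service time (S). Hypothetical helper, not taken from zsim.
double md1WaitCycles(double arrivalsPerCycle, double serviceCycles) {
    assert(arrivalsPerCycle >= 0.0 && serviceCycles > 0.0);
    double rho = arrivalsPerCycle * serviceCycles;  // utilization
    assert(rho < 1.0);  // the queue is unstable at or above full utilization
    return rho * serviceCycles / (2.0 * (1.0 - rho));
}
```

As a quick check, a link with a 4-cycle service time that receives one packet every 8 cycles on average (rho = 0.5) gives a mean wait of 2 cycles, and the wait grows without bound as rho approaches 1.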