Closed sxu55 closed 7 years ago
The "replay" in the L1 D$ is related to the non-blocking cache's miss handlers. When an access misses in the L1 D$, one of the miss handlers is initiated to fetch the missing cache line. The cache is non-blocking, so the core pipeline keeps going. When the line is fetched, the original access needs to be redone, so the miss handler issues a "replay" request back into the D$ pipeline. This logic also partly serves to prevent a cache-miss replay from conflicting with a register writeback in the WB stage.
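To make the flow concrete, here is a minimal behavioral sketch (plain Python, not the actual Chisel) of how a miss handler might park missed accesses and replay them once the refill arrives. The class and method names are hypothetical, not Rocket identifiers, and the 64-byte line size is an assumption:

```python
from collections import defaultdict

LINE_BYTES = 64  # assumed cache line size

class MissHandler:
    """Toy model of a non-blocking cache's miss handler (MSHR-style):
    accesses that miss are parked until the line is fetched, then
    'replayed' into the D$ pipeline in their original order."""

    def __init__(self):
        self.pending = defaultdict(list)

    def on_miss(self, addr):
        # Park the access; the core pipeline keeps going (non-blocking).
        self.pending[addr // LINE_BYTES].append(addr)

    def on_refill(self, line_addr):
        # Line arrived: hand the parked accesses back for replay.
        return self.pending.pop(line_addr, [])
```

For example, two misses to the same line (`0x100` and `0x104`) would both be parked, and `on_refill(0x100 // LINE_BYTES)` would return them for replay. The real hardware additionally arbitrates replays against the pipeline so they do not collide with a writeback in the WB stage.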
As for the AMO support in L2, it is not used in our current version of Rocket. The L1 D$ is write-allocate, so AMO instructions are served purely by the L1. Moving AMO handling to the L2 might allow better performance when two cores are constantly writing to the same cache line, and it might also be used to enforce ordering for memory consistency models. However, the above is my personal speculation.
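For reference, an AMOALU conceptually computes two values: the new value to write to memory and the old value to return to the core. A minimal sketch of that behavior (plain Python; the function name and the 64-bit width are assumptions, and only a subset of the RISC-V AMO ops is shown):

```python
def amoalu(op, old, operand, width=64):
    """Toy model of an AMOALU: given the old memory value and the
    operand from the core, return (value to store, value to return)."""
    mask = (1 << width) - 1
    ops = {
        "amoswap": lambda a, b: b,
        "amoadd":  lambda a, b: (a + b) & mask,
        "amoand":  lambda a, b: a & b,
        "amoor":   lambda a, b: a | b,
        "amoxor":  lambda a, b: a ^ b,
        "amomaxu": lambda a, b: max(a, b),
        "amominu": lambda a, b: min(a, b),
    }
    new = ops[op](old & mask, operand & mask)
    return new, old  # store the new value, return the old one to the core
```

Whether this ALU sits next to the L1 or in each L2 bank is a placement choice: doing it at the shared L2 avoids ping-ponging ownership of a hotly contended line between cores' L1s.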
The above two questions are about the Rocket core/chip, so they might be better submitted to the freechipsproject/rocket-chip repo instead.
Thanks a lot. I really appreciate your help!
I have two questions when looking at the rocket code:
I noticed that L1 cache in rocket core has something called "replay", but I am not sure what this is for. Could anyone share some info about it?
Also, I noticed that the L2HellaCache has some atomic-operation support (including an AMOALU in each bank). But I am wondering in what case the L2 will handle AMOs instead of the cores' L1s handling them themselves.
Thanks!