chipsalliance / rocket-chip

Rocket Chip Generator
Other
3.28k stars 1.14k forks source link

[rocket] Using all dcache tag bits causes the replay queue to get clogged #192

Closed seldridge closed 8 years ago

seldridge commented 8 years ago

Problem

If the replay queue of SimpleHellaCacheIF (used for a RoCC) is provided with full-width tags, then that request will be unresolvable from the perspective of the replay queue.

The tag is 10 bits, but the LSBs of this are repurposed to encode the identity of the requester (core, ptw, or some RoCC). This nukes some number of MSBs, however, this happens after the replay queue has saved all 10 bits of the tag. When the request comes back from the cache the replay queue fails trying to match on an impossible 10-bit tag.

Details

  1. The replay queue gets a full-width tag and stores it: https://github.com/ucb-bar/rocket/blob/master/src/main/scala/nbdcache.scala#L1194
  2. The arbiter appends the arbiter index bits nuking the MSBs: https://github.com/ucb-bar/rocket/blob/master/src/main/scala/arbiter.scala#L35
  3. Response comes back with nuked MSBs and the replay queue is trying to match on a tag that cannot exist: https://github.com/ucb-bar/rocket/blob/master/src/main/scala/nbdcache.scala#L1171

Here a RoCC unit is generating a request with tag 0x2c0, the replay queue stores the tag as 0x2c0, and then tries to match on 0x0c0:

[INFO] MyRoCC: Mem request from core with tag 0x2c0 for addr 0x008ffffac0
[DEBUG] ReplayQ: valid response with tag 0x0c0, no hit!
// Dumping out the tags
[DEBUG] ReplayQ: inflight[0].tag: 0x2b8
[DEBUG] ReplayQ: inflight[1].tag: 0x2c0

Possible Fix

This could be fixed by only storing the tag bits in the replay queue that will actually be used, e.g., changing the replay queue here:

  when (io.req.fire()) {
    val arbBits = 1 + log2Up(p(BuildRoCC).size + (if (p(UseVM)) 1 else 0))
    reqs(next_inflight) := io.req.bits
    reqs(next_inflight).tag := io.req.bits.tag(coreDCacheReqTagBits - arbBits - 1, 0)
  }

This produces some wonky Verilog ("if X{} else {if X {}}"), but it does work.

If this makes sense, I can throw a pull request your way, but I expect there may be a better way to deal with this.

aswaterman commented 8 years ago

This is a deficiency in the design of the SimpleHellaCacheIF altogether--it shouldn't be performing an associative search at all!

zhemao commented 8 years ago

It's not performing an associative search, it's simply checking the tag against the head of the queue to make sure it doesn't enqueue the replaying request again if the replayed request is itself nacked. In this case, since the MSB is being cut off, it thinks it's a different request being nacked and enqueues it. But that request cannot be dequeued, so it clogs up the queue.

This would be a problem anyway even if the replay queue wasn't using the tags. It breaks the expectation that the response tag is the same as the request tag. The solution is to either shrink the tag used by the RoCC accelerator or grow the tag used by the arbiter and cache. I'm more in favor of the former.