Closed seldridge closed 8 years ago
This is a deficiency in the design of the SimpleHellaCacheIF altogether--it shouldn't be performing an associative search at all!
It's not performing an associative search, it's simply checking the tag against the head of the queue to make sure it doesn't enqueue the replaying request again if the replayed request is itself nacked. In this case, since the MSB is being cut off, it thinks it's a different request being nacked and enqueues it. But that request cannot be dequeued, so it clogs up the queue.
This would be a problem anyway even if the replay queue wasn't using the tags. It breaks the expectation that the response tag is the same as the request tag. The solution is to either shrink the tag used by the RoCC accelerator or grow the tag used by the arbiter and cache. I'm more in favor of the former.
Problem
If the replay queue of SimpleHellaCacheIF (used for a RoCC) is provided with full-width tags, then that request will be unresolvable from the perspective of the replay queue.
The tag is 10 bits, but the LSBs of this are repurposed to encode the identity of the requester (core, ptw, or some RoCC). This nukes some number of MSBs, however, this happens after the replay queue has saved all 10 bits of the tag. When the request comes back from the cache the replay queue fails trying to match on an impossible 10-bit tag.
Details
Here a RoCC unit is generating a request with tag 0x2c0, the replay queue stores the tag as 0x2c0, and then tries to match on 0x0c0:
Possible Fix
This could be fixed by only storing the tag bits in the replay queue that will actually be used, e.g., changing the replay queue here:
This produces some wonky Verilog ("if X{} else {if X {}}"), but it does work.
If this makes sense, I can throw a pull request your way, but I expect there may be a better way to deal with this.