skyreflectedinmirrors closed 1 year ago
It was pointed out that L1<->L2 bandwidth is a better name for this, since it excludes instruction cache and scalar cache traffic.
Is the L1D Cache Accesses panel the only place we want to add this, @arghdos? To summarize, I:
L1-TCR -> L1-TCC
L1-L2 BW, using:
L1_L2_BW = 64B * (TCP_TCC_READ_REQ_sum + TCP_TCC_WRITE_REQ_sum + TCP_TCC_ATOMIC_WITH_RET_REQ_sum + TCP_TCC_ATOMIC_WITHOUT_RET_REQ_sum) / $denom
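As a minimal sketch of the formula above, the calculation can be written out in Python. The function name `l1_l2_bandwidth` and the dictionary of counter values are hypothetical and not part of Omniperf; the counter names and the 64-byte request size are taken from the expression above, and `denom` stands in for whatever normalization `$denom` resolves to.

```python
# Bytes moved per L1<->L2 request, taken from the 64B factor in the formula.
BYTES_PER_REQUEST = 64

# Counters summed in the formula: reads, writes, and both kinds of atomics.
L1_L2_REQUEST_COUNTERS = (
    "TCP_TCC_READ_REQ_sum",
    "TCP_TCC_WRITE_REQ_sum",
    "TCP_TCC_ATOMIC_WITH_RET_REQ_sum",
    "TCP_TCC_ATOMIC_WITHOUT_RET_REQ_sum",
)


def l1_l2_bandwidth(counters, denom):
    """Return L1<->L2 traffic in bytes per unit of `denom`.

    `counters` is a hypothetical mapping from counter name to its value;
    `denom` is the normalization value that $denom stands for.
    """
    total_requests = sum(counters[name] for name in L1_L2_REQUEST_COUNTERS)
    return BYTES_PER_REQUEST * total_requests / denom


if __name__ == "__main__":
    # Made-up counter values, for illustration only.
    example = {
        "TCP_TCC_READ_REQ_sum": 10,
        "TCP_TCC_WRITE_REQ_sum": 5,
        "TCP_TCC_ATOMIC_WITH_RET_REQ_sum": 2,
        "TCP_TCC_ATOMIC_WITHOUT_RET_REQ_sum": 3,
    }
    print(l1_l2_bandwidth(example, denom=4))
```

Because the request count is multiplied by 64 bytes, the result is in bytes per `denom` rather than requests per `denom`.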
Before / After: [screenshots not preserved]
Changed name of L1-TCR -> L1-TCC
Change this to match the others (L1-L2).
L1-L2 BW
I would prefer that we separate the requests from the bandwidth value (right now the order is L1-L2 Read Requests, L1-L2 Bandwidth, L1-L2 Write Requests, ...).
Additionally, make sure the units of L1-L2 BW are Bytes per $denom, not Requests per $denom as they are now.
Merged fixes. Closing ticket.
Omniperf currently does not report the achieved L2 bandwidth from the L1s, despite collecting the counters required to do so. Following the convention for L1 bandwidth calculations, this is essentially the total amount of data moved from L1<->L2, which can be calculated from the L1<->L2 requests, e.g.:
https://github.com/AMDResearch/omniperf/blob/main/src/omniperf_cli/configs/gfx90a/1600_L1_cache.yaml#L173
The L2 bandwidth calculation would be: