google / benchmark

A microbenchmark support library
Apache License 2.0

[FR] Support Dynamic PMU detection #1377

Open damageboy opened 2 years ago

damageboy commented 2 years ago

Currently, on modern hardware where multiple PMU counters can be recorded in a single run (for example, Ice Lake with 16 concurrent PMU counters), the code in perf_counters.cc hard-codes a global limit of 3 counters.

I'd like to use libpfm's internal API to detect at runtime the PMU that each requested counter is associated with, and internally track how many counters are "consumed" from each PMU given the information retrieved from calling pfm_get_pmu_info() instead of the current hard-coded limit of 3 built into the code.
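The budgeting idea above can be sketched roughly as follows. This is a minimal illustration, not the code in perf_counters.cc: `PmuCapacity` and `PmuBudget` are hypothetical names, and the capacities are passed in by hand so the sketch builds without libpfm. In a real implementation the two capacity numbers would presumably come from the `num_cntrs` and `num_fixed_ctrs` fields of the `pfm_pmu_info_t` filled in by pfm_get_pmu_info().

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-PMU capacity. Assumption: with libpfm these values would
// be read from pfm_pmu_info_t (num_cntrs and num_fixed_ctrs).
struct PmuCapacity {
  int general;  // general-purpose (programmable) counters
  int fixed;    // fixed-function counters
};

// Hypothetical tracker of how many counters are "consumed" on each PMU.
class PmuBudget {
 public:
  // Registers a PMU together with the number of counters it offers.
  void AddPmu(const std::string& pmu, PmuCapacity capacity) {
    assert(capacity.general >= 0 && capacity.fixed >= 0);
    capacity_[pmu] = capacity;
  }

  // Tries to reserve one counter on `pmu`. Returns false if the PMU is
  // unknown or its budget for that counter kind is already exhausted.
  bool TryReserve(const std::string& pmu, bool is_fixed) {
    auto it = capacity_.find(pmu);
    if (it == capacity_.end()) return false;
    PmuCapacity& used = used_[pmu];  // value-initialized to {0, 0} on first use
    int& in_use = is_fixed ? used.fixed : used.general;
    const int limit = is_fixed ? it->second.fixed : it->second.general;
    if (in_use >= limit) return false;
    ++in_use;
    return true;
  }

 private:
  std::map<std::string, PmuCapacity> capacity_;
  std::map<std::string, PmuCapacity> used_;
};
```

A caller would register each detected PMU once, then call `TryReserve` for every requested counter and reject the request (or report which counter did not fit) as soon as it returns false.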

I am opening this issue in preparation for a PR that would implement this logic, and wanted to see whether it needs more discussion / blessing before I submit it. I have already started some preliminary work on tracking the requested counters against the availability of each PMU.

dmah42 commented 2 years ago

I think @mtrofin was looking at something similar...

damageboy commented 2 years ago

OK, let me know if there is already something in progress. I think I might be able to get something into PR form by the weekend, if you guys like it.

mtrofin commented 2 years ago

I was looking to build internal switching, i.e. assuming the limit is N, but the user wants P = kN+r counters, allow them to specify P counters and then, internally, execute the workload k+1 times.
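As a rough sketch of that switching arithmetic (the helper names `RunsNeeded` and `SplitIntoBatches` are hypothetical, not part of the library): with a per-run limit of N, P requested counters need ceil(P / N) executions, which is k+1 when r > 0 and k when r = 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Number of workload executions needed to collect `requested` counters when
// at most `limit` counters can be recorded per execution: ceil(requested / limit).
std::size_t RunsNeeded(std::size_t requested, std::size_t limit) {
  assert(limit > 0);
  return (requested + limit - 1) / limit;
}

// Splits the requested counter names into consecutive batches of at most
// `limit` names, one batch per workload execution.
std::vector<std::vector<std::string>> SplitIntoBatches(
    const std::vector<std::string>& names, std::size_t limit) {
  assert(limit > 0);
  std::vector<std::vector<std::string>> batches;
  for (std::size_t i = 0; i < names.size(); i += limit) {
    const std::size_t end = std::min(i + limit, names.size());
    batches.emplace_back(names.begin() + static_cast<std::ptrdiff_t>(i),
                         names.begin() + static_cast<std::ptrdiff_t>(end));
  }
  return batches;
}
```

For example, 7 requested counters with a limit of 3 give batches of 3, 3 and 1 names, so the workload would run three times.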

I believe this FR is orthogonal. One recommendation: please ensure the storage in PerfCounterValues is still inlined, to avoid risk of additional cache misses.
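To illustrate what "inlined" storage means here, the following is a minimal sketch under assumptions: `InlineCounterValues` and `kMaxInlineCounters` are hypothetical names, and 32 is an arbitrary illustrative bound, not a value taken from the library. The point is that the values live inside the object itself (a `std::array`), so reading them does not chase a pointer into a separate heap allocation the way a `std::vector` would.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical compile-time upper bound on the number of counters.
constexpr std::size_t kMaxInlineCounters = 32;

// Hypothetical container: a runtime-sized view over fixed inline storage.
class InlineCounterValues {
 public:
  explicit InlineCounterValues(std::size_t count) : count_(count) {
    assert(count <= kMaxInlineCounters);
    values_.fill(0);
  }

  std::size_t size() const { return count_; }

  std::uint64_t& operator[](std::size_t i) {
    assert(i < count_);
    return values_[i];
  }

  const std::uint64_t& operator[](std::size_t i) const {
    assert(i < count_);
    return values_[i];
  }

 private:
  // Stored inline in the object: no heap allocation, no extra indirection.
  std::array<std::uint64_t, kMaxInlineCounters> values_;
  std::size_t count_;
};
```

The trade-off is that the object always occupies space for the maximum number of counters, so a dynamically detected limit would only pick how many slots are in use, not how many are allocated.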

damageboy commented 2 years ago

@mtrofin thanks for pinging back. Yeah, your suggested feature, re-executing the workload until all the requested counters have been collected, is orthogonal and yet connected (in the sense that both would change the existing code base).

Is this something you have started working on, or are you just planning it at this point? I already have some basic code that tracks the counters and aggregates them into per-PMU, fixed/non-fixed counter "counts" for the "budgeting" aspect of this. I wouldn't mind rewriting it if you have a more mature branch.

mtrofin commented 2 years ago

I don't have anything done.