ucb-cyarp / vitis

Laminar - Optimizing DSP Compiler
BSD 3-Clause "New" or "Revised" License

Implement Performance Counter Telemetry Reporting Per Partition #55

Closed cyarp closed 3 years ago

cyarp commented 4 years ago

This will mainly be used when comparing telemetry to the model estimate of performance. Of particular interest are the number of instructions executed, the number of floating point ops executed, and the number of cycles executed.
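As a rough illustration of that comparison, the sketch below derives per-partition rates from the three counters and computes a signed relative error against a model estimate. The names `PartitionCounters`, `instructionsPerCycle`, `fpOpsPerCycle`, and `relativeError` are hypothetical and are not taken from the Laminar code base.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical container for the raw counters of one partition.
struct PartitionCounters {
    std::uint64_t instructions;  // instructions executed
    std::uint64_t fpOps;         // floating-point ops executed
    std::uint64_t cycles;        // cycles executed
};

// Instructions per cycle; returns 0 when no cycles were recorded.
inline double instructionsPerCycle(const PartitionCounters& c) {
    return c.cycles == 0
        ? 0.0
        : static_cast<double>(c.instructions) / static_cast<double>(c.cycles);
}

// FP ops per cycle; returns 0 when no cycles were recorded.
inline double fpOpsPerCycle(const PartitionCounters& c) {
    return c.cycles == 0
        ? 0.0
        : static_cast<double>(c.fpOps) / static_cast<double>(c.cycles);
}

// Signed relative error of a measurement against the model estimate:
// (measured - estimated) / estimated. Returns NaN for a zero estimate.
inline double relativeError(double measured, double estimated) {
    if (estimated == 0.0) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    return (measured - estimated) / estimated;
}
```

A positive relative error here means the partition took more (cycles, instructions, or FP ops) than the model predicted.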

Note that the Epyc system does not provide an AVX/SSE instruction counter but rather an FP op counter. Because a single packed instruction can perform several FP ops, the number of integer ops is not necessarily the number of instructions minus the number of FP ops.
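A small worked example of why the subtraction fails, as a sketch only. It assumes the FP op counter records one op per vector lane of a packed instruction; the exact counting rule is hardware-specific and should be checked against the CPU documentation. All names (`CounterSample`, `packedLoopSample`, `naiveIntegerOpEstimate`) are hypothetical.

```cpp
#include <cstdint>

// Hypothetical pair of counter readings for one measured region.
struct CounterSample {
    std::int64_t instructions;  // retired instructions
    std::int64_t fpOps;         // retired FP ops (not FP instructions)
};

// Counter values for an idealized loop that retires, per iteration, one
// packed FP instruction plus some integer instructions.
// ASSUMPTION: the FP op counter adds one op per vector lane.
inline CounterSample packedLoopSample(std::int64_t iterations,
                                      std::int64_t lanesPerFpInstruction,
                                      std::int64_t integerInstructionsPerIteration) {
    return CounterSample{
        iterations * (1 + integerInstructionsPerIteration),
        iterations * lanesPerFpInstruction};
}

// The tempting estimate of integer ops, which is not valid in general.
inline std::int64_t naiveIntegerOpEstimate(const CounterSample& s) {
    return s.instructions - s.fpOps;
}
```

With 1000 iterations, 8 lanes per packed instruction, and 1 integer instruction per iteration, the sample holds 2000 instructions and 8000 FP ops, so the naive estimate comes out negative even though 1000 integer instructions were executed.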

I did some digging and the PAPI library (which I think I may have come across before) came up. It did not provide counts of FP ops for the Epyc system, and I am in the process of adding them. PAPI provides a semi-model-agnostic method of accessing performance counters (which is something we want) but also provides convenience functions for accessing performance counters native to the CPU (as enabled by the kernel). Because of that, we may wind up specifying some native events that have no generic PAPI equivalent.
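One way to organize the "generic first, native as fallback" choice is sketched below. It deliberately does not call PAPI: availability is checked through a caller-supplied predicate, which in a real build might wrap a PAPI lookup. The types `EventRequest` and `ResolvedEvent`, the function `resolveEvents`, and any native event name passed in are hypothetical.

```cpp
#include <functional>
#include <string>
#include <vector>

// One quantity to measure, with a portable event name and an optional
// CPU-specific fallback. All names are supplied by the caller.
struct EventRequest {
    std::string quantity;      // e.g. "fp_ops"
    std::string genericEvent;  // portable event name
    std::string nativeEvent;   // CPU-specific name; empty if none is known
};

// The event that was actually selected for a quantity.
struct ResolvedEvent {
    std::string quantity;
    std::string eventName;
    bool isNative;  // true when the native fallback was used
};

// For each request, prefer the generic event when it is available and
// fall back to the native event otherwise. Requests with no usable
// event are omitted from the result.
inline std::vector<ResolvedEvent> resolveEvents(
        const std::vector<EventRequest>& requests,
        const std::function<bool(const std::string&)>& isAvailable) {
    std::vector<ResolvedEvent> resolved;
    for (const EventRequest& r : requests) {
        if (!r.genericEvent.empty() && isAvailable(r.genericEvent)) {
            resolved.push_back({r.quantity, r.genericEvent, false});
        } else if (!r.nativeEvent.empty() && isAvailable(r.nativeEvent)) {
            resolved.push_back({r.quantity, r.nativeEvent, true});
        }
    }
    return resolved;
}
```

Recording `isNative` per quantity keeps it visible in the report which measurements depend on a particular CPU model.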