jdmccalpin / low-overhead-timers

Very low-overhead timer/counter interfaces for C on Intel 64 processors.
BSD 3-Clause "New" or "Revised" License

Should provide both reference and actual cycles in a single call #2

Closed: hyc closed this issue 6 years ago

hyc commented 6 years ago

Since you're most likely going to need both rdpmc_actual_cycles() and rdpmc_reference_cycles() at the same time, to detect HALT intervals, there should be a function that returns both of them in one call.

jdmccalpin commented 6 years ago

If these were being used in kernel space, it would be possible (and desirable) to disable interrupts and collect multiple values atomically. In user space, this is not possible, and (especially with inlining) the overhead of having two consecutive function invocations should not result in a detectable increase in the number of interrupts that occur between the two RDPMC instructions.
Creating a single inline assembly macro that executes RDPMC on each of the fixed-function performance counters should make a negligible difference to the overhead (in cycles), which is dominated by the execution of the microcode in the RDPMC instructions themselves.
If you have a processor where you can demonstrate a noticeable difference in overhead by doing both RDPMCs in one function invocation, I would consider including it. In the absence of a significant difference in overhead, keeping the interface easy to remember seems like a higher priority.
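
For reference, a combined call of the kind requested might look roughly like the sketch below. It is illustrative only and not part of the repository: the names `cycle_pair_t`, `fixed_counter_select`, `rdpmc_raw`, `rdpmc_cycle_pair`, and `actual_to_reference_ratio` are hypothetical. It assumes the usual Intel fixed-function counter numbering (counter 1 for unhalted core cycles, counter 2 for unhalted reference cycles), a GCC-compatible compiler, and a kernel that has enabled user-mode RDPMC with the fixed counters running. Without that, the RDPMC instruction faults.

```c
#include <stdint.h>

/* Hypothetical type: one sample of both cycle counters. */
typedef struct {
    uint64_t actual;     /* assumed fixed-function counter 1: unhalted core cycles */
    uint64_t reference;  /* assumed fixed-function counter 2: unhalted reference cycles */
} cycle_pair_t;

/* RDPMC reads a fixed-function counter when bit 30 of ECX is set;
 * the low bits select which fixed-function counter. */
static inline uint32_t fixed_counter_select(uint32_t n)
{
    return (UINT32_C(1) << 30) + n;
}

#if defined(__x86_64__) || defined(__i386__)
/* Execute one RDPMC and combine EDX:EAX into a 64-bit value.
 * Faults unless the kernel has enabled user-mode RDPMC. */
static inline uint64_t rdpmc_raw(uint32_t select)
{
    uint32_t lo, hi;
    __asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(select));
    return ((uint64_t)hi << 32) | lo;
}

/* Two back-to-back RDPMCs in one call. The pair is NOT atomic:
 * an interrupt can still land between the two reads in user space. */
static inline cycle_pair_t rdpmc_cycle_pair(void)
{
    cycle_pair_t p;
    p.actual    = rdpmc_raw(fixed_counter_select(1));
    p.reference = rdpmc_raw(fixed_counter_select(2));
    return p;
}
#endif

/* Ratio of actual to reference cycles over an interval; returns 0.0
 * when no reference cycles elapsed, to avoid dividing by zero. */
static inline double actual_to_reference_ratio(cycle_pair_t start, cycle_pair_t end)
{
    uint64_t dref = end.reference - start.reference;
    if (dref == 0)
        return 0.0;
    return (double)(end.actual - start.actual) / (double)dref;
}
```

With inlining, this should compile to the same two RDPMC instructions as calling rdpmc_actual_cycles() and rdpmc_reference_cycles() consecutively, so any benefit would be in convenience rather than in measured overhead.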