google / benchmark

A microbenchmark support library
Apache License 2.0
8.61k stars 1.57k forks source link

Fix broken PFM-enabled tests #1623

Closed macandy13 closed 1 year ago

macandy13 commented 1 year ago

Performance counter tests are broken (not sure if they ever worked or got broken over the last couple of prs). Sadly the way to fix most of them was to reduce test coverage quite a bit, but IMO the tests in this state are still ok to have around. Ideally a test using more than 6 HW counters would be great to exercise the grouping mechanism, but I couldn't find enough HW counters that can be used from a non-privileged test.

This was uncovered when trying to enable PFM test as part of github workflows. That sadly is infeasible because the CI machines do not support performance counters at all, so the tests will always fail.

HFTrader commented 10 months ago

I did some work on performance counters and yes, the performance counters tests were broken since then, more than a year ago, because they were built under unreasonable assumptions on how performance counters work.

On Github actions, and in any virtual machine in general, hw and hw/cache performance counters are generally not available. This is a feature of virtual machines, not a problem with the code itself.

I can't localize the original comments but my understanding after questioning is that those performance counter tests were there so anyone with a real machine and respective permissions could run the tests and verify that they were correct.

Note that even with all permissions is still possible for the tests to fail given the wild variety of hardware and cache profiles even under the Intel line.

However removing the code as was done takes away a tool that future developers would have to verify their work.

macandy13 commented 6 months ago

Sorry for the insanely late reply, @HFTrader. The reason I took these tests away was that it was very unclear under which circumstances they should work, so in a sense they are useless to some amount of people, depending on HW.

I guess we could bring them back in a different form - my first naive idea would be some binary that can be used to see which performance counters exist on the machine. But you might have better ideas.

Happy to collaborate on this, and apologies again for replying ultra-late.

HFTrader commented 6 months ago

I have so many things on my plate right now, I wouldn't be able to collaborate, realistically. Even then, if those tests are not benefitting anybody, what's the point.