andrewrk / poop

Performance Optimizer Observation Platform

Why the difference between the counter values reported by poop and perf-stat? #42

Closed aburdulescu closed 10 months ago

aburdulescu commented 10 months ago

For example:

% perf stat -r5 -e instructions,cycles,cache-references,cache-misses,branches,branch-misses ls
build.zig  LICENSE  README.md  src  zig-cache  zig-out
build.zig  LICENSE  README.md  src  zig-cache  zig-out
build.zig  LICENSE  README.md  src  zig-cache  zig-out
build.zig  LICENSE  README.md  src  zig-cache  zig-out
build.zig  LICENSE  README.md  src  zig-cache  zig-out

 Performance counter stats for 'ls' (5 runs):

         1.696.322      instructions              #    1,10  insn per cycle           ( +-  0,21% )
         1.432.362      cycles                                                        ( +-  9,94% )
           144.573      cache-references                                              ( +-  3,93% )
            35.966      cache-misses              #   26,049 % of all cache refs      ( +-  1,02% )
           363.919      branches                                                      ( +-  0,17% )
            10.495      branch-misses             #    2,89% of all branches          ( +-  1,89% )

          0,001857 +- 0,000170 seconds time elapsed  ( +-  9,18% )
% ./zig-out/bin/poop ls                           
Benchmark 1 (3316 runs): ls
  measurement          mean ± σ            min … max           outliers
  wall_time          1.45ms ±  102us     723us … 2.27ms        323 (10%)        
  peak_rss           2.86MB ± 60.2KB    2.72MB … 2.99MB          0 ( 0%)        
  cpu_cycles          372K  ± 37.1K      347K  …  946K         361 (11%)        
  instructions        421K  ± 61.0       421K  …  421K           0 ( 0%)        
  cache_references   28.0K  ± 2.60K     22.5K  … 34.7K           0 ( 0%)        
  cache_misses       9.02K  ±  482      8.07K  … 10.8K           0 ( 0%)        
  branch_misses      4.72K  ± 67.5      3.99K  … 5.04K         175 ( 5%)

As shown above, poop reports lower values than perf stat for every hardware counter. I'm not that familiar with the perf API or how each tool uses it, but I would have expected the same (or close enough) values from both tools when the same command is measured.
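To put a number on the gap, here is a small sketch that divides the perf stat means by the poop means for the counters both tools report. The values are copied by hand from the two outputs above (perf stat prints `.` as the thousands separator in this locale, and poop prints rounded means such as `421K`), so the ratios are approximate. The dictionary names are only illustrative.

```python
# Mean counter values transcribed from the outputs above.
# perf stat: "1.696.322" uses "." as a thousands separator.
perf_stat = {
    "instructions": 1_696_322,
    "cycles": 1_432_362,
    "cache-references": 144_573,
    "cache-misses": 35_966,
    "branch-misses": 10_495,
}

# poop: rounded means ("421K" -> 421_000), keyed by the perf stat event
# names so the two tables line up (poop calls them cpu_cycles, etc.).
poop = {
    "instructions": 421_000,
    "cycles": 372_000,
    "cache-references": 28_000,
    "cache-misses": 9_020,
    "branch-misses": 4_720,
}

# Ratio of perf stat's count to poop's count for each shared counter.
ratios = {name: perf_stat[name] / poop[name] for name in perf_stat}

for name, ratio in ratios.items():
    print(f"{name:<17} {ratio:.2f}x")
```

With these inputs every ratio comes out between roughly 2.2x and 5.2x, so the difference is far larger than the run-to-run variance either tool reports.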

aburdulescu commented 10 months ago

Found the reason: https://github.com/andrewrk/poop/blob/65ee26421393283c6f7211c309ee3718e917af7b/src/main.zig#L195

perf stat gives results matching poop's when run with the --all-user flag, which restricts counting to user space.
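For anyone who wants to repeat the comparison, the sketch below builds the perf stat command line with `--all-user` added to the invocation from the first comment. The helper name `build_perf_stat_argv` and its parameters are made up for illustration; only the `perf stat` options themselves (`--all-user`, `-r`, `-e`, and the `--` separator) come from perf. It only prints the command and does not run it.

```python
import shlex

def build_perf_stat_argv(command, events, runs=5, user_only=True):
    """Return an argv list for `perf stat` (hypothetical helper).

    user_only=True adds --all-user so that only user-space events
    are counted.
    """
    argv = ["perf", "stat"]
    if user_only:
        argv.append("--all-user")
    argv += [f"-r{runs}", "-e", ",".join(events)]
    # "--" separates perf's own options from the measured command.
    argv += ["--"] + list(command)
    return argv

# Same events as in the original report.
events = [
    "instructions", "cycles", "cache-references",
    "cache-misses", "branches", "branch-misses",
]

argv = build_perf_stat_argv(["ls"], events)
print(shlex.join(argv))
```

Passing `user_only=False` reproduces the original invocation, which makes it easy to run both variants back to back and compare them against poop.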