bytecodealliance / sightglass

A benchmark suite and tool to compare different implementations of the same primitives.
Apache License 2.0

Getting performance counters to work #257

Closed: matsbror closed this issue 1 year ago

matsbror commented 1 year ago

In a previous issue I reported that the cycle times measured by sightglass were orders of magnitude longer on x86 than on ARM and RISC-V. I therefore want to use performance counters instead, but every result is 0. I made sure that the test in counters.rs reports some data, so I know that the interface works. But when I run a benchmark, all I get is:

compilation
  benchmarks/blake3-scalar/benchmark.wasm
    cache-accesses
      [0 0.00 0] engines/wasmtime/libengine.so
    cache-misses
      [0 0.00 0] engines/wasmtime/libengine.so
    cpu-cycles
      [0 0.00 0] engines/wasmtime/libengine.so
    instructions-retired
      [0 0.00 0] engines/wasmtime/libengine.so

Any pointers to what could be the cause of this?

matsbror commented 1 year ago

I have tracked down the issue to the recording of the performance counter values. In the test method, meaningful counter values are read, but when run normally, they are all zero.

abrown commented 1 year ago

I don't get a great sense for what is going wrong here: does the perf-counters measure work on x86 but not on some other architecture? Is your sense that there is a problem with the perf_event crate, e.g., here? How would one replicate this?

matsbror commented 1 year ago

I am going to close this now. When I used --pin, the performance counters all returned 0; when I removed it, the counters worked.

abrown commented 1 year ago

Ok, maybe there's something wrong with the core_affinity crate, which does the pinning, or maybe we're using it wrong. Looking at bind_to_single_core, it appears that we are trying to change the affinity to the last available core ID. Maybe this has some interaction with perf? (I would have thought this would be fine.) The reason I never see this is that I usually run sightglass under something like taskset --cpu-list ..., which I have had better luck with.
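
For anyone trying to replicate this, one way to see what --pin or taskset actually did to the process is to look at the CPUs the kernel allows it to run on. The following is a minimal sketch, assuming Linux and the Cpus_allowed_list field in /proc/self/status. The helper names (parse_cpu_list, allowed_cpus, current_allowed_cpus) are hypothetical and not part of sightglass or core_affinity; the sketch only reports affinity and does not explain why the counters read 0.

```rust
use std::fs;

/// Parse a Linux CPU list such as "0-3,8" into individual core IDs.
/// Returns None if any part of the list is malformed.
fn parse_cpu_list(list: &str) -> Option<Vec<usize>> {
    let mut cores = Vec::new();
    for part in list.trim().split(',') {
        let part = part.trim();
        if part.is_empty() {
            return None;
        }
        match part.split_once('-') {
            // A range like "0-3" expands to every core ID in it.
            Some((lo, hi)) => {
                let lo: usize = lo.trim().parse().ok()?;
                let hi: usize = hi.trim().parse().ok()?;
                if lo > hi {
                    return None;
                }
                cores.extend(lo..=hi);
            }
            // A single core ID like "8".
            None => cores.push(part.parse().ok()?),
        }
    }
    Some(cores)
}

/// Extract the `Cpus_allowed_list` value from the text of /proc/<pid>/status.
fn allowed_cpus(status: &str) -> Option<Vec<usize>> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("Cpus_allowed_list:"))
        .and_then(parse_cpu_list)
}

/// Read the allowed cores of the current process (assumes Linux procfs).
fn current_allowed_cpus() -> Option<Vec<usize>> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    allowed_cpus(&status)
}
```

Calling current_allowed_cpus() before and after the pinning step should show the list shrink to a single core if the pinning took effect; the last element of the unpinned list corresponds to the "last available core ID" mentioned above.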