Closed: marmeladema closed this issue 4 years ago
Hi,
Just to confirm that I can reproduce that.
$ ./target/debug/examples/group
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51]
L1D cache misses/references: 110 / 10904 (1%)
branch prediction misses/total: 182 / 8447 (2%)
Counter id 218 has value 10904
Counter id 219 has value 110
Counter id 220 has value 8447
Counter id 221 has value 182
$ vim examples/group.rs
$ cargo build --examples
Compiling perf-event v0.4.2 (/tmp/perf-event)
warning: unused variable: `cycles`
--> examples/group.rs:17:9
|
17 | let cycles = Builder::new().group(&group).kind(Hardware::CPU_CYCLES).build()?;
| ^^^^^^ help: consider prefixing with an underscore: `_cycles`
|
= note: `#[warn(unused_variables)]` on by default
Finished dev [unoptimized + debuginfo] target(s) in 0.37s
$ ./target/debug/examples/group
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51]
L1D cache misses/references: 0 / 0 (NaN%)
branch prediction misses/total: 0 / 0 (NaN%)
Counter id 223 has value 0
Counter id 224 has value 0
Counter id 225 has value 0
Counter id 226 has value 0
Counter id 227 has value 0
Also perf seems to handle that fine:
$ sudo perf stat -e cpu-cycles,instructions,cache-misses -a sleep 10
Performance counter stats for 'system wide':
3,203,979,163 cpu-cycles
2,737,470,443 instructions # 0.85 insn per cycle
28,993,403 cache-misses
10.003352898 seconds time elapsed
Thanks
Thanks for the bug report - sorry for the slow reply!
If I remove other counters from that group, then I can add Hardware::CPU_CYCLES to the group and still get counts. Could the problem be that the group is too large?
I am not seeing any error codes returned by the kernel. However, it does say that the period of time for which the group was enabled was zero, which suggests that we're running into multiplexing:
Total time the event was enabled and running. Normally these values are the same. If more events are started than available counter slots on the PMU, then multiplexing happens and events run only part of the time. In that case, the time_enabled and time_running values can be used to scale an estimated value for the count.
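The scaling that the quoted text describes can be sketched roughly as follows. This is a minimal illustration, not part of perf-event: `scale_count` is a hypothetical helper name, and its inputs are assumed to be the raw count plus the two times the kernel reports, in the same units. When the running time is zero there is nothing to scale, so the sketch returns `None` instead of dividing by zero.

```rust
/// Estimate the full count for an event that was multiplexed.
///
/// `count` is the raw counter value; `time_enabled` and `time_running` are
/// the two times reported alongside it (assumed to share the same units).
/// Returns `None` when the event never ran, since no estimate is possible.
/// (Hypothetical helper for illustration; not an API of the perf-event crate.)
fn scale_count(count: u64, time_enabled: u64, time_running: u64) -> Option<u64> {
    if time_running == 0 {
        return None;
    }
    // Widen to u128 so that count * time_enabled cannot overflow.
    let scaled = (count as u128) * (time_enabled as u128) / (time_running as u128);
    // Saturate at u64::MAX in the unlikely case the estimate does not fit.
    Some(scaled.min(u64::MAX as u128) as u64)
}
```

A counter that ran for half of the time it was enabled has its count doubled, while a counter that never ran yields `None`, which a caller could report as "not counted" rather than as a count of 0.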
If you make the CPU_CYCLES counter an independent counter, and don't include it in the group, then it works.
This is a kernel limitation, not a problem with the library. From the perf kernel documentation:
Globally pinned events can limit the number of counters available for other groups. On x86 systems, the NMI watchdog pins a counter by default. The nmi watchdog can be disabled as root with
echo 0 > /proc/sys/kernel/nmi_watchdog
If I disable the NMI watchdog as suggested, then a group that contains the CPU_CYCLES counter works fine.
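A program could check the watchdog setting itself before building a large group. The sketch below only reads the file named in the kernel documentation quoted above; the function names are hypothetical, and it assumes the file holds a single `0` or `1`, which is how the `echo 0` command above writes it.

```rust
use std::fs;
use std::io;

/// Path of the sysctl file named in the kernel documentation quoted above.
const NMI_WATCHDOG_PATH: &str = "/proc/sys/kernel/nmi_watchdog";

/// Interpret the file's contents: "1" is taken as enabled, "0" as disabled.
/// Anything else yields `None`. (Hypothetical helper for illustration.)
fn parse_nmi_watchdog(contents: &str) -> Option<bool> {
    match contents.trim() {
        "0" => Some(false),
        "1" => Some(true),
        _ => None,
    }
}

/// Read the current watchdog setting. Returns an I/O error if the file does
/// not exist (for example on a non-Linux system) or cannot be read.
fn nmi_watchdog_enabled() -> io::Result<Option<bool>> {
    let contents = fs::read_to_string(NMI_WATCHDOG_PATH)?;
    Ok(parse_nmi_watchdog(&contents))
}
```

If `nmi_watchdog_enabled()` returns `Ok(Some(true))` and a group reads back all zeros, a caller could print a hint to disable the watchdog or to use a smaller group.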
I'll make the documentation mention this.
The Linux perf utility seems to be able to recognize when this has happened, and suggest disabling the watchdog (that's how I figured this out). I don't really understand how it knows when something has gone wrong; there are no errors returned from the kernel when the group example isn't working. The perf source code responsible for the hint isn't clear to me.
This may be covered by #5.
Hello!
First of all, thank you very much for this crate; it's exactly what I was looking for.
I was trying to use various counters and noticed that some of them appear to be incompatible. For example, I tried to add the CPU_CYCLES counter to https://github.com/jimblandy/perf-event/blob/master/examples/group.rs and suddenly all values are 0.
Is this a known issue with this crate that can be fixed? If so, I'd be happy to work on a PR with some guidance. I believe, but I might be wrong, that perf itself supports it?
Thank you again!