RRZE-HPC / likwid

Performance monitoring and benchmarking suite
https://hpc.fau.de/research/tools/likwid/
GNU General Public License v3.0

[DOCS] Does LIKWID count various events while in kernel mode? #492

Closed (ibogosavljevic closed 1 month ago)

ibogosavljevic commented 2 years ago

What are you searching for

Does the Linux kernel save/restore the value of INSTR_RETIRED_ANY when entering/exiting kernel space? I did some experiments, and it seems that the kernel saves the value of INSTR_RETIRED_ANY when entering kernel mode and restores it when returning to user mode. Can you confirm?

TomTheBear commented 2 years ago

As far as I know, the counters are only saved/restored if perf_event is active on the hardware thread, but this might have changed. In LIKWID's accessdaemon and direct modes, perf_event is not involved, so hopefully the counts are not affected. This changes, of course, if another process running on the same hardware thread is using perf_event. In perf_event mode, LIKWID relies on perf_event to count kernel mode properly (if configured). So it depends on the setup of your tests. Have you tested it with LIKWID or independently (e.g. with your own kernel module)?

If the kernel watchdog is running, perf_event is active on all cores, but it uses a cycle event. The direct and accessdaemon modes disable the watchdog in the startup phase and enable it again at the end to avoid any confusion.
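To check whether the watchdog might be interfering with a measurement, its state can be read from procfs. The following is a minimal sketch and not part of LIKWID; it assumes a Linux system that exposes the standard `/proc/sys/kernel/nmi_watchdog` file, and the function name `nmi_watchdog_enabled` is made up for illustration.

```python
from pathlib import Path

# Standard procfs location of the NMI watchdog switch on Linux.
NMI_WATCHDOG_PATH = Path("/proc/sys/kernel/nmi_watchdog")


def nmi_watchdog_enabled(path=NMI_WATCHDOG_PATH):
    """Return True or False for the watchdog state, or None if it is unknown.

    Hypothetical helper for illustration; LIKWID does its own handling.
    """
    try:
        text = Path(path).read_text().strip()
    except OSError:
        # File missing (e.g. watchdog not built in) or not readable.
        return None
    if text not in ("0", "1"):
        return None
    return text == "1"


if __name__ == "__main__":
    state = nmi_watchdog_enabled()
    if state is None:
        print("NMI watchdog state unknown")
    elif state:
        print("NMI watchdog enabled: perf_event is likely active on all cores")
    else:
        print("NMI watchdog disabled")
```

The path is a parameter so that the helper can be pointed at a different file, for example in a test.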

Moreover, while LIKWID provides the option to count in kernel mode, it is not the default mode, for a reason. I never really used it for research. It's unpredictable what the kernel does when context switches happen (bottom half handlers, ...), so every time I wanted to measure something at kernel level, I wrote a "kernel application" (aka kernel module) that accesses the counters directly. I have not used perf_event inside the kernel yet.
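For context on what "counting in kernel mode" means at the register level, here is a small sketch of the privilege-level filter bits. It assumes an Intel x86 core, where fixed counter 0 counts retired instructions and each fixed counter has a 4-bit field in IA32_FIXED_CTR_CTRL (MSR 0x38D) with an OS-enable bit and a USR-enable bit. The function `fixed_ctr_ctrl_field` is hypothetical and is not LIKWID's code; it only computes the bits and does not touch any MSR.

```python
# Per-counter enable bits in IA32_FIXED_CTR_CTRL (Intel, MSR 0x38D).
# Each fixed counter owns a 4-bit field; only the two ring filters are used here.
EN_OS = 0x1   # count while the core runs at ring 0 (kernel mode)
EN_USR = 0x2  # count while the core runs at ring > 0 (user mode)
BITS_PER_COUNTER = 4


def fixed_ctr_ctrl_field(counter_index, user=True, kernel=False):
    """Return the control bits for one fixed counter, shifted into place.

    Hypothetical helper for illustration only.
    """
    if counter_index < 0:
        raise ValueError("counter_index must be non-negative")
    field = 0
    if user:
        field |= EN_USR
    if kernel:
        field |= EN_OS
    return field << (BITS_PER_COUNTER * counter_index)


if __name__ == "__main__":
    # Fixed counter 0 (retired instructions), user mode only.
    print(hex(fixed_ctr_ctrl_field(0)))
    # Same counter, additionally counting in kernel mode.
    print(hex(fixed_ctr_ctrl_field(0, kernel=True)))
```

With only the USR bit set, instructions retired inside the kernel are simply not counted, which from user space can look similar to a counter that is saved on kernel entry and restored on exit.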

It is quite tedious to investigate the perf_event operations in the kernel, and I don't track changes to the perf_event subsystem, so I cannot really confirm your findings.

TomTheBear commented 1 month ago

I added some comments about counting in kernel mode to the docs. Can the issue be closed, or are there further questions?

ibogosavljevic commented 1 month ago

No more questions