Open s9105947 opened 2 years ago
From now on, this issue is concerned only with the how and why of topdown metrics in lo2s (with a focus on Alder Lake); Alder Lake support in general has been moved to another issue.
The two possibilities to explore here are (besides not implementing at all due to cost):
I have looked at the example code for Alder Lake and the documentation at tools/perf/Documentation/topdown.txt, and it is practically identical to how you can read topdown events on Ice Lake.
There seem to be two ways to access topdown metrics:
In this access mode, a normal perf event exists for every topdown metric, such as cpu/topdown-fetch-bubbles.
Support for this is available on every architecture from Skylake onward, including Alder Lake. (However, Alder Lake itself currently suffers from lo2s not being hybrid-aware.)
This is what the topdown code from your Alder Lake example and tools/perf/Documentation/topdown.txt does.
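To make the per-event access mode more concrete, here is a minimal sketch of how level-1 topdown fractions could be derived once the raw event counts have been read. It assumes the Skylake-style event set (topdown-total-slots, topdown-slots-issued, topdown-slots-retired, topdown-fetch-bubbles, topdown-recovery-bubbles); only topdown-fetch-bubbles is named in this issue, the other names and the function `topdown_level1` are assumptions for illustration, and this is not how lo2s does it.

```python
def topdown_level1(total_slots, slots_issued, slots_retired,
                   fetch_bubbles, recovery_bubbles):
    """Return level-1 topdown fractions from raw perf event counts.

    Assumes Skylake-style topdown events; the parameter names mirror the
    assumed event names, they are not taken from lo2s.
    """
    if total_slots <= 0:
        raise ValueError("total_slots must be positive")

    # Slots in which the frontend delivered no uop.
    frontend_bound = fetch_bubbles / total_slots
    # Issued slots that never retired, plus slots lost to recovery.
    bad_speculation = (slots_issued - slots_retired + recovery_bubbles) / total_slots
    # Slots that did useful, retired work.
    retiring = slots_retired / total_slots
    # Whatever remains is attributed to the backend.
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)

    return {
        "frontend_bound": frontend_bound,
        "bad_speculation": bad_speculation,
        "retiring": retiring,
        "backend_bound": backend_bound,
    }


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    for name, value in topdown_level1(1000, 600, 500, 200, 50).items():
        print(f"{name}: {value:.2%}")
```

Since all five counts have to be read over the same interval for the fractions to add up, they would presumably need to be opened as one perf event group.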
With Ice Lake, a new way of reading topdown metrics was introduced, which uses the rdpmc instruction from userspace.
The documentation states that this is faster, but it only works on Intel CPUs from Ice Lake onward and requires a custom metric setup.
As discovered in the PR for Alder Lake support, the individual topdown metrics are not available as simple perf events on Alder Lake.
Dear maintainers,
I tried to perform Top-Down Microarchitecture Analysis on an Alder Lake processor using lo2s. However, lo2s does not find all required counters.
Steps to reproduce
Local test system
kallisto
Show available counters:
Now list all counters found by lo2s with
lo2s --list-events | grep topdown
Expected Result
All counters above are displayed. (8 for cpu_core, 4 for cpu_atom)
Actual Result
Only the cpu_atom (E-Core) counters are found
Additional Notes
This might be caused by the hybrid architecture of Alder Lake, which has "P-cores" and "E-cores". Hence, the sysfs files which typically reside in /sys/devices/cpu are split into /sys/devices/cpu_core (P-cores) and /sys/devices/cpu_atom (E-cores). Perhaps lo2s does not find all required files? (Notably, other cpu_core events are found, e.g. cpu_core/slots/.)
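To check which topdown events each PMU exposes independently of lo2s, a small script along these lines might help. It is only a sketch: it assumes the event names are listed as files in an `events` subdirectory of each PMU directory mentioned above, and the function name `list_topdown_events` is made up for this example.

```python
import os


def list_topdown_events(sysfs_root="/sys/devices",
                        pmus=("cpu", "cpu_core", "cpu_atom")):
    """Map each PMU directory that exists to its sorted topdown event names.

    Assumption: events live in <sysfs_root>/<pmu>/events, one file per event.
    """
    found = {}
    for pmu in pmus:
        events_dir = os.path.join(sysfs_root, pmu, "events")
        if not os.path.isdir(events_dir):
            # Non-hybrid systems only have "cpu"; hybrid ones only the split PMUs.
            continue
        found[pmu] = sorted(
            name for name in os.listdir(events_dir)
            # Skip companion files such as "<event>.scale" or "<event>.unit".
            if name.startswith("topdown") and "." not in name
        )
    return found


if __name__ == "__main__":
    for pmu, names in list_topdown_events().items():
        print(f"{pmu}: {len(names)} topdown events")
        for name in names:
            print(f"  {pmu}/{name}/")
```

Comparing this output with `lo2s --list-events | grep topdown` should show whether the cpu_core entries exist in sysfs but are skipped by lo2s.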