Closed jasonneverstop closed 1 year ago
Could you please provide the perf-collect.log? These metrics will not be calculated if the tool is run inside a VM where uncore PMUs are not exposed.
The logs for version 1.2.10 are as follows:
perf-collect.log
2023-05-26 18:56:02,558 INFO: nmi_watchdog disabled!
2023-05-26 18:56:02,890 WARNING: Perf unsupported events not counted: ['cstate_core/c6-residency/;', 'cstate_pkg/c6-residency/;']
2023-05-26 18:56:02,919 INFO: PMUs not in use
2023-05-26 18:56:02,919 INFO: Collecting perf stat for events in : clx_skx.txt
2023-05-26 18:56:13,332 INFO: Collection complete! Calculating TSC frequency now
2023-05-26 18:56:14,387 INFO: perf stat dumped to perfstat.csv
perf-postprocess.log
2023-05-26 18:58:03,005 INFO: Generated results file(s) in: /home/rjx/perfspect-1.2.10
2023-05-26 18:58:03,005 INFO: Done!
In addition, version 1.2.9 has all metrics
let me try to reproduce this
2023-05-26 18:56:02,890 WARNING: Perf unsupported events not counted: ['cstate_core/c6-residency/;', 'cstate_pkg/c6-residency/;']
I have a suspicion about what the issue might be. Since PerfSpect reports that your system does not support the cstate events, your kernel might not support uncore events either. In 1.2.10 we changed the way we detect uncore support from an MSR read to a filesystem read. Some older kernels (like CentOS 7) and perf versions don't support uncore even on a bare-metal instance. What is the output of the following command?
ls /sys/devices
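For anyone who would rather script this check than read the listing by eye, here is a minimal sketch of a filesystem-based check. It is not PerfSpect's actual detection code: the function name list_uncore_pmus and the sysfs_root parameter are hypothetical, chosen for illustration. The only assumption is that the kernel exposes uncore PMUs as entries named uncore_* under /sys/devices.

```python
import os


def list_uncore_pmus(sysfs_root="/sys/devices"):
    """Return the sorted names of uncore PMU entries found under sysfs_root.

    Hypothetical helper for illustration; not part of PerfSpect.
    """
    try:
        entries = os.listdir(sysfs_root)
    except OSError:
        # Directory missing or unreadable: treat as "no uncore PMUs visible".
        return []
    return sorted(name for name in entries if name.startswith("uncore_"))


if __name__ == "__main__":
    pmus = list_uncore_pmus()
    if pmus:
        print(f"{len(pmus)} uncore PMU entries found, e.g. {pmus[0]}")
    else:
        print("no uncore_* entries found; uncore events are likely unavailable")
```

An empty result points to a kernel (or VM) that does not expose the uncore, in which case uncore-based metrics cannot be collected.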
Oh, I did use the CentOS 7 system and the command output is as follows:
# skylake centos7 kernel 3.10
# ls /sys/devices/
breakpoint LNXSYSTM:00 pci0000:80 pnp0 uncore_cha_0 uncore_cha_13 uncore_cha_18 uncore_cha_22 uncore_cha_27 uncore_cha_7 uncore_iio_2 uncore_imc_2 uncore_irp_1 uncore_m2m_1 uncore_ubox
cpu pci0000:00 pci0000:85 power uncore_cha_1 uncore_cha_14 uncore_cha_19 uncore_cha_23 uncore_cha_3 uncore_cha_8 uncore_iio_3 uncore_imc_3 uncore_irp_2 uncore_m3upi_0 uncore_upi_0
intel_bts pci0000:17 pci0000:ae software uncore_cha_10 uncore_cha_15 uncore_cha_2 uncore_cha_24 uncore_cha_4 uncore_cha_9 uncore_iio_4 uncore_imc_4 uncore_irp_3 uncore_m3upi_1 uncore_upi_1
intel_cqm pci0000:3a pci0000:d7 system uncore_cha_11 uncore_cha_16 uncore_cha_20 uncore_cha_25 uncore_cha_5 uncore_iio_0 uncore_imc_0 uncore_imc_5 uncore_irp_4 uncore_m3upi_2 uncore_upi_2
intel_pt pci0000:5d platform tracepoint uncore_cha_12 uncore_cha_17 uncore_cha_21 uncore_cha_26 uncore_cha_6 uncore_iio_1 uncore_imc_1 uncore_irp_0 uncore_m2m_0 uncore_pcu virtual
The CentOS 8 system has the same problem.
# icx lake centos8.2 kernel 4.18
# cat /etc/centos-release
CentOS Linux release 8.2.2004 (Core)
[root@jd perfspect-1.2.10]# cat perf-collect.log
2023-06-02 11:09:45,252 INFO: nmi_watchdog disabled!
2023-06-02 11:09:45,296 INFO: disabling uncore (possibly in a vm?)
2023-06-02 11:09:45,296 WARNING: Due to lack of vPMU support, TMA L1 events will not be collected
2023-06-02 11:09:45,350 WARNING: Perf unsupported events not counted: ['cstate_core/c6-residency/;', 'cstate_pkg/c6-residency/;']
2023-06-02 11:09:45,363 INFO: PMUs not in use
2023-06-02 11:09:45,363 INFO: Collecting perf stat for events in : icx.txt
2023-06-02 11:09:50,468 INFO: Collection complete! Calculating TSC frequency now
2023-06-02 11:09:51,508 INFO: perf stat dumped to perfstat.csv
[root@jd perfspect-1.2.10]# ls /sys/devices/
breakpoint intel_bts kprobe msr pci0000:16 pci0000:4a pci0000:7e pci0000:80 pci0000:83 pci0000:f9 pci0000:ff pnp0 system uprobe
cpu intel_pt LNXSYSTM:00 pci0000:00 pci0000:30 pci0000:64 pci0000:7f pci0000:81 pci0000:e1 pci0000:fe platform software tracepoint virtual
The CentOS 8 system does not support uncore because there are no uncore_... devices listed under /sys/devices/ (which is why you see it disabling uncore). I tested a CentOS 7 bare-metal ICX system and it worked fine; let me try a Skylake.
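The two signals discussed above (the "disabling uncore" message and the list of unsupported events) can also be pulled out of a perf-collect.log programmatically. The following is only a sketch: summarize_collect_log is a hypothetical helper, not something PerfSpect ships, and it assumes the log lines keep the exact wording shown in this thread.

```python
import ast
import re

# Matches the warning line and captures the Python-style list that follows it.
UNSUPPORTED_RE = re.compile(r"Perf unsupported events not counted: (\[.*\])")


def summarize_collect_log(text):
    """Return (uncore_disabled, unsupported_events) for perf-collect.log text.

    Hypothetical helper for illustration; assumes the message wording
    seen in the logs posted in this thread.
    """
    uncore_disabled = False
    unsupported = []
    for line in text.splitlines():
        if "disabling uncore" in line:
            uncore_disabled = True
        match = UNSUPPORTED_RE.search(line)
        if match:
            # The log prints a list literal; literal_eval parses it safely.
            unsupported.extend(ast.literal_eval(match.group(1)))
    return uncore_disabled, unsupported
```

To use it, read the log file into a string and pass it in. A True first element means the collector turned uncore off for that run, so the NUMA, UPI, and memory bandwidth metrics will be missing from the results.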
So I ran a few tests in AWS with version 1.2.10:
- CentOS 7 (ami-08c191625cfb7ee61), ICX (m6i.metal): no uncore support
- CentOS 7 (ami-08c191625cfb7ee61), SKX (m5.metal): supports uncore and calculates 'metric_NUMA %_Reads addressed to local DRAM', 'metric_NUMA %_Reads addressed to remote DRAM', 'metric_UPI Data transmit BW (MB/sec) (only data)', and 'metric_memory bandwidth total (MB/sec)'
My guess is that CentOS 7 is so old that it hasn't been updated to support the uncore on newer hardware (ICX, SPR, etc.).
How can I obtain these metrics for ICX under CentOS 8?
From the output of ls /sys/devices/ on the CentOS 8 ICX system, it looks like the kernel lacks support for the ICX uncore. Given that CentOS 8 reached EOL in 2021 and ICX came out in 2021, I doubt it will get support. If you look at Kernel Support in the Readme, you'll see that support for newer architectures gets backported to different kernels on different distros, depending on which version is under long-term support at the time of the architecture's release. I know PerfSpect works on Ubuntu 16.04 and newer, CentOS 7, Amazon Linux 2, RHEL 9, Debian 11, etc., but beyond that we can't do much to update EOL'd distros.
Oh, I got it. Thank you for your patient explanation.
I found that some metrics were missing in the results of version 1.2.10, such as 'metric_NUMA %_Reads addressed to local DRAM', 'metric_NUMA %_Reads addressed to remote DRAM', 'metric_UPI Data transmit BW (MB/sec) (only data)', and 'metric_memory bandwidth total (MB/sec)'. Could you please help me identify the reasons behind this issue? Thank you.