intel / PerfSpect

System performance characterization tool based on linux perf
https://intel.github.io/PerfSpect/
BSD 3-Clause "New" or "Revised" License

metric missing #35

Closed jasonneverstop closed 1 year ago

jasonneverstop commented 1 year ago

I found that some metrics were missing in the results of version 1.2.10, such as 'metric_NUMA %_Reads addressed to local DRAM', 'metric_NUMA %_Reads addressed to remote DRAM', 'metric_UPI Data transmit BW (MB/sec) (only data)', and 'metric_memory bandwidth total (MB/sec)'. Could you please help me identify the reasons behind this issue? Thank you.

hilldani commented 1 year ago

Could you please provide the perf-collect.log? These metrics will not be calculated if the tool is run inside a VM where uncore PMUs are not exposed.
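
One quick way to check whether the machine is a VM is to look for the hypervisor CPU flag that Linux lists in /proc/cpuinfo for most guests. The following is a minimal sketch, not something PerfSpect itself does as far as this thread shows; the helper name has_hypervisor_flag is hypothetical. A missing flag does not prove that uncore PMUs are available, since bare-metal systems can lack uncore support too.

```python
def has_hypervisor_flag(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line in /proc/cpuinfo text lists 'hypervisor'.

    Hypothetical helper for illustration; it only inspects the text it is given.
    """
    for line in cpuinfo_text.splitlines():
        # Lines look like "flags<TAB><TAB>: fpu vme ... hypervisor ..."
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "hypervisor" in value.split():
            return True
    return False
```

On a Linux host it could be called as has_hypervisor_flag(open("/proc/cpuinfo").read()).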

jasonneverstop commented 1 year ago

The logs for version 1.2.10 are as follows:

perf-collect.log

2023-05-26 18:56:02,558 INFO: nmi_watchdog disabled!
2023-05-26 18:56:02,890 WARNING: Perf unsupported events not counted: ['cstate_core/c6-residency/;', 'cstate_pkg/c6-residency/;']
2023-05-26 18:56:02,919 INFO: PMUs not in use
2023-05-26 18:56:02,919 INFO: Collecting perf stat for events in : clx_skx.txt
2023-05-26 18:56:13,332 INFO: Collection complete! Calculating TSC frequency now
2023-05-26 18:56:14,387 INFO: perf stat dumped to perfstat.csv

perf-postprocess.log

2023-05-26 18:58:03,005 INFO: Generated results file(s) in: /home/rjx/perfspect-1.2.10
2023-05-26 18:58:03,005 INFO: Done!

In addition, version 1.2.9 reports all of these metrics.

hilldani commented 1 year ago

let me try to reproduce this

hilldani commented 1 year ago

2023-05-26 18:56:02,890 WARNING: Perf unsupported events not counted: ['cstate_core/c6-residency/;', 'cstate_pkg/c6-residency/;']

I have a suspicion about what the issue might be. Given that PerfSpect reports that your system does not support cstate events, your kernel might not support uncore events either. In 1.2.10 we changed the way we detect uncore support from an MSR read to a filesystem read. Some older kernels (like the one in CentOS 7) and older perf versions don't support uncore even on a bare-metal instance. What is the output of ls /sys/devices?
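
To illustrate what a filesystem-based check can look like, here is a minimal sketch that lists the uncore_* entries under /sys/devices. It is an assumption about the general approach, not the actual PerfSpect 1.2.10 code; the function names list_uncore_devices and uncore_exposed are hypothetical.

```python
import os
from typing import List


def list_uncore_devices(sys_devices: str = "/sys/devices") -> List[str]:
    """Return the sorted entry names under sys_devices that start with 'uncore_'.

    Hypothetical helper: returns an empty list if the directory cannot be read.
    """
    try:
        entries = os.listdir(sys_devices)
    except OSError:
        return []
    return sorted(name for name in entries if name.startswith("uncore_"))


def uncore_exposed(sys_devices: str = "/sys/devices") -> bool:
    """Treat uncore as exposed if at least one uncore_* entry exists."""
    return bool(list_uncore_devices(sys_devices))
```

An empty result corresponds to an ls /sys/devices listing with no uncore_... entries at all.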

jasonneverstop commented 1 year ago

Oh, I did use the CentOS 7 system and the command output is as follows:

# skylake  centos7 kernel 3.10 
# ls /sys/devices/
breakpoint  LNXSYSTM:00  pci0000:80  pnp0        uncore_cha_0   uncore_cha_13  uncore_cha_18  uncore_cha_22  uncore_cha_27  uncore_cha_7  uncore_iio_2  uncore_imc_2  uncore_irp_1  uncore_m2m_1    uncore_ubox
cpu         pci0000:00   pci0000:85  power       uncore_cha_1   uncore_cha_14  uncore_cha_19  uncore_cha_23  uncore_cha_3   uncore_cha_8  uncore_iio_3  uncore_imc_3  uncore_irp_2  uncore_m3upi_0  uncore_upi_0
intel_bts   pci0000:17   pci0000:ae  software    uncore_cha_10  uncore_cha_15  uncore_cha_2   uncore_cha_24  uncore_cha_4   uncore_cha_9  uncore_iio_4  uncore_imc_4  uncore_irp_3  uncore_m3upi_1  uncore_upi_1
intel_cqm   pci0000:3a   pci0000:d7  system      uncore_cha_11  uncore_cha_16  uncore_cha_20  uncore_cha_25  uncore_cha_5   uncore_iio_0  uncore_imc_0  uncore_imc_5  uncore_irp_4  uncore_m3upi_2  uncore_upi_2
intel_pt    pci0000:5d   platform    tracepoint  uncore_cha_12  uncore_cha_17  uncore_cha_21  uncore_cha_26  uncore_cha_6   uncore_iio_1  uncore_imc_1  uncore_irp_0  uncore_m2m_0  uncore_pcu      virtual

The CentOS 8 system has the same problem.

# icelake  centos8.2  kernel 4.18

# cat /etc/centos-release
CentOS Linux release 8.2.2004 (Core)
[root@jd perfspect-1.2.10]# cat perf-collect.log
2023-06-02 11:09:45,252 INFO: nmi_watchdog disabled!
2023-06-02 11:09:45,296 INFO: disabling uncore (possibly in a vm?)
2023-06-02 11:09:45,296 WARNING: Due to lack of vPMU support, TMA L1 events will not be collected
2023-06-02 11:09:45,350 WARNING: Perf unsupported events not counted: ['cstate_core/c6-residency/;', 'cstate_pkg/c6-residency/;']
2023-06-02 11:09:45,363 INFO: PMUs not in use
2023-06-02 11:09:45,363 INFO: Collecting perf stat for events in : icx.txt
2023-06-02 11:09:50,468 INFO: Collection complete! Calculating TSC frequency now
2023-06-02 11:09:51,508 INFO: perf stat dumped to perfstat.csv

[root@jd perfspect-1.2.10]# ls /sys/devices/
breakpoint  intel_bts  kprobe       msr         pci0000:16  pci0000:4a  pci0000:7e  pci0000:80  pci0000:83  pci0000:f9  pci0000:ff  pnp0      system      uprobe
cpu         intel_pt   LNXSYSTM:00  pci0000:00  pci0000:30  pci0000:64  pci0000:7f  pci0000:81  pci0000:e1  pci0000:fe  platform    software  tracepoint  virtual

hilldani commented 1 year ago

The CentOS 8 system does not support uncore because there are no uncore_... devices listed under /sys/devices/ (which is why you see it disabling uncore). I tested a CentOS 7 bare-metal ICX system and it worked fine; let me try a Skylake.
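
The same symptoms can be pulled out of a perf-collect.log automatically. The sketch below scans log text for the "disabling uncore" line and the "Perf unsupported events not counted" warning, using the message wording seen in the 1.2.10 logs in this thread; other versions may word these messages differently, and summarize_collect_log is a hypothetical name.

```python
import ast
import re
from typing import Dict, List, Union

# Matches the Python-style list printed after the warning text.
UNSUPPORTED_RE = re.compile(r"Perf unsupported events not counted: (\[.*\])")


def summarize_collect_log(log_text: str) -> Dict[str, Union[bool, List[str]]]:
    """Report whether uncore was disabled and which events perf did not count."""
    uncore_disabled = False
    unsupported: List[str] = []
    for line in log_text.splitlines():
        if "disabling uncore" in line:
            uncore_disabled = True
        match = UNSUPPORTED_RE.search(line)
        if match:
            # The log prints a list literal of strings; parse it safely.
            events = ast.literal_eval(match.group(1))
            unsupported.extend(event.rstrip(";") for event in events)
    return {"uncore_disabled": uncore_disabled, "unsupported_events": unsupported}
```

Run against the CentOS 8 log above, this would flag uncore as disabled and list the two cstate events.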

hilldani commented 1 year ago

So I ran a few tests in AWS with version 1.2.10:

  • CentOS 7 (ami-08c191625cfb7ee61), ICX (m6i.metal): no uncore support
  • CentOS 7 (ami-08c191625cfb7ee61), SKX (m5.metal): supports uncore and calculates 'metric_NUMA %_Reads addressed to local DRAM', 'metric_NUMA %_Reads addressed to remote DRAM', 'metric_UPI Data transmit BW (MB/sec) (only data)', and 'metric_memory bandwidth total (MB/sec)'

My guess is that CentOS 7 is so old that it hasn't been updated to support the uncore on newer hardware (ICX, SPR, etc.).

jasonneverstop commented 1 year ago

How can I obtain these metrics for ICX under CentOS 8?

hilldani commented 1 year ago

From the output of ls /sys/devices/ on the CentOS 8 ICX system, it looks like the kernel lacks support for the ICX uncore. Given that CentOS 8 reached EOL in 2021 and ICX came out in 2021, I doubt it will get support. If you look at Kernel Support in the Readme, you'll see that support for newer architectures gets backported to different kernels on different distros, depending on which version is in long-term support at the time of the architecture release. I know PerfSpect works on Ubuntu 16.04 and newer, CentOS 7, Amazon Linux 2, RHEL 9, Debian 11, etc., but beyond that we can't do much to update EOL'd distros.

jasonneverstop commented 1 year ago

Oh, I got it. Thank you for your patient explanation.