intel / pcm

Intel® Performance Counter Monitor (Intel® PCM)
BSD 3-Clause "New" or "Revised" License

Shows `-nan` in output #811

Closed paulmenzel closed 2 months ago

paulmenzel commented 3 months ago

On the Intel Kaby Lake laptop Dell XPS 13 9360, -nan is shown in the output:

$ uname -a
Linux abreu 6.11.0-rc2-00315-g7006fe2f7f78 #263 SMP PREEMPT_DYNAMIC Sun Aug 11 21:32:49 CEST 2024 x86_64 GNU/Linux
$ sudo dmesg | grep -e "DMI:" -e "Linux version" -e microcode -e "smpboot: CPU0"
[    0.000000] Linux version 6.11.0-rc2-00315-g7006fe2f7f78 (build@bohemianrhapsody.molgen.mpg.de) (gcc (Debian 13.3.0-2) 13.3.0, GNU ld (GNU Binutils for Debian) 2.42.50.20240710) #263 SMP PREEMPT_DYNAMIC Sun Aug 11 21:32:49 CEST 2024
[    0.000000] DMI: Dell Inc. XPS 13 9360/0596KF, BIOS 2.21.0 06/02/2022
[    0.000000] DMI: Memory slots populated: 2/2
[    0.063296] smpboot: CPU0: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz (family: 0x6, model: 0x8e, stepping: 0x9)
[    0.354339] microcode: Current revision: 0x000000f4
[    0.354341] microcode: Updated early from: 0x000000f0
$ git log --oneline --no-decorate -1
db66fd7 Merge pull request #809 from intel/push-2024-08-09
$ sudo modprobe msr
$ sudo ./bin/pcm
 UTIL  : utlization (same as core C0 state active state residency, the value is in 0..1) 
 IPC   : instructions per CPU cycle
 CFREQ : core frequency in Ghz
 L3MISS: L3 (read) cache misses 
 L3HIT : L3 (read) cache hit ratio (0.00-1.00)
 L3MPI : number of L3 (read) cache misses per instruction
 L2MPI : number of L2 (read) cache misses per instruction
 READ  : bytes read from main memory controller (in GBytes)
 WRITE : bytes written to main memory controller (in GBytes)
 IO    : bytes read/written due to IO requests to memory controller (in GBytes); this may be an over estimate due to same-cache-line partial requests
 IA    : bytes read/written due to IA requests to memory controller (in GBytes); this may be an over estimate due to same-cache-line partial requests
 GT    : bytes read/written due to GT requests to memory controller (in GBytes); this may be an over estimate due to same-cache-line partial requests
 TEMP  : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature
 energy: Energy in Joules

 Core (SKT) | UTIL | IPC  | CFREQ | L3MISS | L2MISS | L3HIT | L3MPI | L2MPI |  TEMP

   0    0     0.00   -1.00    -0.00       0        0      0.00  -nan  -nan     55
   1    0     0.00   -1.00    -0.00       0        0      0.00  -nan  -nan     55
   2    0     0.00   -1.00    -0.00       0        0      0.00  -nan  -nan     55
   3    0     0.00   -1.00    -0.00       0        0      0.00  -nan  -nan     55
---------------------------------------------------------------------------------------------------------------
 SKT    0     0.00   -1.00    -0.00       0        0      0.00  -nan  -nan     53
---------------------------------------------------------------------------------------------------------------
 TOTAL  *     0.00   -1.00    -0.00       0        0      0.00  -nan  -nan     N/A

 Instructions retired:    0   ; Active cycles:    0   ; Time (TSC): 2906 Mticks ; C0 (active,non-halted) core residency: 0.00 %

 C1 core residency: 14.97 %; C3 core residency: 0.13 %; C6 core residency: 1.46 %; C7 core residency: 83.43 %;
 C0 package residency: 24.07 %; C2 package residency: 14.29 %; C3 package residency: 61.64 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %; C8 package residency: 0.00 %; C9 package residency: 0.00 %; C10 package residency: 0.00 %;
                             ┌────────────────────────────────────────────────────────────────────────────────┐
 Core    C-state distribution│11111111111167777777777777777777777777777777777777777777777777777777777777777777│
                             └────────────────────────────────────────────────────────────────────────────────┘
                             ┌───────────────────────────────────────────────────────────────────────────────┐
 Package C-state distribution│0000000000000000000222222222223333333333333333333333333333333333333333333333333│
                             └───────────────────────────────────────────────────────────────────────────────┘
---------------------------------------------------------------------------------------------------------------

MEM (GB)->|  READ |  WRITE |   IO   |   IA   |   GT   | CPU energy | PP0 energy | PP1 energy |
---------------------------------------------------------------------------------------------------------------
 SKT   0     0.55     0.25     0.17     0.44     0.19       1.67       0.24       0.15
---------------------------------------------------------------------------------------------------------------
^CDEBUG: caught signal to interrupt (Interrupt).
Cleaning up
 Closed perf event handles
 Zeroed uncore PMU registers
 Re-enabling NMI watchdog.
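A note on the `-nan` columns: L3MPI and L2MPI are per-instruction ratios, and the summary line of this run reports 0 instructions retired and 0 active cycles. Dividing zero misses by zero instructions in IEEE 754 floating point gives NaN, which is most likely what the table prints. The sketch below reproduces that arithmetic in Python and shows one possible guard. The function names `per_instruction` and `format_metric` are hypothetical and are not taken from the PCM sources, and the non-zero sample values are made up for illustration.

```python
import math


def per_instruction(misses: int, instructions: int) -> float:
    """Ratio with IEEE 754 semantics: 0/0 gives NaN, x/0 gives infinity."""
    if instructions == 0:
        # Python raises ZeroDivisionError for float division by zero, so the
        # IEEE 754 result (what C/C++ would produce) is emulated explicitly.
        return math.nan if misses == 0 else math.copysign(math.inf, misses)
    return misses / instructions


def format_metric(value: float) -> str:
    """Hypothetical formatter: show N/A instead of a NaN or infinity token."""
    return f"{value:.4f}" if math.isfinite(value) else "N/A"


# Counters as reported in the failing run: no misses, no instructions retired.
print(format_metric(per_instruction(0, 0)))

# Made-up non-zero counters, only to show the normal path.
print(format_metric(per_instruction(33_000, 27_500_000)))
```

The guard only changes how the symptom is displayed. The underlying problem in this report is that the core counters read zero in the first place.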

rdementi commented 3 months ago

This could be an issue with the perf_event driver on your CPU. Could you please try running PCM with perf disabled, as root:

su
export PCM_NO_PERF=1
pcm -r

paulmenzel commented 3 months ago

$ sudo PCM_NO_PERF=1 ./bin/pcm -r

 Intel(r) Performance Counter Monitor ($Format:%ci ID=%h$)

=====  Processor information  =====
Linux arch_perfmon flag  : yes
Hybrid processor         : no
IBRS and IBPB supported  : yes
STIBP supported          : yes
Spec arch caps supported : yes
Max CPUID level          : 22
CPU model number         : 142
Number of physical cores: 2
Number of logical cores: 4
Number of online logical cores: 4
Threads (logical cores) per physical core: 2
Num sockets: 1
Physical cores per socket: 2
Last level cache slices per socket: 2
Core PMU (perfmon) version: 4
Number of core PMU generic (programmable) counters: 3
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 2900000000 Hz
IBRS enabled in the kernel   : no
STIBP enabled in the kernel  : no
The processor is not susceptible to Rogue Data Cache Load: no
The processor supports enhanced IBRS                     : no
Package thermal spec power: 15 Watt; Package minimum power: 0 Watt; Package maximum power: 0 Watt;

INFO: Linux perf interface to program uncore PMUs is present
Socket 0: 0 PCU units detected. 0 IIO units detected. 0 IRP units detected. 0 CHA/CBO units detected. 0 MDF units detected. 0 UBOX units detected. 0 CXL units detected. 0 PCIE_GEN5x16 units detected. 0 PCIE_GEN5x8 units detected.

 Resetting PMU configuration
 Zeroed PMU registers
 Disabling NMI watchdog since it consumes one hw-PMU counter. To keep NMI watchdog set environment variable PCM_KEEP_NMI_WATCHDOG=1 (this reduces the core metrics set)
 Closed perf event handles
Trying to use Linux perf events...
Usage of Linux perf events is disabled through PCM_NO_PERF environment variable. Using direct PMU programming...
WARNING: Custom counter 0 is in use. MSR_PERF_GLOBAL_INUSE on core 0: 0x8000000000000009
WARNING: Core 0 IA32_PERFEVTSEL0_ADDR is not zeroed 1245244
WARNING: Custom counter 0 is in use. MSR_PERF_GLOBAL_INUSE on core 1: 0x8000000000000009
WARNING: Core 1 IA32_PERFEVTSEL0_ADDR is not zeroed 1245244
WARNING: Custom counter 0 is in use. MSR_PERF_GLOBAL_INUSE on core 2: 0x8000000000000009
WARNING: Core 2 IA32_PERFEVTSEL0_ADDR is not zeroed 1245244
WARNING: Custom counter 0 is in use. MSR_PERF_GLOBAL_INUSE on core 3: 0x8000000000000009
WARNING: Core 3 IA32_PERFEVTSEL0_ADDR is not zeroed 1245244

Detected Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz "Intel(r) microarchitecture codename Kabylake/Whiskey Lake" stepping 9 microcode level 0xf4

 UTIL  : utlization (same as core C0 state active state residency, the value is in 0..1) 
 IPC   : instructions per CPU cycle
 CFREQ : core frequency in Ghz
 L3MISS: L3 (read) cache misses 
 L3HIT : L3 (read) cache hit ratio (0.00-1.00)
 L3MPI : number of L3 (read) cache misses per instruction
 L2MPI : number of L2 (read) cache misses per instruction
 READ  : bytes read from main memory controller (in GBytes)
 WRITE : bytes written to main memory controller (in GBytes)
 IO    : bytes read/written due to IO requests to memory controller (in GBytes); this may be an over estimate due to same-cache-line partial requests
 IA    : bytes read/written due to IA requests to memory controller (in GBytes); this may be an over estimate due to same-cache-line partial requests
 GT    : bytes read/written due to GT requests to memory controller (in GBytes); this may be an over estimate due to same-cache-line partial requests
 TEMP  : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature
 energy: Energy in Joules

 Core (SKT) | UTIL | IPC  | CFREQ | L3MISS | L2MISS | L3HIT | L3MPI | L2MPI |  TEMP

   0    0     0.03   0.69    1.54      33 K     59 K    0.41  0.0012  0.0022     54
   1    0     0.01   0.30    0.98      10 K     17 K    0.38  0.0047  0.0082     55
   2    0     0.01   0.50    1.11      25 K     40 K    0.35  0.0034  0.0053     54
   3    0     0.02   0.70    1.18      31 K     57 K    0.44  0.0019  0.0036     55
---------------------------------------------------------------------------------------------------------------
 SKT    0     0.02   0.62    1.28     101 K    175 K    0.41  0.0019  0.0033     51
---------------------------------------------------------------------------------------------------------------
 TOTAL  *     0.02   0.62    1.28     101 K    175 K    0.41  0.0019  0.0033     N/A

 Instructions retired:   53 M ; Active cycles:   85 M ; Time (TSC): 2909 Mticks ; C0 (active,non-halted) core residency: 1.66 %

 C1 core residency: 2.57 %; C3 core residency: 0.05 %; C6 core residency: 1.06 %; C7 core residency: 94.66 %;
 C0 package residency: 8.46 %; C2 package residency: 23.43 %; C3 package residency: 1.93 %; C6 package residency: 2.47 %; C7 package residency: 0.02 %; C8 package residency: 63.69 %; C9 package residency: 0.00 %; C10 package residency: 0.00 %;
                             ┌────────────────────────────────────────────────────────────────────────────────┐
 Core    C-state distribution│01167777777777777777777777777777777777777777777777777777777777777777777777777777│
                             └────────────────────────────────────────────────────────────────────────────────┘
                             ┌─────────────────────────────────────────────────────────────────────────────────┐
 Package C-state distribution│000000022222222222222222223366888888888888888888888888888888888888888888888888888│
                             └─────────────────────────────────────────────────────────────────────────────────┘
---------------------------------------------------------------------------------------------------------------

MEM (GB)->|  READ |  WRITE |   IO   |   IA   |   GT   | CPU energy | PP0 energy | PP1 energy |
---------------------------------------------------------------------------------------------------------------
 SKT   0     0.46     0.23     0.25     0.23     0.20       1.00       0.10       0.15
---------------------------------------------------------------------------------------------------------------
[…]
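
The warnings in this run print the register values as raw numbers (`1245244` in decimal, `0x8000000000000009` in hexadecimal). The sketch below decodes them, assuming the IA32_PERFEVTSELx and MSR_PERF_GLOBAL_INUSE bit layouts as documented in the Intel SDM. The bit positions are written from memory and should be checked against the manual, and the function names are hypothetical.

```python
def decode_perfevtsel(value: int) -> dict:
    """Decode selected IA32_PERFEVTSELx fields (assumed SDM layout)."""
    return {
        "event_select": value & 0xFF,        # bits 0-7
        "umask": (value >> 8) & 0xFF,        # bits 8-15
        "usr": bool((value >> 16) & 1),      # count in user mode
        "os": bool((value >> 17) & 1),       # count in kernel mode
        "int": bool((value >> 20) & 1),      # interrupt on overflow
        "en": bool((value >> 22) & 1),       # counter enabled
    }


def decode_global_inuse(value: int) -> dict:
    """Decode MSR_PERF_GLOBAL_INUSE (assumed SDM layout)."""
    return {
        "perfevtsel_in_use": [n for n in range(32) if (value >> n) & 1],
        "fixed_ctr_in_use": [n for n in range(31) if (value >> (32 + n)) & 1],
        "pmi_in_use": bool((value >> 63) & 1),
    }


# Values copied from the WARNING lines in the output above.
print(decode_perfevtsel(1245244))
print(decode_global_inuse(0x8000000000000009))
```

Under these assumptions, `1245244` is `0x13003C`: event 0x3C with umask 0x00 (the architectural unhalted core cycles event), counting in user and kernel mode with interrupt on overflow, and with the enable bit clear. That would be consistent with a leftover cycles event from an earlier user of the counter, possibly the NMI watchdog that PCM disables at startup, although the output alone does not confirm this.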

rdementi commented 2 months ago

This seems to be resolved by setting "export PCM_NO_PERF=1", which makes PCM program the PMU directly instead of going through Linux perf events. Closing.