Open kadircs opened 4 months ago
I assume the perf_event backend and suspect a too-high setting in /proc/sys/kernel/perf_event_paranoid. It has to be zero to get data from the Uncore devices. Run with -V 1 and there should be a message.
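A minimal sketch of the check suggested above, assuming a Linux system. The helper names `read_paranoid_level` and `uncore_allowed` are hypothetical and not part of LIKWID; the threshold follows the advice in this thread (the value has to be zero, and lower values are treated here as at least as permissive).

```python
from pathlib import Path


def read_paranoid_level(path="/proc/sys/kernel/perf_event_paranoid"):
    """Return the perf_event_paranoid level as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def uncore_allowed(level):
    """Assumption from this thread: Uncore data needs a level of 0 (or lower)."""
    return level is not None and level <= 0


if __name__ == "__main__":
    level = read_paranoid_level()
    print(f"perf_event_paranoid = {level}")
    print(f"Uncore access likely allowed: {uncore_allowed(level)}")
```

If the level is too high, running with -V 1 as described above should also print a message.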
The above situation also occurs on AMD 9554.
(note: I made a sum statistics data output, so the runtime is 384)
Runtime (RDTSC) [s]: 384.015717
Runtime unhalted [s]: 0.058734
Clock [MHz]: 199629.250000
CPI: nan
Memory bandwidth (channels 0-3) [MBytes/s]: 0.000000
Memory data volume (channels 0-3) [GBytes]: 0.000000
----------------------------
Runtime (RDTSC) [s]: 384.044250
Runtime unhalted [s]: 0.041370
Clock [MHz]: 186808.812500
CPI: nan
Memory bandwidth (channels 0-3) [MBytes/s]: 0.000000
Memory data volume (channels 0-3) [GBytes]: 0.000000
----------------------------
Runtime (RDTSC) [s]: 384.012970
Runtime unhalted [s]: 0.113669
Clock [MHz]: 187998.656250
CPI: nan
Memory bandwidth (channels 0-3) [MBytes/s]: 0.000000
Memory data volume (channels 0-3) [GBytes]: 0.000000
----------------------------
Runtime (RDTSC) [s]: 384.045624
Runtime unhalted [s]: 0.561691
Clock [MHz]: 191052.828125
CPI: nan
Memory bandwidth (channels 0-3) [MBytes/s]: 0.000000
Memory data volume (channels 0-3) [GBytes]: 0.000000
I assume the perf_event backend and suspect a too-high setting in /proc/sys/kernel/perf_event_paranoid. It has to be zero to get data from the Uncore devices. Run with -V 1 and there should be a message.
I attempted this, but it seems to have been ineffective.
What has been ineffective? Setting the value to zero or getting messages?
LIKWID with the perf_event backend requires the unit amd_df to be present (/sys/devices/amd_df). If this device does not exist, there is no chance to get the memory traffic through perf_event and consequently LIKWID. You need a newer or patched kernel.
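The device check described above can be sketched as follows. `missing_pmus` is a hypothetical helper, not a LIKWID function; the `sysfs_root` parameter exists only so the check can be tried against a test directory.

```python
from pathlib import Path


def missing_pmus(required=("amd_df",), sysfs_root="/sys/devices"):
    """Return the names in `required` that have no directory under sysfs_root."""
    root = Path(sysfs_root)
    return [name for name in required if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_pmus()
    if missing:
        print("Missing perf_event units:", ", ".join(missing))
        print("Memory traffic cannot be measured through perf_event on this kernel.")
    else:
        print("amd_df is present.")
```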
I encountered the same problem. The OS is Rocky Linux 8.6, kernel version 4.18.0-372.9.1.el8.x86_64. /proc/sys/kernel/perf_event_paranoid is 0, and both /sys/devices/amd_df and /sys/devices/amd_l3 exist.
[root@localhost bin]# grep -i perf_event /boot/config-4.18.0-372.9.1.el8.x86_64
CONFIG_HAVE_PERF_EVENTS=y
CONFIG_PERF_EVENTS=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_PERF_EVENTS_INTEL_UNCORE=m
CONFIG_PERF_EVENTS_INTEL_RAPL=m
CONFIG_PERF_EVENTS_INTEL_CSTATE=m
CONFIG_PERF_EVENTS_AMD_POWER=m
[root@localhost bin]# likwid-perfctr -f -V 1 -g MEM2 /home/pcadmin/stream
CPU name: AMD EPYC 9554 64-Core Processor
CPU type: AMD K19 (Zen4) architecture
CPU clock: 3.10 GHz
CPU family: 25
CPU model: 17
CPU short: zen4
CPU stepping: 1
CPU features: FP MMX SSE SSE2 HTT MMX RDTSCP MONITOR SSSE FMA SSE4.1 SSE4.2 AES AVX RDRAND AVX2 AVX512 RDSEED SSE3
CPU arch: x86_64
DEBUG - [access_client_startDaemon:157] Starting daemon /usr/local/sbin/likwid-accessD
DEBUG - [access_client_startDaemon:235] Successfully opened socket /tmp/likwid-83685 to daemon for CPU 127
Executing: /home/pcadmin/stream
DEBUG - [perfmon_addEventSet:2328] Currently 1 groups of 2 active
DEBUG - [perfgroup_readGroup:873] Reading group MEM2 from /usr/local/share/likwid/perfgroups/zen4/MEM2.txt
DEBUG - [perfmon_addEventSet:2514] Added event ACTUAL_CPU_CLOCK for counter FIXC1 to group 0
DEBUG - [perfmon_addEventSet:2514] Added event MAX_CPU_CLOCK for counter FIXC2 to group 0
DEBUG - [perfmon_addEventSet:2514] Added event RETIRED_INSTRUCTIONS for counter PMC0 to group 0
DEBUG - [perfmon_addEventSet:2514] Added event CPU_CLOCKS_UNHALTED for counter PMC1 to group 0
DEBUG - [checkAccess:237] WARNING: Counter DFC0 does not exist
DEBUG - [perfmon_addEventSet:2437] Cannot access counter register DFC0
DEBUG - [checkAccess:237] WARNING: Counter DFC1 does not exist
DEBUG - [perfmon_addEventSet:2437] Cannot access counter register DFC1
DEBUG - [checkAccess:237] WARNING: Counter DFC2 does not exist
DEBUG - [perfmon_addEventSet:2437] Cannot access counter register DFC2
DEBUG - [checkAccess:237] WARNING: Counter DFC3 does not exist
DEBUG - [perfmon_addEventSet:2437] Cannot access counter register DFC3
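For longer logs, the rejected counters can be pulled out of the -V 1 output with a small script. This is only a sketch: `missing_counters` is a hypothetical helper, and the pattern assumes the warning wording shown in the log above.

```python
import re

# Matches the warning text as it appears in the debug output above.
MISSING_COUNTER = re.compile(r"WARNING: Counter (\w+) does not exist")


def missing_counters(log_text):
    """Return counter names reported as non-existent, in order of first appearance."""
    seen = []
    for name in MISSING_COUNTER.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


sample = (
    "DEBUG - [checkAccess:237] WARNING: Counter DFC0 does not exist "
    "DEBUG - [perfmon_addEventSet:2437] Cannot access counter register DFC0 "
    "DEBUG - [checkAccess:237] WARNING: Counter DFC1 does not exist"
)
print(missing_counters(sample))  # ['DFC0', 'DFC1']
```

The pattern works whether the log arrives as one long line or one message per line.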
The Zen4 CPU has 12 memory channels (https://www.amd.com/en/products/cpu/amd-epyc-9554), so why does the LIKWID library only support 8 memory channels for perfmon data?
I may have found the reason for this WARNING message: the struct zen4_counter_map in src/include/perfmon_zen4_counters.h is missing the index "PMC17".
@marquis-wang Yes, you found it. I fixed it last night. Please test it: https://github.com/RRZE-HPC/likwid/commit/7027aa64bf7f8af87173a8778635fad4f012dcc6
I will add additional memory channels to the branch. Yes it should be 12.
@TomTheBear Great! I tested branch amd_zen4 (44cf4ca) and it works well.
It works, but it is not done. I made some major updates to the branch yesterday, but the branch cannot be merged, so I am creating a new one with only the fixes.
The events currently configured in MEM1 and MEM2 do not exist for Zen4 anymore, so it is unclear whether they actually count memory traffic. The updated version will no longer have MEM1 and MEM2 but MEMREAD and MEMWRITE instead, and will use the officially documented metrics for memory traffic.
I want to use the LIKWID library to develop collection tools for our cluster (Zen4). The memory bandwidth data from https://github.com/RRZE-HPC/likwid/commit/7027aa64bf7f8af87173a8778635fad4f012dcc6 is missing 4 memory channels. I saw that the newest commit (44cf4ca) adds all channels, so I tested it. I compared likwid-perfctr's output (MEMREAD and MEMWRITE) with stream's output, and there is no big difference between the results. In the official documentation (AMD PPR Family 19h), I found a new event (DATA_BW) that may be helpful for monitoring the memory bandwidth. I will test this event.
I'm glad that it works for you now. Please be careful with the PPRs: you have to use the one for the right family and model. AMD Family 19h Model 11h should be the right one. The third document describes a DATA_BW event, but it is just the detailed explanation/breakdown of the events already documented in https://github.com/RRZE-HPC/likwid/pull/618. Unfortunately, even with these details, it is impossible to perform read and write measurements in one go.
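Since reads and writes have to be measured in separate runs, a total can only be estimated by combining the two results afterwards. The following is a rough sketch of that arithmetic, not something LIKWID provides: the function names are hypothetical, it assumes decimal units (1 GByte = 1000 MBytes), and it assumes the application behaves the same in both runs.

```python
def bandwidth_mbytes_per_s(volume_gbytes, runtime_s):
    """Convert a data volume in GBytes and a runtime in seconds to MBytes/s."""
    if runtime_s <= 0:
        raise ValueError("runtime must be positive")
    return volume_gbytes * 1000.0 / runtime_s


def combined_bandwidth(read_gbytes, read_runtime_s, write_gbytes, write_runtime_s):
    """Estimate total bandwidth from one MEMREAD run and one MEMWRITE run."""
    read_bw = bandwidth_mbytes_per_s(read_gbytes, read_runtime_s)
    write_bw = bandwidth_mbytes_per_s(write_gbytes, write_runtime_s)
    return read_bw + write_bw


# Made-up numbers for illustration only, not measurements from this thread.
print(combined_bandwidth(10.0, 2.0, 4.0, 2.0))  # 7000.0
```

The estimate is only as good as the run-to-run reproducibility of the workload, so comparing it against a benchmark that reports its own bandwidth, as done above with stream, is a reasonable sanity check.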
The UMC performance counters would be of interest to count at the memory controller instead of the DataFabric but they seem quite complicated to add. There is already infrastructure for MMIO based counters but some effort would be required. Unfortunately, they are never exposed by perf_event, so they can be added for accessdaemon/direct only.
I am trying to measure memory bandwidth for a stencil application that runs on both sockets of a two-socket AMD 9654 system. I am getting zero as the memory bandwidth, as seen below. Is there an issue with the DFC counters on the Zen4 architecture? Is it fully supported? I tried with and without -f.