RRZE-HPC / likwid

Performance monitoring and benchmarking suite
https://hpc.fau.de/research/tools/likwid/
GNU General Public License v3.0

[BUG] perfctr does not detect Cortex A53 correctly in ARMv7 mode #483

Closed OoJJBoO closed 2 years ago

OoJJBoO commented 2 years ago

Issue

perfctr currently does not work on a Raspberry Pi 3 Model B+, even though the Cortex A53 processor (running in ARMv7 mode) should be supported as far as I understand the documentation. Running likwid-perfctr -e results in:

ERROR - [./src/perfmon.c:perfmon_init_maps:1322] Unsupported ARMv7 Processor
ERROR - [./src/perfmon.c:perfmon_check_counter_map:746] Counter and event maps not initialized.
This architecture has 0 counters.
Counter tags(name, type<, options>):

This architecture has 0 events.
Event tags (tag, id, umask, counters<, options>):

Also, likwid-topology lists the processor model as nil:

--------------------------------------------------------------------------------
CPU name:       ARMv7 Processor rev 4 (v7l)
CPU type:       nil
CPU stepping:   4
********************************************************************************
Hardware Thread Topology
********************************************************************************
Sockets:                1
Cores per socket:       4
Threads per core:       1
--------------------------------------------------------------------------------
HWThread        Thread        Core        Die        Socket        Available
0               0             0           0          0             *                
1               0             1           0          0             *                
2               0             2           0          0             *                
3               0             3           0          0             *                
--------------------------------------------------------------------------------
Socket 0:               ( 0 1 2 3 )
--------------------------------------------------------------------------------
********************************************************************************
Cache Topology
********************************************************************************
********************************************************************************
NUMA Topology
********************************************************************************
NUMA domains:           1
--------------------------------------------------------------------------------
Domain:                 0
Processors:             ( 0 1 2 3 )
Distances:              10
Free memory:            194.613 MB
Total memory:           925.41 MB
--------------------------------------------------------------------------------

Potential Fix

Looking through the source, I found that the switch statements in topology.c (line 1137) and perfmon.c (line 1287) both use cpuid_info.model, which is 0 in my case, rather than cpuid_info.part, which appears to contain the correct value defined as ARM_CORTEX_A53. After changing both switches to use cpuid_info.part and rebuilding, likwid-topology outputs the correct CPU model:

--------------------------------------------------------------------------------
CPU name:       ARMv7 Processor rev 4 (v7l)
CPU type:       ARM Cortex A53
CPU stepping:   4
********************************************************************************
Hardware Thread Topology
********************************************************************************
Sockets:                1
Cores per socket:       4
Threads per core:       1
--------------------------------------------------------------------------------
HWThread        Thread        Core        Die        Socket        Available
0               0             0           0          0             *                
1               0             1           0          0             *                
2               0             2           0          0             *                
3               0             3           0          0             *                
--------------------------------------------------------------------------------
Socket 0:               ( 0 1 2 3 )
--------------------------------------------------------------------------------
********************************************************************************
Cache Topology
********************************************************************************
********************************************************************************
NUMA Topology
********************************************************************************
NUMA domains:           1
--------------------------------------------------------------------------------
Domain:                 0
Processors:             ( 0 1 2 3 )
Distances:              10
Free memory:            220.812 MB
Total memory:           925.41 MB
--------------------------------------------------------------------------------
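
The change described above (dispatching on the part number instead of the model) can be illustrated with a minimal sketch. This is not LIKWID's actual code: the struct CpuInfoSketch, the macro SKETCH_ARM_CORTEX_A53 and the function sketch_cpu_type are hypothetical names, and the value 0xD03 is assumed to be the part number behind ARM_CORTEX_A53.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for LIKWID's cpuid_info.
 * Only the two fields discussed in the report are modelled. */
typedef struct {
    uint32_t model; /* reported as 0 on the Raspberry Pi 3 B+ in ARMv7 mode */
    uint32_t part;  /* part number of the core */
} CpuInfoSketch;

/* Assumption: 0xD03 is the part number that identifies a Cortex-A53. */
#define SKETCH_ARM_CORTEX_A53 0xD03u

/* Returns a short CPU type name, or NULL if the part number is unknown. */
static const char *sketch_cpu_type(const CpuInfoSketch *info)
{
    switch (info->part) { /* the report's fix: previously info->model */
    case SKETCH_ARM_CORTEX_A53:
        return "ARM Cortex A53";
    default:
        return NULL;
    }
}
```

With model == 0 and part == 0xD03, a switch on model falls through to the default case (the "nil" CPU type above), while a switch on part selects the Cortex A53 entry.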

The output of likwid-perfctr -e changes to the following:

This architecture has 0 counters.
Counter tags(name, type<, options>):

This architecture has 0 events.
Event tags (tag, id, umask, counters<, options>):

However, still no counters are detected. I then noticed that the path set for the Cortex A53 PMC in includes/perfmon_a57_counters.h, /sys/bus/event_source/devices/cpu, does not exist on my machine. After changing it to the only fitting subdirectory available on my system, /sys/bus/event_source/devices/armv7_cortex_a7, and rebuilding, perfctr appears to work:

This architecture has 6 counters.
Counter tags(name, type<, options>):
PMC0, Core-local general purpose counters
PMC1, Core-local general purpose counters
PMC2, Core-local general purpose counters
PMC3, Core-local general purpose counters
PMC4, Core-local general purpose counters
PMC5, Core-local general purpose counters

This architecture has 84 events.
Event tags (tag, id, umask, counters<, options>):
SW_INCR, 0x0, 0x0, PMC
L1I_CACHE_REFILL, 0x1, 0x0, PMC
L1I_TLB_REFILL, 0x2, 0x0, PMC
L1D_CACHE_REFILL, 0x3, 0x0, PMC
L1D_CACHE, 0x4, 0x0, PMC
L1D_TLB_REFILL, 0x5, 0x0, PMC
INST_RETIRED, 0x8, 0x0, PMC
EXC_TAKEN, 0x9, 0x0, PMC
EXC_RETURN, 0xA, 0x0, PMC
CID_WRITE_RETIRED, 0xB, 0x0, PMC
BR_MIS_PRED, 0x10, 0x0, PMC
CPU_CYCLES, 0x11, 0x0, PMC
BR_PRED, 0x12, 0x0, PMC
MEM_ACCESS, 0x13, 0x0, PMC
L1I_CACHE, 0x14, 0x0, PMC
L1D_CACHE_WB, 0x15, 0x0, PMC
L2D_CACHE, 0x16, 0x0, PMC
L2D_CACHE_REFILL, 0x17, 0x0, PMC
L2D_CACHE_WB, 0x18, 0x0, PMC
BUS_ACCESS, 0x19, 0x0, PMC
MEMORY_ERROR, 0x1A, 0x0, PMC
INST_SPEC, 0x1B, 0x0, PMC
TTBR_WRITE_RETIRED, 0x1C, 0x0, PMC
BUS_CYCLES, 0x1D, 0x0, PMC
CHAIN, 0x1E, 0x0, PMC
L1D_CACHE_LD, 0x40, 0x0, PMC
L1D_CACHE_ST, 0x41, 0x0, PMC
L1D_CACHE_REFILL_LD, 0x42, 0x0, PMC
L1D_CACHE_REFILL_ST, 0x43, 0x0, PMC
L1D_CACHE_WB_VICTIM, 0x46, 0x0, PMC
L1D_CACHE_WB_CLEAN, 0x47, 0x0, PMC
L1D_CACHE_INVAL, 0x48, 0x0, PMC
L1D_TLB_REFILL_LD, 0x4C, 0x0, PMC
L1D_TLB_REFILL_ST, 0x4D, 0x0, PMC
L2D_CACHE_LD, 0x50, 0x0, PMC
L2D_CACHE_ST, 0x51, 0x0, PMC
L2D_CACHE_REFILL_LD, 0x52, 0x0, PMC
L2D_CACHE_REFILL_ST, 0x53, 0x0, PMC
L2D_CACHE_WB_VICTIM, 0x56, 0x0, PMC
L2D_CACHE_WB_CLEAN, 0x57, 0x0, PMC
L2D_CACHE_INVAL, 0x58, 0x0, PMC
BUS_ACCESS_LD, 0x60, 0x0, PMC
BUS_ACCESS_ST, 0x61, 0x0, PMC
BUS_ACCESS_SHARED, 0x62, 0x0, PMC
BUS_ACCESS_NOT_SHARED, 0x63, 0x0, PMC
BUS_ACCESS_NORMAL, 0x64, 0x0, PMC
BUS_ACCESS_PERIPH, 0x65, 0x0, PMC
MEM_ACCESS_LD, 0x66, 0x0, PMC
MEM_ACCESS_ST, 0x67, 0x0, PMC
UNALIGNED_LD_SPEC, 0x68, 0x0, PMC
UNALIGNED_ST_SPEC, 0x69, 0x0, PMC
UNALIGNED_LDST_SPEC, 0x6A, 0x0, PMC
LDREX_SPEC, 0x6C, 0x0, PMC
STREX_PASS_SPEC, 0x6D, 0x0, PMC
STREX_FAIL_SPEC, 0x6E, 0x0, PMC
LD_SPEC, 0x70, 0x0, PMC
ST_SPEC, 0x71, 0x0, PMC
LDST_SPEC, 0x72, 0x0, PMC
DP_SPEC, 0x73, 0x0, PMC
ASE_SPEC, 0x74, 0x0, PMC
VFP_SPEC, 0x75, 0x0, PMC
PC_WRITE_SPEC, 0x76, 0x0, PMC
CRYPTO_SPEC, 0x77, 0x0, PMC
BR_IMMED_SPEC, 0x78, 0x0, PMC
BR_RETURN_SPEC, 0x79, 0x0, PMC
BR_INDIRECT_SPEC, 0x7A, 0x0, PMC
ISB_SPEC, 0x7C, 0x0, PMC
DSB_SPEC, 0x7D, 0x0, PMC
DMB_SPEC, 0x7E, 0x0, PMC
EXC_UNDEF, 0x81, 0x0, PMC
EXC_SVC, 0x82, 0x0, PMC
EXC_PABORT, 0x83, 0x0, PMC
EXC_DABORT, 0x84, 0x0, PMC
EXC_IRQ, 0x86, 0x0, PMC
EXC_FIQ, 0x87, 0x0, PMC
EXC_SMC, 0x88, 0x0, PMC
EXC_HVC, 0x8A, 0x0, PMC
EXC_TRAP_PABORT, 0x8B, 0x0, PMC
EXC_TRAP_DABORT, 0x8C, 0x0, PMC
EXC_TRAP_OTHER, 0x8D, 0x0, PMC
EXC_TRAP_IRQ, 0x8E, 0x0, PMC
EXC_TRAP_FIQ, 0x8F, 0x0, PMC
RC_LD_SPEC, 0x90, 0x0, PMC
RC_ST_SPEC, 0x91, 0x0, PMC
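
The path problem described above could also be handled by probing for the PMU directory at runtime instead of hard-coding one path. The following is only a sketch under that assumption, not how LIKWID resolves the path; sketch_first_existing_dir and sketch_pmu_dirs are hypothetical names, and the candidate list contains only the two paths mentioned in this report.

```c
#include <stddef.h>
#include <sys/stat.h>

/* Returns the first entry of candidates[] that exists as a directory
 * (symlinks are followed), or NULL if none of them does. */
static const char *sketch_first_existing_dir(const char *const candidates[],
                                             size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct stat st;
        if (stat(candidates[i], &st) == 0 && S_ISDIR(st.st_mode)) {
            return candidates[i];
        }
    }
    return NULL;
}

/* The two paths from the report, in the order they would be tried. */
static const char *const sketch_pmu_dirs[] = {
    "/sys/bus/event_source/devices/cpu",
    "/sys/bus/event_source/devices/armv7_cortex_a7",
};
```

Calling sketch_first_existing_dir(sketch_pmu_dirs, 2) on the machine from this report would be expected to skip the missing cpu directory and return the armv7_cortex_a7 path. A NULL result would mean that neither directory exists.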

OoJJBoO commented 2 years ago

I just created a draft PR (#484) that implements the potential fix described above, to make it easier to see what I mean.

TomTheBear commented 2 years ago

See comments in #484