icl-utk-edu / papi

large number of unsupported papi counters on 13th Gen Intel Core i7-13800H #131

Closed TomMelt closed 9 months ago

TomMelt commented 9 months ago

I want to use PAPI on my laptop for profiling, but I can only see a small number of PAPI counters (and even to get those I had to install the latest version, v7).

Initially I tried v6, but it doesn't work for me at all. I moved to version 7, and here is what I get:

$ ./papi_avail -a
Available PAPI preset and user defined events plus hardware information.
--------------------------------------------------------------------------------
PAPI version             : 7.0.1.0
Operating system         : Linux 6.2.0-37-generic
Vendor string and code   : GenuineIntel (1, 0x1)
Model string and code    : 13th Gen Intel(R) Core(TM) i7-13800H (186, 0xba)
CPU revision             : 2.000000
CPUID                    : Family/Model/Stepping 6/186/2, 0x06/0xba/0x02
CPU Max MHz              : 5000
CPU Min MHz              : 400
Total cores              : 14
SMT threads per core     : 1
Cores per socket         : 14
Sockets                  : 1
Cores per NUMA region    : 14
NUMA regions             : 1
Running in a VM          : no
Number Hardware Counters : 0
Max Multiplex Counters   : 384
Fast counter read (rdpmc): yes
--------------------------------------------------------------------------------

================================================================================
  PAPI Preset Events
================================================================================
    Name        Code    Deriv Description (Note)

--------------------------------------------------------------------------------
Of 0 available events, 0 are derived.

No events detected!  Check papi_component_avail to find out why.

If I run papi_component_avail I get:

$ ./papi_component_avail
Available components and hardware information.
--------------------------------------------------------------------------------
PAPI version             : 7.0.1.0
Operating system         : Linux 6.2.0-37-generic
Vendor string and code   : GenuineIntel (1, 0x1)
Model string and code    : 13th Gen Intel(R) Core(TM) i7-13800H (186, 0xba)
CPU revision             : 2.000000
CPUID                    : Family/Model/Stepping 6/186/2, 0x06/0xba/0x02
CPU Max MHz              : 5000
CPU Min MHz              : 400
Total cores              : 14
SMT threads per core     : 1
Cores per socket         : 14
Sockets                  : 1
Cores per NUMA region    : 14
NUMA regions             : 1
Running in a VM          : no
Number Hardware Counters : 0
Max Multiplex Counters   : 384
Fast counter read (rdpmc): yes
--------------------------------------------------------------------------------

Compiled-in components:
Name:   perf_event              Linux perf_event CPU counters
   \-> Disabled: Error libpfm4 no default PMU found
Name:   perf_event_uncore       Linux perf_event CPU uncore and northbridge
   \-> Disabled: No uncore PMUs or events found
Name:   sysdetect               System info detection component

Active components:
Name:   sysdetect               System info detection component
                                Native: 0, Preset: 0, Counters: 0
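
The "Disabled:" lines are the useful part of this output, since they say why each component is inactive. As a rough sketch (the function name `disabled_components` is made up for illustration and assumes the text layout shown above), they can be pulled out of a captured `papi_component_avail` run like this:

```python
import re

def disabled_components(text):
    """Return {component_name: reason} for components reported as disabled.

    Assumes the 'Name: ...' / '\\-> Disabled: ...' layout printed by
    papi_component_avail in the transcript above.
    """
    disabled = {}
    current = None
    for line in text.splitlines():
        name = re.match(r"\s*Name:\s+(\S+)", line)
        if name:
            current = name.group(1)
            continue
        reason = re.match(r"\s*\\->\s*Disabled:\s*(.*)", line)
        if reason and current:
            disabled[current] = reason.group(1).strip()
    return disabled

# Excerpt copied from the output above.
sample = """\
Compiled-in components:
Name:   perf_event              Linux perf_event CPU counters
   \\-> Disabled: Error libpfm4 no default PMU found
Name:   perf_event_uncore       Linux perf_event CPU uncore and northbridge
   \\-> Disabled: No uncore PMUs or events found
Name:   sysdetect               System info detection component
"""

for component, reason in disabled_components(sample).items():
    print(f"{component}: {reason}")
```

Here the `perf_event` reason points at libpfm4 (no default PMU found) rather than at kernel permissions.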

Finally, if I set LIBPFM_FORCE_PMU=amd64:

$ LIBPFM_FORCE_PMU=amd64 ./papi_component_avail 
Available components and hardware information.
--------------------------------------------------------------------------------
PAPI version             : 7.0.1.0
Operating system         : Linux 6.2.0-37-generic
Vendor string and code   : GenuineIntel (1, 0x1)
Model string and code    : 13th Gen Intel(R) Core(TM) i7-13800H (186, 0xba)
CPU revision             : 2.000000
CPUID                    : Family/Model/Stepping 6/186/2, 0x06/0xba/0x02
CPU Max MHz              : 5000
CPU Min MHz              : 400
Total cores              : 14
SMT threads per core     : 1
Cores per socket         : 14
Sockets                  : 1
Cores per NUMA region    : 14
NUMA regions             : 1
Running in a VM          : no
Number Hardware Counters : 4
Max Multiplex Counters   : 384
Fast counter read (rdpmc): yes
--------------------------------------------------------------------------------

Compiled-in components:
Name:   perf_event              Linux perf_event CPU counters
Name:   perf_event_uncore       Linux perf_event CPU uncore and northbridge
   \-> Disabled: No uncore PMUs or events found
Name:   sysdetect               System info detection component

Active components:
Name:   perf_event              Linux perf_event CPU counters
                                Native: 24, Preset: 18, Counters: 4
                                PMUs supported: amd64_k7

Name:   sysdetect               System info detection component
                                Native: 0, Preset: 0, Counters: 0

Now running papi_avail, I get:

$ LIBPFM_FORCE_PMU=amd64 ./papi_avail -a
Available PAPI preset and user defined events plus hardware information.
--------------------------------------------------------------------------------
PAPI version             : 7.0.1.0
Operating system         : Linux 6.2.0-37-generic
Vendor string and code   : GenuineIntel (1, 0x1)
Model string and code    : 13th Gen Intel(R) Core(TM) i7-13800H (186, 0xba)
CPU revision             : 2.000000
CPUID                    : Family/Model/Stepping 6/186/2, 0x06/0xba/0x02
CPU Max MHz              : 5000
CPU Min MHz              : 400
Total cores              : 14
SMT threads per core     : 1
Cores per socket         : 14
Sockets                  : 1
Cores per NUMA region    : 14
NUMA regions             : 1
Running in a VM          : no
Number Hardware Counters : 4
Max Multiplex Counters   : 384
Fast counter read (rdpmc): yes
--------------------------------------------------------------------------------

================================================================================
  PAPI Preset Events
================================================================================
    Name        Code    Deriv Description (Note)
PAPI_L1_DCM  0x80000000  No   Level 1 data cache misses
PAPI_L1_ICM  0x80000001  No   Level 1 instruction cache misses
PAPI_L1_TCM  0x80000006  Yes  Level 1 cache misses
PAPI_TLB_DM  0x80000014  No   Data translation lookaside buffer misses
PAPI_TLB_IM  0x80000015  No   Instruction translation lookaside buffer misses
PAPI_TLB_TL  0x80000016  Yes  Total translation lookaside buffer misses
PAPI_HW_INT  0x80000029  No   Hardware interrupts
PAPI_BR_TKN  0x8000002c  No   Conditional branch instructions taken
PAPI_BR_MSP  0x8000002e  No   Conditional branch instructions mispredicted
PAPI_TOT_INS 0x80000032  No   Instructions completed
PAPI_BR_INS  0x80000037  No   Branch instructions
PAPI_TOT_CYC 0x8000003b  No   Total cycles
PAPI_L1_DCH  0x8000003e  Yes  Level 1 data cache hits
PAPI_L1_DCA  0x80000040  No   Level 1 data cache accesses
PAPI_L1_ICA  0x8000004c  No   Level 1 instruction cache accesses
PAPI_L1_ICR  0x8000004f  No   Level 1 instruction cache reads
PAPI_L1_TCH  0x80000055  Yes  Level 1 total cache hits
PAPI_L1_TCA  0x80000058  Yes  Level 1 total cache accesses
--------------------------------------------------------------------------------
Of 18 available events, 5 are derived.

But I want to get PAPI_FP_INS, for example, and I can only see 18 out of about 108 possible events.
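
To check a list of wanted presets against what `papi_avail -a` reports, something like the following could work. It is a minimal sketch that assumes the column layout shown above (name, hex code, Yes/No, description); `parse_presets` is a hypothetical helper, not part of PAPI.

```python
import re

# One preset row: name, hex code, derived flag, free-text description.
PRESET_LINE = re.compile(r"^(PAPI_\w+)\s+(0x[0-9a-fA-F]+)\s+(Yes|No)\s+(.*)$")

def parse_presets(text):
    """Map preset name -> details for every preset row found in the text."""
    presets = {}
    for line in text.splitlines():
        m = PRESET_LINE.match(line.strip())
        if m:
            name, code, derived, desc = m.groups()
            presets[name] = {
                "code": int(code, 16),
                "derived": derived == "Yes",
                "description": desc,
            }
    return presets

# A few rows copied from the output above.
sample = """\
PAPI_L1_DCM  0x80000000  No   Level 1 data cache misses
PAPI_L1_TCM  0x80000006  Yes  Level 1 cache misses
PAPI_TOT_INS 0x80000032  No   Instructions completed
PAPI_TOT_CYC 0x8000003b  No   Total cycles
"""

presets = parse_presets(sample)
wanted = ["PAPI_TOT_INS", "PAPI_FP_INS"]
missing = [name for name in wanted if name not in presets]
print("missing presets:", missing)
```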

Can anyone explain how I can find the valid options for LIBPFM_FORCE_PMU? I got this value from Stack Overflow, but I don't know how to find it out myself.

Is it possible to configure PAPI without needing to export LIBPFM_FORCE_PMU?

Is my Intel i7 currently not fully supported, or am I doing something wrong (e.g. missing configure options)?

(FYI, I have set /proc/sys/kernel/perf_event_paranoid to 0 and I have sudo access on my laptop if necessary, although the commands above were run as a regular user.)
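
To separate "the kernel does not expose the counters" from "libpfm4 does not recognise the CPU", it may help to look at what the kernel itself reports. The sketch below reads the standard Linux locations `/sys/bus/event_source/devices` and `/proc/sys/kernel/perf_event_paranoid`; the helper names are invented for illustration. On hybrid Intel parts the kernel typically lists `cpu_core` and `cpu_atom` there, but I have not confirmed that on this machine.

```python
import os

def list_kernel_pmus(root="/sys/bus/event_source/devices"):
    """Return the PMU names the kernel exposes, or [] if the directory is unreadable."""
    try:
        return sorted(os.listdir(root))
    except OSError:
        return []

def read_paranoid(path="/proc/sys/kernel/perf_event_paranoid"):
    """Return the perf_event_paranoid level as an int, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    print("kernel PMUs:", list_kernel_pmus())
    print("perf_event_paranoid:", read_paranoid())
```

If the kernel lists CPU PMUs here while PAPI still reports "no default PMU found", that would suggest the gap is in libpfm4's CPU detection rather than in the kernel or the permissions.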

TomMelt commented 9 months ago

This could be related to #126

TomMelt commented 9 months ago

Looks like my CPU is Raptor Lake and perhaps it's not supported yet. I couldn't find anything in the repo's documentation about Raptor Lake.
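
For reference, this is roughly how the "CPUID" line from `papi_avail` maps to a microarchitecture name. The model table is a small illustrative subset based on my reading of the Linux kernel's `intel-family.h`, so treat it as an assumption to verify; the function names are hypothetical.

```python
import re

# Illustrative subset of Intel family-6 model numbers (assumed from the
# Linux kernel's intel-family.h; verify before relying on it).
INTEL_FAM6_MODELS = {
    0x97: "Alder Lake",
    0x9A: "Alder Lake-L",
    0xB7: "Raptor Lake",
    0xBA: "Raptor Lake-P",
    0xBF: "Raptor Lake-S",
}

def parse_cpuid_line(line):
    """Extract (family, model, stepping) from a papi_avail 'CPUID' line."""
    m = re.search(r"Family/Model/Stepping\s+(\d+)/(\d+)/(\d+)", line)
    if not m:
        return None
    return tuple(int(x) for x in m.groups())

def microarchitecture(family, model):
    """Look up the name for a family/model pair; 'unknown' if not in the table."""
    if family != 6:
        return "unknown"
    return INTEL_FAM6_MODELS.get(model, "unknown")

# Line copied from the papi_avail output earlier in this issue.
line = "CPUID                    : Family/Model/Stepping 6/186/2, 0x06/0xba/0x02"
family, model, stepping = parse_cpuid_line(line)
print(f"family {family}, model {model:#x}: {microarchitecture(family, model)}")
```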

If I am correct and it's just not added yet, let me know if I can help to add it in :+1:

gcongiu commented 9 months ago

@TomMelt this is a two-step process. First, you will need to ask the libpfm4 maintainer to add support for the Raptor Lake PMU. You can also submit a patch yourself, if you are willing to contribute. This will make the Raptor Lake PMU, and consequently all the hardware events it supports, visible to PAPI. Second, PAPI preset events should be added for Raptor Lake. If you are willing to contribute here as well, @adanalis can give you instructions on how to do that.

TomMelt commented 9 months ago

I have raised an issue for libpfm4. If that goes ahead I will reopen this issue.

junyongheo commented 6 months ago

@gcongiu @adanalis I would be interested in contributing PAPI preset events for Raptor Lake. Could you point me to the instructions for doing so?

adanalis commented 6 months ago

Thanks for offering to help. As a first step, can you send me the output of hwloc-ls?

Thanks, Anthony

junyongheo commented 6 months ago

Dear Anthony,

Here you go:

Machine (31GB total)
  Package L#0
    NUMANode L#0 (P#0 31GB)
    L3 L#0 (36MB)
      L2 L#0 (2048KB) + L1d L#0 (48KB) + L1i L#0 (32KB) + Core L#0
        PU L#0 (P#0)
        PU L#1 (P#1)
      L2 L#1 (2048KB) + L1d L#1 (48KB) + L1i L#1 (32KB) + Core L#1
        PU L#2 (P#2)
        PU L#3 (P#3)
      L2 L#2 (2048KB) + L1d L#2 (48KB) + L1i L#2 (32KB) + Core L#2
        PU L#4 (P#4)
        PU L#5 (P#5)
      L2 L#3 (2048KB) + L1d L#3 (48KB) + L1i L#3 (32KB) + Core L#3
        PU L#6 (P#6)
        PU L#7 (P#7)
      L2 L#4 (2048KB) + L1d L#4 (48KB) + L1i L#4 (32KB) + Core L#4
        PU L#8 (P#8)
        PU L#9 (P#9)
      L2 L#5 (2048KB) + L1d L#5 (48KB) + L1i L#5 (32KB) + Core L#5
        PU L#10 (P#10)
        PU L#11 (P#11)
      L2 L#6 (2048KB) + L1d L#6 (48KB) + L1i L#6 (32KB) + Core L#6
        PU L#12 (P#12)
        PU L#13 (P#13)
      L2 L#7 (2048KB) + L1d L#7 (48KB) + L1i L#7 (32KB) + Core L#7
        PU L#14 (P#14)
        PU L#15 (P#15)
      L2 L#8 (4096KB)
        L1d L#8 (32KB) + L1i L#8 (64KB) + Core L#8 + PU L#16 (P#16)
        L1d L#9 (32KB) + L1i L#9 (64KB) + Core L#9 + PU L#17 (P#17)
        L1d L#10 (32KB) + L1i L#10 (64KB) + Core L#10 + PU L#18 (P#18)
        L1d L#11 (32KB) + L1i L#11 (64KB) + Core L#11 + PU L#19 (P#19)
      L2 L#9 (4096KB)
        L1d L#12 (32KB) + L1i L#12 (64KB) + Core L#12 + PU L#20 (P#20)
        L1d L#13 (32KB) + L1i L#13 (64KB) + Core L#13 + PU L#21 (P#21)
        L1d L#14 (32KB) + L1i L#14 (64KB) + Core L#14 + PU L#22 (P#22)
        L1d L#15 (32KB) + L1i L#15 (64KB) + Core L#15 + PU L#23 (P#23)
      L2 L#10 (4096KB)
        L1d L#16 (32KB) + L1i L#16 (64KB) + Core L#16 + PU L#24 (P#24)
        L1d L#17 (32KB) + L1i L#17 (64KB) + Core L#17 + PU L#25 (P#25)
        L1d L#18 (32KB) + L1i L#18 (64KB) + Core L#18 + PU L#26 (P#26)
        L1d L#19 (32KB) + L1i L#19 (64KB) + Core L#19 + PU L#27 (P#27)
      L2 L#11 (4096KB)
        L1d L#20 (32KB) + L1i L#20 (64KB) + Core L#20 + PU L#28 (P#28)
        L1d L#21 (32KB) + L1i L#21 (64KB) + Core L#21 + PU L#29 (P#29)
        L1d L#22 (32KB) + L1i L#22 (64KB) + Core L#22 + PU L#30 (P#30)
        L1d L#23 (32KB) + L1i L#23 (64KB) + Core L#23 + PU L#31 (P#31)
  HostBridge
    PCI 00:02.0 (VGA)
    PCIBridge
      PCI 01:00.0 (NVMExp)
        Block(Disk) "nvme0n1"
    PCI 00:14.3 (Network)
      Net "wlo1"
    PCI 00:17.0 (SATA)
    PCIBridge
      PCI 02:00.0 (Ethernet)
        Net "enp2s0"
    PCIBridge
      PCI 03:00.0 (SATA)

Sincerely, Junyong Heo
