Open edisonchan opened 4 years ago
You can create the required performance groups yourself: https://github.com/RRZE-HPC/likwid/wiki/likwid-perfctr#defining-custom-performance-groups
It might be difficult to create the groups because some of them need more events than there are physical counters. Of course, I would be interested in integrating them into LIKWID.
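To illustrate the counter limitation, here is a minimal sketch that splits the events of the requested top-level metrics into batches that fit the available counters, so each batch could become its own measurement run. The helper name `split_into_runs` and the counter count of 4 are assumptions for illustration only; the real number of programmable counters depends on the CPU, and this is not part of LIKWID.

```python
def split_into_runs(events, num_counters):
    """Split an event list into batches of at most num_counters events.

    Each batch can be measured in a separate run. Combining results from
    different runs assumes the workload behaves the same in every run.
    """
    if num_counters < 1:
        raise ValueError("num_counters must be at least 1")
    return [events[i:i + num_counters]
            for i in range(0, len(events), num_counters)]


# Events needed for the top-level metrics named in the request.
TOP_LEVEL_EVENTS = [
    "DECODER_EMPTY",
    "RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS",
    "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED",
    "RETIRED_INDIRECT_BRANCHES_MISPREDICTED",
    "UNHALTED_REFERENCE_CYCLES",
    "RETIRED_UOPS",
]

if __name__ == "__main__":
    # 4 counters is an assumed value, not a statement about any specific CPU.
    for run_id, batch in enumerate(split_into_runs(TOP_LEVEL_EVENTS, 4)):
        print(run_id, batch)
```

With the assumed 4 counters, the six events need two runs, which is why a single performance group cannot cover every metric at once.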
Is your feature request related to a problem? Please describe.
I would like an easy way to identify the bottleneck of programs running on AMD CPUs.
Describe the solution you'd like
From the article "Top-Down Characterization Approximation based on performance counters architecture for AMD processors" by Mateusz Jarus and Ariel Oleksiak.
In Table 2 of the article:

Top-level metrics:
- Frontend Bound: DECODER_EMPTY / slots
- Bad Speculation: (RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS + RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED + RETIRED_INDIRECT_BRANCHES_MISPREDICTED) * (branch misprediction cost) / UNHALTED_REFERENCE_CYCLES
- Retiring: RETIRED_UOPS / slots
- Backend Bound: 100% - (Frontend Bound + Bad Speculation + Retiring)
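As a rough sketch, the top-level formulas could be evaluated from raw counter values as shown below. The function name `top_level_metrics`, the dictionary layout, and the way `slots` and the branch misprediction cost are passed in are assumptions; the request does not define `slots` or give a cost value, so both are left as caller-supplied parameters. Results are fractions (multiply by 100 for percent).

```python
def top_level_metrics(counts, slots, mispredict_cost):
    """Evaluate the four top-level metrics as fractions in [0, 1].

    counts          -- dict mapping event name to raw counter value
    slots           -- total pipeline slots (definition not given in the
                       request; supplied by the caller)
    mispredict_cost -- assumed cost of one branch misprediction in cycles
    """
    cycles = counts["UNHALTED_REFERENCE_CYCLES"]
    mispredicted = (
        counts["RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS"]
        + counts["RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED"]
        + counts["RETIRED_INDIRECT_BRANCHES_MISPREDICTED"]
    )
    frontend = counts["DECODER_EMPTY"] / slots
    bad_speculation = mispredicted * mispredict_cost / cycles
    retiring = counts["RETIRED_UOPS"] / slots
    # Backend Bound is whatever remains after the other three categories.
    backend = 1.0 - (frontend + bad_speculation + retiring)
    return {
        "Frontend Bound": frontend,
        "Bad Speculation": bad_speculation,
        "Retiring": retiring,
        "Backend Bound": backend,
    }


if __name__ == "__main__":
    # Made-up counter values, for illustration only.
    example = {
        "DECODER_EMPTY": 100,
        "RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS": 5,
        "RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED": 3,
        "RETIRED_INDIRECT_BRANCHES_MISPREDICTED": 2,
        "UNHALTED_REFERENCE_CYCLES": 1000,
        "RETIRED_UOPS": 400,
    }
    for name, value in top_level_metrics(example, 1000, 20).items():
        print(f"{name}: {value:.1%}")
```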
Frontend Bound subcategories:
- ICache Misses: INSTRUCTION_FETCH_STALL / slots
- ITLB Misses: (L1_ITLB_MISS_AND_L2_ITLB_HIT + L1_ITLB_MISS_AND_L2_ITLB_MISS) * (iTLB miss cost) / UNHALTED_REFERENCE_CYCLES
Backend Bound subcategories:
- Memory Bound: MAB_WAIT:0 / UNHALTED_REFERENCE_CYCLES
- Core Bound: Backend Bound - Memory Bound
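The Frontend Bound and Backend Bound subcategories can be sketched the same way. The function name `second_level_metrics` and its parameters are hypothetical; the iTLB miss cost is not given in the request and is passed in by the caller, as is the Backend Bound fraction computed from the top-level metrics.

```python
def second_level_metrics(counts, slots, itlb_miss_cost, backend_bound):
    """Evaluate Frontend Bound and Backend Bound subcategories as fractions.

    counts         -- dict mapping event name to raw counter value
    slots          -- total pipeline slots (supplied by the caller)
    itlb_miss_cost -- assumed cost of one iTLB miss in cycles
    backend_bound  -- Backend Bound fraction from the top-level metrics
    """
    cycles = counts["UNHALTED_REFERENCE_CYCLES"]
    itlb_misses = (
        counts["L1_ITLB_MISS_AND_L2_ITLB_HIT"]
        + counts["L1_ITLB_MISS_AND_L2_ITLB_MISS"]
    )
    memory_bound = counts["MAB_WAIT:0"] / cycles
    return {
        "ICache Misses": counts["INSTRUCTION_FETCH_STALL"] / slots,
        "ITLB Misses": itlb_misses * itlb_miss_cost / cycles,
        "Memory Bound": memory_bound,
        # Core Bound is the part of Backend Bound not explained by memory.
        "Core Bound": backend_bound - memory_bound,
    }


if __name__ == "__main__":
    # Made-up counter values, for illustration only.
    example = {
        "INSTRUCTION_FETCH_STALL": 50,
        "L1_ITLB_MISS_AND_L2_ITLB_HIT": 4,
        "L1_ITLB_MISS_AND_L2_ITLB_MISS": 1,
        "MAB_WAIT:0": 200,
        "UNHALTED_REFERENCE_CYCLES": 1000,
    }
    for name, value in second_level_metrics(example, 1000, 10, 0.3).items():
        print(f"{name}: {value:.1%}")
```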
Retiring subcategories:
- x87 instructions: RETIRED_MMX_FP_INSTRUCTIONS:X87
- MMX instructions: RETIRED_MMX_FP_INSTRUCTIONS:MMX
- SSE instructions: RETIRED_MMX_FP_INSTRUCTIONS:SSE
- CLFLUSH instructions: RETIRED_CLFLUSH_INSTRUCTIONS
- CPUID instructions: RETIRED_CPUID_INSTRUCTIONS
- Branch instructions: RETIRED_BRANCH_INSTRUCTIONS
- Taken branch instructions: RETIRED_TAKEN_BRANCH_INSTRUCTIONS
- Far control transfers: RETIRED_FAR_CONTROL_TRANSFERS
Describe alternatives you've considered
Not yet.