andikleen / pmu-tools

Intel PMU profiling tools
GNU General Public License v2.0

toplev does not handle SIB_THRESHOLD properly #447

Open aayasin opened 1 year ago

aayasin commented 1 year ago

In this example, Frontend_Bound and Backend_Bound are within 5% of each other (24.3% vs. 25.9% of slots), yet toplev flags Core_Bound with <==. It should not. In general, toplev should stop one level above the point where two siblings are too close to each other.

Default in toplev.py is: SIB_THRESH = 0.05
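
The expected behavior can be sketched as follows. This is only an illustration of the rule described above, not toplev's actual code: the `Node` class and `pick_bottleneck` function are hypothetical names, and it assumes the threshold is compared against the absolute difference between the two largest siblings, expressed as a fraction of slots. Only the nodes shown in the output below are included.

```python
from dataclasses import dataclass, field

SIB_THRESH = 0.05  # default quoted in this issue


@dataclass
class Node:
    """Hypothetical stand-in for a topdown node; val is a fraction of slots."""
    name: str
    val: float
    children: list = field(default_factory=list)


def pick_bottleneck(siblings, parent=None, thresh=SIB_THRESH):
    """Return the node to flag, or None if even the top level is ambiguous.

    Descend into the largest sibling only when it leads the runner-up
    by at least `thresh`; otherwise stop one level above.
    """
    if not siblings:
        return parent  # leaf reached: flag the node we came from
    ranked = sorted(siblings, key=lambda n: n.val, reverse=True)
    if len(ranked) > 1 and ranked[0].val - ranked[1].val < thresh:
        return parent  # two siblings too close: do not go deeper
    return pick_bottleneck(ranked[0].children, ranked[0], thresh)


# Values taken from the output in this issue (percent of slots / 100).
top_level = [
    Node("Frontend_Bound", 0.243, [Node("Fetch_Latency", 0.199)]),
    Node("Bad_Speculation", 0.156, [Node("Machine_Clears", 0.106)]),
    Node("Backend_Bound", 0.259, [Node("Core_Bound", 0.231)]),
]

flagged = pick_bottleneck(top_level)
print(flagged.name if flagged else "no node flagged")
```

Under these assumptions, Backend_Bound leads Frontend_Bound by only 0.016, which is below SIB_THRESH, so the sketch flags nothing rather than descending to Core_Bound.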

topdown primary, 2-levels 1 runs ..                                                                                                                                                
# 4.5-full-perf on Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz [clx/skylake]                                                                                                      
FE             Frontend_Bound                      % Slots                        24.3    [ 9.2%]                                                                                  
BAD            Bad_Speculation                     % Slots                        15.6    [ 9.1%]                                                                                  
BE             Backend_Bound                       % Slots                        25.9    [ 9.2%]                                                                                  
Info.Core      CoreIPC                               Core_Metric                   1.30   [ 9.2%]                                                                                  
Info.Inst_Mix  Instructions                          Count         2,875,066,889,010      [ 9.2%]                                                                                  
Info.Inst_Mix  IpTB                                  Inst_Metric                  16.17   [ 9.1%]                                                                                  
FE             Frontend_Bound.Fetch_Latency        % Slots                        19.9    [ 9.1%]                                                                                  
BAD            Bad_Speculation.Machine_Clears      % Slots                        10.6    [ 9.1%]                                                                                  
BE/Core        Backend_Bound.Core_Bound            % Slots                        23.1    [ 9.1%]<==                                                                               
Info.Frontend  DSB_Coverage                          Metric                        0.74   [ 9.1%]                                                                                  
Info.Thread    IPC                                   Metric                        1.30   [ 9.2%]                                                                                  
Info.System    Time                                  Seconds                     564.63                                                                                            
Info.Core      ILP                                   Core_Metric                   2.25   [ 9.2%]                                                                                  
Info.Core      CORE_CLKS                             Count         2,207,226,992,396      [ 9.2%]                                                                                  
Info.Bad_Spec  IpMispredict                          Inst_Metric               1,054.4    [ 9.1%]                                                                                  
Info.Memory    Load_Miss_Real_Latency                Clocks_Latency                 6.01   [ 9.2%]                                                                                 
MUX                                                %                               9.10                                                                                            
Frequency                                            CoreMetric                    3.96   [ 9.1%]                                                                                  

toplev.py used: PERF=../perf /usr/bin/python /media/t/HD2/workspace/ahmad_yasin_perf_tool/perf-tools/pmu-tools/toplev.py --no-desc -vl2 --nodes '+CoreIPC,+Instructions,+CORE_CLKS,+Time,-CPU_Utilization,+Load_Miss_Real_Latency,+L2MPKI,+ILP,+IpTB,+IpMispredict' -V bude-1-ht0-aslr0-vnc0.toplev-vl2-perf.csv --frequency --metric-group +Summary,+DSB -r1 -- <my app>

Attached CSV file for testing: bude-1-ht0-aslr0-vnc0.toplev-vl2-perf.csv

andikleen commented 11 months ago

I think I fixed this at some point. Please double-check whether it still happens, @aayasin.