gelisam / klister

an implementation of stuck macros
BSD 3-Clause "New" or "Revised" License

Slow tests are slow because of poor branch prediction #230

Open doyougnu opened 4 months ago

doyougnu commented 4 months ago

Spurred by #223, I checked our old evaluator on implicit-conversion-tests. At least on my laptop, this is what I see:


 Performance counter stats for 'cabal test --enable-library-profiling --test-show-details=streaming --test-options=-p implicit-conversion-test --ghc-options=+RTS -hm -S -p  -RTS':

          5,476.91 msec task-clock:u                     #    0.991 CPUs utilized          
                 0      context-switches:u               #    0.000 /sec                   
                 0      cpu-migrations:u                 #    0.000 /sec                   
           215,679      page-faults:u                    #   39.380 K/sec                  
    22,608,634,065      cycles:u                         #    4.128 GHz                    
    43,699,445,726      instructions:u                   #    1.93  insn per cycle         
     9,248,234,634      branches:u                       #    1.689 G/sec                  
        92,336,621      branch-misses:u                  #    1.00% of all branches        
   109,190,586,530      slots:u                          #   19.937 G/sec                  
    37,022,159,543      topdown-retiring:u               #     29.7% Retiring              
    33,638,603,889      topdown-bad-spec:u               #     27.0% Bad Speculation       
    20,394,661,570      topdown-fe-bound:u               #     16.3% Frontend Bound        
    33,697,294,063      topdown-be-bound:u               #     27.0% Backend Bound         

       5.526535826 seconds time elapsed

       5.157156000 seconds user
       0.334127000 seconds sys

We get only a 29.7% retiring rate (the share of pipeline slots that retired useful work). Why so low? Because 27.0% of slots are lost to bad speculation and, consequently, another 27.0% are stalled on the backend.
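
As a sanity check on the numbers above, here is a small Python sketch that recomputes the derived figures from the raw counters. It assumes each topdown percentage is the category's count divided by the sum of the four topdown counters (not by slots:u); that is an inference from the output above, not something confirmed in this thread. The variable names are made up for illustration.

```python
# Raw counters copied from the perf output above.
cycles = 22_608_634_065
instructions = 43_699_445_726
branches = 9_248_234_634
branch_misses = 92_336_621

topdown = {
    "retiring": 37_022_159_543,
    "bad_spec": 33_638_603_889,
    "fe_bound": 20_394_661_570,
    "be_bound": 33_697_294_063,
}

# Assumption: percentages are relative to the sum of the four categories.
total = sum(topdown.values())
topdown_pct = {name: 100 * count / total for name, count in topdown.items()}

# Derived metrics that perf prints in the right-hand column.
ipc = instructions / cycles
branch_miss_pct = 100 * branch_misses / branches

for name, pct in topdown_pct.items():
    print(f"{name:>9}: {pct:.1f}%")
print(f"IPC: {ipc:.2f}")
print(f"branch misses: {branch_miss_pct:.2f}% of all branches")
```

Under that assumption, the sketch reproduces the percentages perf printed, so the four topdown categories can be read as shares of one whole.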