markhun / 2023W-EFFPROG

Bisect labeling search space #6

Closed markhun closed 6 months ago

markhun commented 6 months ago

Benchmarked locally with perf stat, 5 runs per configuration.

Before:

40 solution(s), 15808871 leafs visited

 Performance counter stats for './bin/magichex 4 3 14 33 30 34 39 6 24 20' (5 runs):

         65.844,55 msec task-clock:u                     #    1,000 CPUs utilized               ( +-  0,30% )
                 0      context-switches:u               #    0,000 /sec                      
                 0      cpu-migrations:u                 #    0,000 /sec                      
                77      page-faults:u                    #    1,169 /sec                        ( +-  0,26% )
   245.103.603.720      cycles:u                         #    3,722 GHz                         ( +-  0,15% )  (62,49%)
   675.151.085.846      instructions:u                   #    2,75  insn per cycle              ( +-  0,01% )  (75,00%)
   162.112.116.306      branches:u                       #    2,462 G/sec                       ( +-  0,00% )  (74,99%)
     1.629.041.536      branch-misses:u                  #    1,00% of all branches             ( +-  0,04% )  (75,00%)
   176.607.507.442      L1-dcache-loads:u                #    2,682 G/sec                       ( +-  0,01% )  (75,00%)
         8.536.563      L1-dcache-load-misses:u          #    0,00% of all L1-dcache accesses   ( +-  6,88% )  (75,01%)
         1.171.455      LLC-loads:u                      #   17,791 K/sec                       ( +-  9,45% )  (50,00%)
           225.073      LLC-load-misses:u                #   19,21% of all L1-icache accesses   ( +- 10,73% )  (50,00%)

            65,863 +- 0,194 seconds time elapsed  ( +-  0,29% )

After:

40 solution(s), 4930863 leafs visited

 Performance counter stats for './bin/magichex 4 3 14 33 30 34 39 6 24 20' (5 runs):

         50.395,53 msec task-clock:u                     #    1,000 CPUs utilized               ( +-  0,42% )
                 0      context-switches:u               #    0,000 /sec                      
                 0      cpu-migrations:u                 #    0,000 /sec                      
                77      page-faults:u                    #    1,528 /sec                        ( +-  0,49% )
   186.660.184.165      cycles:u                         #    3,704 GHz                         ( +-  0,15% )  (62,50%)
   538.175.654.709      instructions:u                   #    2,88  insn per cycle              ( +-  0,01% )  (75,00%)
   129.670.542.585      branches:u                       #    2,573 G/sec                       ( +-  0,01% )  (75,01%)
     1.308.051.223      branch-misses:u                  #    1,01% of all branches             ( +-  0,06% )  (75,00%)
   134.178.260.527      L1-dcache-loads:u                #    2,663 G/sec                       ( +-  0,01% )  (75,01%)
        11.572.854      L1-dcache-load-misses:u          #    0,01% of all L1-dcache accesses   ( +-  6,70% )  (75,00%)
         1.570.867      LLC-loads:u                      #   31,171 K/sec                       ( +- 11,42% )  (49,99%)
           262.037      LLC-load-misses:u                #   16,68% of all L1-icache accesses   ( +-  5,82% )  (49,99%)

            50,407 +- 0,210 seconds time elapsed  ( +-  0,42% )
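
The counters above are printed in a locale that uses `.` as the thousands separator and `,` as the decimal mark, so `65.844,55 msec` is about 65.8 seconds. The following is a minimal Python sketch for turning the reported figures into ratios. `parse_perf_number` and `ratio` are hypothetical helper names, not part of perf or of this repository, and the inputs are copied from the two runs above.

```python
def parse_perf_number(text: str) -> float:
    """Convert a perf stat figure such as '65.844,55' to a float.

    Assumes '.' groups thousands and ',' marks decimals, as in the output above.
    """
    return float(text.replace(".", "").replace(",", "."))


def ratio(before: float, after: float) -> float:
    """Return how many times larger `before` is than `after`."""
    return before / after


# Values copied from the "Before" and "After" reports above.
elapsed_before = parse_perf_number("65,863")  # seconds time elapsed
elapsed_after = parse_perf_number("50,407")   # seconds time elapsed
leaves_before = 15808871                      # "leafs visited"
leaves_after = 4930863
instr_before = parse_perf_number("675.151.085.846")  # instructions:u
instr_after = parse_perf_number("538.175.654.709")

speedup = ratio(elapsed_before, elapsed_after)
leaf_reduction = ratio(leaves_before, leaves_after)
instr_reduction = ratio(instr_before, instr_after)

print(f"wall-clock speedup:    {speedup:.2f}x")
print(f"leaf reduction:        {leaf_reduction:.2f}x")
print(f"instruction reduction: {instr_reduction:.2f}x")
```

On these inputs the elapsed time improves by roughly 1.31x and the instruction count by roughly 1.25x, while the number of visited leaves drops by roughly 3.21x.
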

markhun commented 6 months ago

Rebased on top of main (spiral access pattern) and benchmarked locally again:

Before adding bisection of the search space:

40 solution(s), 2270926 leafs visited

 Performance counter stats for './bin/magichex 4 3 14 33 30 34 39 6 24 20' (5 runs):

          7.844,51 msec task-clock:u                     #    1,000 CPUs utilized               ( +-  1,13% )
                 0      context-switches:u               #    0,000 /sec                      
                 0      cpu-migrations:u                 #    0,000 /sec                      
                77      page-faults:u                    #    9,816 /sec                        ( +-  0,49% )
    25.108.781.218      cycles:u                         #    3,201 GHz                         ( +-  0,30% )  (62,45%)
    66.173.433.794      instructions:u                   #    2,64  insn per cycle              ( +-  0,03% )  (74,96%)
    14.245.497.376      branches:u                       #    1,816 G/sec                       ( +-  0,05% )  (74,98%)
       125.255.986      branch-misses:u                  #    0,88% of all branches             ( +-  0,08% )  (75,01%)
    18.529.208.063      L1-dcache-loads:u                #    2,362 G/sec                       ( +-  0,02% )  (75,04%)
         3.575.200      L1-dcache-load-misses:u          #    0,02% of all L1-dcache accesses   ( +-  3,80% )  (75,04%)
           397.838      LLC-loads:u                      #   50,715 K/sec                       ( +-  7,62% )  (49,98%)
           111.003      LLC-load-misses:u                #   27,90% of all L1-icache accesses   ( +- 10,99% )  (49,95%)

            7,8474 +- 0,0880 seconds time elapsed  ( +-  1,12% )

After adding the bisection:

40 solution(s), 397928 leafs visited

 Performance counter stats for './bin/magichex 4 3 14 33 30 34 39 6 24 20' (5 runs):

          4.992,97 msec task-clock:u                     #    0,999 CPUs utilized               ( +-  0,48% )
                 0      context-switches:u               #    0,000 /sec                      
                 0      cpu-migrations:u                 #    0,000 /sec                      
                75      page-faults:u                    #   15,021 /sec                        ( +-  0,73% )
    15.923.544.738      cycles:u                         #    3,189 GHz                         ( +-  0,35% )  (62,24%)
    40.075.257.183      instructions:u                   #    2,52  insn per cycle              ( +-  0,03% )  (74,83%)
     8.600.157.232      branches:u                       #    1,722 G/sec                       ( +-  0,03% )  (75,01%)
        82.663.397      branch-misses:u                  #    0,96% of all branches             ( +-  0,05% )  (75,16%)
    11.154.589.310      L1-dcache-loads:u                #    2,234 G/sec                       ( +-  0,03% )  (75,11%)
         1.833.140      L1-dcache-load-misses:u          #    0,02% of all L1-dcache accesses   ( +-  5,61% )  (75,06%)
           292.130      LLC-loads:u                      #   58,508 K/sec                       ( +-  3,86% )  (49,88%)
            88.981      LLC-load-misses:u                #   30,46% of all L1-icache accesses   ( +-  1,93% )  (49,77%)

            4,9991 +- 0,0215 seconds time elapsed  ( +-  0,43% )
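
To relate the drop in visited leaves to the drop in run time for the rebased measurements, the figures can be normalized per leaf. This is a small Python sketch under the assumption that the three numbers below were copied correctly from the two reports above. The `Run` class is a hypothetical helper for illustration, not part of the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """Figures copied from one perf stat report above."""

    leaves: int        # "leafs visited" printed by magichex
    instructions: int  # instructions:u
    elapsed_s: float   # "seconds time elapsed"

    @property
    def instructions_per_leaf(self) -> float:
        return self.instructions / self.leaves

    @property
    def microseconds_per_leaf(self) -> float:
        return self.elapsed_s * 1e6 / self.leaves


before = Run(leaves=2_270_926, instructions=66_173_433_794, elapsed_s=7.8474)
after = Run(leaves=397_928, instructions=40_075_257_183, elapsed_s=4.9991)

print(f"speedup:           {before.elapsed_s / after.elapsed_s:.2f}x")
print(f"leaf reduction:    {before.leaves / after.leaves:.2f}x")
print(f"instr/leaf before: {before.instructions_per_leaf:,.0f}")
print(f"instr/leaf after:  {after.instructions_per_leaf:,.0f}")
print(f"us/leaf before:    {before.microseconds_per_leaf:.2f}")
print(f"us/leaf after:     {after.microseconds_per_leaf:.2f}")
```

On these figures the bisection cuts visited leaves by about 5.7x but elapsed time by about 1.57x, and the instruction count per visited leaf rises from roughly 29 thousand to roughly 101 thousand. The counters alone do not show where that extra per-leaf work is spent.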