Closed chrisiKern closed 9 months ago
Previous result:
40 solution(s), 15808871 leafs visited
Performance counter stats for './bin/magichex 4 3 14 33 30 34 39 6 24 20':
239324431287 cycles:u
783347725558 instructions:u # 3.27 insn per cycle
223306441639 branches:u
1975057745 branch-misses:u # 0.88% of all branches
115449 L1-dcache-load-misses
51.166236323 seconds time elapsed
51.027572000 seconds user
0.000000000 seconds sys
New result:
40 solution(s), 14286898 leafs visited
Performance counter stats for './bin/magichex 4 3 14 33 30 34 39 6 24 20':
190321993957 cycles:u
610965468734 instructions:u # 3.21 insn per cycle
173402836494 branches:u
1399145179 branch-misses:u # 0.81% of all branches
56907 L1-dcache-load-misses
40.739878446 seconds time elapsed
40.587102000 seconds user
0.000000000 seconds sys
This corresponds to a speedup of ~1.26.
@markhun The tests seem to fail, but I believe this is simply due to the fact that the solutions are printed in a different order.
Performance counter stats for './bin/magichex 4 3 14 33 30 34 39 6 24 20':
34818444413 cycles:u
93923335509 instructions:u # 2.70 insn per cycle
26642066759 branches:u
451358330 branch-misses:u # 1.69% of all branches
78515 L1-dcache-load-misses
7.426998087 seconds time elapsed
7.427000000 seconds user
0.000000000 seconds sys
This corresponds to a speedup of 6.89.
Pre-generating the order instead of computing the values on the go improved the running time even more.
40 solution(s), 2720000 leafs visited
Performance counter stats for './bin/magichex 4 3 14 33 30 34 39 6 24 20':
32006826745 cycles:u
93908462572 instructions:u # 2.93 insn per cycle
26639092246 branches:u
210028006 branch-misses:u # 0.79% of all branches
52606 L1-dcache-load-misses
6.827565039 seconds time elapsed
6.827594000 seconds user
0.000000000 seconds sys
40 solution(s), 40307012 leafs visited
Performance counter stats for './bin/magichex 4 3 14 33 30 34 39 6 24 20':
569338530278 cycles:u
1592037752274 instructions:u # 2.80 insn per cycle
450239292852 branches:u
3980993285 branch-misses:u # 0.88% of all branches
173202 L1-dcache-load-misses
121.511262278 seconds time elapsed
121.348029000 seconds user
0.003999000 seconds sys
Performance was clearly quite bad, even worse than going line by line. I believe this is because we restrict ourselves to a set of lines (the spokes) and therefore do not cover enough ground during the branching.
40 solution(s), 2270926 leafs visited
Performance counter stats for './bin/magichex 4 3 14 33 30 34 39 6 24 20':
29124691514 cycles:u
88695799255 instructions:u # 3.05 insn per cycle
25315321219 branches:u
172645169 branch-misses:u # 0.68% of all branches
87546 L1-dcache-load-misses
6.212748436 seconds time elapsed
6.212758000 seconds user
0.000000000 seconds sys
This corresponds to a speedup of 9.85, outperforming the regular spiral order.
Merging main
into our branch and additionally storing the order as a global variable yielded some further improvements.
40 solution(s), 2270926 leafs visited
Performance counter stats for './bin/magichex 4 3 14 33 30 34 39 6 24 20':
20474296368 cycles:u
64568801434 instructions:u # 3.15 insn per cycle
14233207493 branches:u
105657822 branch-misses:u # 0.74% of all branches
179975 L1-dcache-load-misses
4.191709594 seconds time elapsed
4.191651000 seconds user
0.000000000 seconds sys
Since the cache misses were quite high in comparison, in #8 I reintroduced the order as a parameter of labeling, yielding slightly better results.
40 solution(s), 2270926 leafs visited
Performance counter stats for './bin/magichex 4 3 14 33 30 34 39 6 24 20':
19540445479 cycles:u
64569867459 instructions:u # 3.30 insn per cycle
14233207435 branches:u
104955172 branch-misses:u # 0.74% of all branches
176459 L1-dcache-load-misses
4.001851060 seconds time elapsed
4.001849000 seconds user
0.000000000 seconds sys
This only slightly reduced the number of cache misses, but the running time is visibly better.
Implement different exploration orders inside the
labeling
function, as this is hinted to increase performance significantly.Possible heuristics for the exploration order: