bugfix: Fix `BuiltinRunner::final_stack` for `SegmentArena`

fmoletta commented 5 months ago

PR #1673 unified most implementations of final_stack into a single implementation, but overlooked a small detail that made the implementation different for SegmentArena, introducing a bug. This PR fixes this issue by accounting for SegmentArena's difference in the unified final_stack implementation.

github-actions[bot] commented 5 months ago

**Hyper Thereading Benchmark results**

hyperfine -r 2 -n "hyper_threading_main threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_main' -n "hyper_threading_pr threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 1
  Time (mean ± σ):     27.131 s ±  0.022 s    [User: 26.384 s, System: 0.745 s]
  Range (min … max):   27.115 s … 27.146 s    2 runs

Benchmark 2: hyper_threading_pr threads: 1
  Time (mean ± σ):     27.143 s ±  0.006 s    [User: 26.317 s, System: 0.824 s]
  Range (min … max):   27.138 s … 27.147 s    2 runs

Summary
  'hyper_threading_main threads: 1' ran
    1.00 ± 0.00 times faster than 'hyper_threading_pr threads: 1'

hyperfine -r 2 -n "hyper_threading_main threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_main' -n "hyper_threading_pr threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 2
  Time (mean ± σ):     14.621 s ±  0.023 s    [User: 26.954 s, System: 0.828 s]
  Range (min … max):   14.605 s … 14.637 s    2 runs

Benchmark 2: hyper_threading_pr threads: 2
  Time (mean ± σ):     14.584 s ±  0.029 s    [User: 26.838 s, System: 0.798 s]
  Range (min … max):   14.563 s … 14.604 s    2 runs

Summary
  'hyper_threading_pr threads: 2' ran
    1.00 ± 0.00 times faster than 'hyper_threading_main threads: 2'

hyperfine -r 2 -n "hyper_threading_main threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_main' -n "hyper_threading_pr threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 4
  Time (mean ± σ):     11.083 s ±  0.094 s    [User: 38.612 s, System: 0.992 s]
  Range (min … max):   11.016 s … 11.150 s    2 runs

Benchmark 2: hyper_threading_pr threads: 4
  Time (mean ± σ):     11.138 s ±  0.324 s    [User: 38.637 s, System: 0.979 s]
  Range (min … max):   10.908 s … 11.367 s    2 runs

Summary
  'hyper_threading_main threads: 4' ran
    1.00 ± 0.03 times faster than 'hyper_threading_pr threads: 4'

hyperfine -r 2 -n "hyper_threading_main threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_main' -n "hyper_threading_pr threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 6
  Time (mean ± σ):     10.646 s ±  0.063 s    [User: 39.193 s, System: 0.936 s]
  Range (min … max):   10.602 s … 10.691 s    2 runs

Benchmark 2: hyper_threading_pr threads: 6
  Time (mean ± σ):     10.687 s ±  0.096 s    [User: 38.709 s, System: 1.004 s]
  Range (min … max):   10.619 s … 10.755 s    2 runs

Summary
  'hyper_threading_main threads: 6' ran
    1.00 ± 0.01 times faster than 'hyper_threading_pr threads: 6'

hyperfine -r 2 -n "hyper_threading_main threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_main' -n "hyper_threading_pr threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 8
  Time (mean ± σ):     10.658 s ±  0.038 s    [User: 39.251 s, System: 1.015 s]
  Range (min … max):   10.631 s … 10.685 s    2 runs

Benchmark 2: hyper_threading_pr threads: 8
  Time (mean ± σ):     10.585 s ±  0.145 s    [User: 39.384 s, System: 0.952 s]
  Range (min … max):   10.483 s … 10.688 s    2 runs

Summary
  'hyper_threading_pr threads: 8' ran
    1.01 ± 0.01 times faster than 'hyper_threading_main threads: 8'

hyperfine -r 2 -n "hyper_threading_main threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_main' -n "hyper_threading_pr threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 16
  Time (mean ± σ):     10.476 s ±  0.082 s    [User: 39.713 s, System: 1.023 s]
  Range (min … max):   10.418 s … 10.533 s    2 runs

Benchmark 2: hyper_threading_pr threads: 16
  Time (mean ± σ):     10.550 s ±  0.236 s    [User: 39.450 s, System: 1.050 s]
  Range (min … max):   10.384 s … 10.717 s    2 runs

Summary
  'hyper_threading_main threads: 16' ran
    1.01 ± 0.02 times faster than 'hyper_threading_pr threads: 16'

codecov[bot] commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 94.76%. Comparing base (5da72d9) to head (17db721).

Additional details and impacted files

```diff @@ Coverage Diff @@ ## main #1747 +/- ## ======================================= Coverage 94.76% 94.76% ======================================= Files 101 101 Lines 38804 38816 +12 ======================================= + Hits 36773 36785 +12 Misses 2031 2031 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

github-actions[bot] commented 5 months ago

Benchmark Results for unmodified programs :rocket:

Command	Mean [s]	Min [s]	Max [s]	Relative
`base big_factorial`	2.067 ± 0.089	2.014	2.290	1.02 ± 0.05
`head big_factorial`	2.030 ± 0.024	2.007	2.091	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base big_fibonacci`	1.993 ± 0.028	1.966	2.060	1.01 ± 0.02
`head big_fibonacci`	1.982 ± 0.014	1.957	2.004	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base blake2s_integration_benchmark`	7.558 ± 0.091	7.452	7.727	1.00
`head blake2s_integration_benchmark`	7.616 ± 0.175	7.448	8.070	1.01 ± 0.03

Command	Mean [s]	Min [s]	Max [s]	Relative
`base compare_arrays_200000`	2.100 ± 0.023	2.074	2.160	1.00 ± 0.01
`head compare_arrays_200000`	2.099 ± 0.018	2.078	2.135	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base dict_integration_benchmark`	1.421 ± 0.029	1.396	1.492	1.03 ± 0.02
`head dict_integration_benchmark`	1.376 ± 0.011	1.364	1.402	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base field_arithmetic_get_square_benchmark`	1.290 ± 0.008	1.280	1.304	1.00 ± 0.01
`head field_arithmetic_get_square_benchmark`	1.284 ± 0.014	1.266	1.320	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base integration_builtins`	7.622 ± 0.177	7.481	8.099	1.00 ± 0.03
`head integration_builtins`	7.618 ± 0.112	7.473	7.822	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base keccak_integration_benchmark`	8.175 ± 0.696	7.747	9.975	1.04 ± 0.09
`head keccak_integration_benchmark`	7.826 ± 0.093	7.707	7.994	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base linear_search`	2.062 ± 0.016	2.036	2.082	1.01 ± 0.01
`head linear_search`	2.051 ± 0.012	2.037	2.075	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base math_cmp_and_pow_integration_benchmark`	1.684 ± 0.007	1.676	1.696	1.00 ± 0.01
`head math_cmp_and_pow_integration_benchmark`	1.681 ± 0.021	1.661	1.732	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base math_integration_benchmark`	1.598 ± 0.013	1.591	1.634	1.01 ± 0.02
`head math_integration_benchmark`	1.582 ± 0.021	1.568	1.638	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base memory_integration_benchmark`	1.198 ± 0.037	1.172	1.280	1.01 ± 0.03
`head memory_integration_benchmark`	1.183 ± 0.017	1.172	1.230	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base operations_with_data_structures_benchmarks`	1.804 ± 0.010	1.790	1.828	1.00 ± 0.01
`head operations_with_data_structures_benchmarks`	1.796 ± 0.011	1.788	1.827	1.00

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base pedersen`	513.5 ± 4.1	510.0	524.6	1.00 ± 0.01
`head pedersen`	513.5 ± 4.4	510.2	524.5	1.00

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base poseidon_integration_benchmark`	949.9 ± 21.1	939.3	1008.5	1.01 ± 0.02
`head poseidon_integration_benchmark`	940.9 ± 3.3	937.3	946.8	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base secp_integration_benchmark`	1.854 ± 0.009	1.843	1.877	1.00
`head secp_integration_benchmark`	1.864 ± 0.010	1.847	1.880	1.01 ± 0.01

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base set_integration_benchmark`	640.3 ± 6.1	635.1	653.8	1.00 ± 0.01
`head set_integration_benchmark`	637.6 ± 2.2	633.2	640.7	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base uint256_integration_benchmark`	4.229 ± 0.065	4.147	4.343	1.00
`head uint256_integration_benchmark`	4.243 ± 0.273	4.108	5.011	1.00 ± 0.07

lambdaclass / cairo-vm

bugfix: Fix `BuiltinRunner::final_stack` for `SegmentArena` #1747

Codecov Report