[Cairo1] Use a cheatcode to relocate all dicts + Make temporary segment usage configurable

fmoletta commented 1 month ago

Includes commits from #1767

- Add the flags `segment_arena_validation` and `use_temporary_segments` to the `Cairo1HintProcessor` and `DictManagerExecScope` respectively. These flags determine whether real segments or temporary segments are used when creating dictionaries.
- `DictManagerExecScope::finalize_segment` no longer performs relocation and is ignored if `use_temporary_segments` is set to false.
- Add the method `DictManagerExecScope::relocate_all_dictionaries`, which adds relocation rules for all tracked dictionaries, placing them one after the other in a new segment.
- Add the cheatcode `RelocateAllDictionaries` to the `Cairo1HintProcessor`, which calls the aforementioned method.
- Add a casm instruction that calls the aforementioned cheatcode in `create_entry_code` if either `proof_mode` or `append_return_values` is set to true and the segment arena is present.
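
The relocation step described above can be sketched with a small standalone model. This is not the cairo-vm implementation: `DictManager`, `TrackedDict`, the field names, and the `HashMap` of rules are hypothetical stand-ins, and returning no rules when real segments are in use is an assumption made for the sketch. It only illustrates mapping each temporary dictionary segment to consecutive offsets in one new real segment.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a VM address: (segment index, offset).
/// Negative segment indices denote temporary segments.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Relocatable {
    segment_index: isize,
    offset: usize,
}

/// Hypothetical tracker entry: where a dictionary starts and how many cells it used.
struct TrackedDict {
    start: Relocatable,
    len: usize,
}

/// Toy model of a dictionary manager; names are illustrative, not cairo-vm's API.
struct DictManager {
    use_temporary_segments: bool,
    trackers: Vec<TrackedDict>,
    next_temp_segment: isize,
    next_real_segment: isize,
}

impl DictManager {
    fn new(use_temporary_segments: bool, first_free_real_segment: isize) -> Self {
        DictManager {
            use_temporary_segments,
            trackers: Vec::new(),
            next_temp_segment: -1,
            next_real_segment: first_free_real_segment,
        }
    }

    /// Allocates a dictionary of `len` cells in a temporary or real segment,
    /// depending on the flag, and returns its start address.
    fn new_dict(&mut self, len: usize) -> Relocatable {
        let segment_index = if self.use_temporary_segments {
            let idx = self.next_temp_segment;
            self.next_temp_segment -= 1;
            idx
        } else {
            let idx = self.next_real_segment;
            self.next_real_segment += 1;
            idx
        };
        let start = Relocatable { segment_index, offset: 0 };
        self.trackers.push(TrackedDict { start, len });
        start
    }

    /// Builds relocation rules that place every tracked dictionary back to back
    /// in a single new real segment. Assumption of this sketch: with real
    /// segments nothing needs relocating, so no rules are produced.
    fn relocate_all_dictionaries(&mut self) -> HashMap<Relocatable, Relocatable> {
        let mut rules = HashMap::new();
        if !self.use_temporary_segments {
            return rules;
        }
        let dst_segment = self.next_real_segment;
        self.next_real_segment += 1;
        let mut offset = 0;
        for dict in &self.trackers {
            let dst = Relocatable { segment_index: dst_segment, offset };
            rules.insert(dict.start, dst);
            offset += dict.len;
        }
        rules
    }
}
```

In this model, two dictionaries of 3 and 4 cells created in temporary segments -1 and -2 end up at offsets 0 and 3 of the same new segment, which is the "one next to the other" layout the description refers to.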

TLDR:

- Add a flag to choose whether real or temporary segments are used for dictionaries
- Add a cheatcode to relocate all dictionaries at the end of the program, instead of relocating while squashing

github-actions[bot] commented 1 month ago

**Hyper Threading Benchmark results**

hyperfine -r 2 -n "hyper_threading_main threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_main' -n "hyper_threading_pr threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 1
  Time (mean ± σ):     26.894 s ±  0.063 s    [User: 26.121 s, System: 0.771 s]
  Range (min … max):   26.849 s … 26.938 s    2 runs

Benchmark 2: hyper_threading_pr threads: 1
  Time (mean ± σ):     26.757 s ±  0.001 s    [User: 25.987 s, System: 0.768 s]
  Range (min … max):   26.756 s … 26.757 s    2 runs

Summary
  'hyper_threading_pr threads: 1' ran
    1.01 ± 0.00 times faster than 'hyper_threading_main threads: 1'

hyperfine -r 2 -n "hyper_threading_main threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_main' -n "hyper_threading_pr threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 2
  Time (mean ± σ):     15.071 s ±  0.069 s    [User: 26.918 s, System: 0.674 s]
  Range (min … max):   15.022 s … 15.120 s    2 runs

Benchmark 2: hyper_threading_pr threads: 2
  Time (mean ± σ):     14.975 s ±  0.013 s    [User: 26.807 s, System: 0.723 s]
  Range (min … max):   14.966 s … 14.983 s    2 runs

Summary
  'hyper_threading_pr threads: 2' ran
    1.01 ± 0.00 times faster than 'hyper_threading_main threads: 2'

hyperfine -r 2 -n "hyper_threading_main threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_main' -n "hyper_threading_pr threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 4
  Time (mean ± σ):     10.551 s ±  0.066 s    [User: 38.326 s, System: 0.908 s]
  Range (min … max):   10.505 s … 10.598 s    2 runs

Benchmark 2: hyper_threading_pr threads: 4
  Time (mean ± σ):     10.563 s ±  0.126 s    [User: 38.066 s, System: 0.937 s]
  Range (min … max):   10.474 s … 10.652 s    2 runs

Summary
  'hyper_threading_main threads: 4' ran
    1.00 ± 0.01 times faster than 'hyper_threading_pr threads: 4'

hyperfine -r 2 -n "hyper_threading_main threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_main' -n "hyper_threading_pr threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 6
  Time (mean ± σ):     10.699 s ±  0.093 s    [User: 38.229 s, System: 0.935 s]
  Range (min … max):   10.633 s … 10.764 s    2 runs

Benchmark 2: hyper_threading_pr threads: 6
  Time (mean ± σ):     10.593 s ±  0.238 s    [User: 38.340 s, System: 0.966 s]
  Range (min … max):   10.424 s … 10.761 s    2 runs

Summary
  'hyper_threading_pr threads: 6' ran
    1.01 ± 0.02 times faster than 'hyper_threading_main threads: 6'

hyperfine -r 2 -n "hyper_threading_main threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_main' -n "hyper_threading_pr threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 8
  Time (mean ± σ):     10.506 s ±  0.063 s    [User: 38.652 s, System: 0.874 s]
  Range (min … max):   10.461 s … 10.551 s    2 runs

Benchmark 2: hyper_threading_pr threads: 8
  Time (mean ± σ):     10.465 s ±  0.018 s    [User: 38.652 s, System: 0.911 s]
  Range (min … max):   10.452 s … 10.477 s    2 runs

Summary
  'hyper_threading_pr threads: 8' ran
    1.00 ± 0.01 times faster than 'hyper_threading_main threads: 8'

hyperfine -r 2 -n "hyper_threading_main threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_main' -n "hyper_threading_pr threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 16
  Time (mean ± σ):     10.458 s ±  0.076 s    [User: 38.941 s, System: 1.041 s]
  Range (min … max):   10.404 s … 10.512 s    2 runs

Benchmark 2: hyper_threading_pr threads: 16
  Time (mean ± σ):     10.249 s ±  0.014 s    [User: 39.000 s, System: 1.016 s]
  Range (min … max):   10.239 s … 10.259 s    2 runs

Summary
  'hyper_threading_pr threads: 16' ran
    1.02 ± 0.01 times faster than 'hyper_threading_main threads: 16'

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 85.71429%, with 12 lines in your changes missing coverage. Please review.

Project coverage is 94.77%. Comparing base (0f4cfc2) to head (5349770).

Files	Patch %	Lines
...t_processor/cairo_1_hint_processor/dict_manager.rs	75.60%	10 Missing :warning:
...processor/cairo_1_hint_processor/hint_processor.rs	90.47%	2 Missing :warning:

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #1776      +/-   ##
==========================================
- Coverage   94.78%   94.77%   -0.01%
==========================================
  Files         101      101
  Lines       39011    39056      +45
==========================================
+ Hits        36976    37015      +39
- Misses       2035     2041       +6
```


github-actions[bot] commented 1 month ago

Benchmark Results for unmodified programs :rocket:

Command	Mean [s]	Min [s]	Max [s]	Relative
`base big_factorial`	2.119 ± 0.022	2.089	2.157	1.00 ± 0.01
`head big_factorial`	2.118 ± 0.017	2.093	2.144	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base big_fibonacci`	2.052 ± 0.014	2.033	2.076	1.00
`head big_fibonacci`	2.071 ± 0.046	2.031	2.189	1.01 ± 0.02

Command	Mean [s]	Min [s]	Max [s]	Relative
`base blake2s_integration_benchmark`	7.829 ± 0.140	7.639	8.042	1.00
`head blake2s_integration_benchmark`	7.879 ± 0.142	7.673	8.117	1.01 ± 0.03

Command	Mean [s]	Min [s]	Max [s]	Relative
`base compare_arrays_200000`	2.203 ± 0.033	2.175	2.284	1.00
`head compare_arrays_200000`	2.233 ± 0.068	2.171	2.352	1.01 ± 0.03

Command	Mean [s]	Min [s]	Max [s]	Relative
`base dict_integration_benchmark`	1.458 ± 0.011	1.444	1.472	1.00
`head dict_integration_benchmark`	1.464 ± 0.012	1.451	1.485	1.00 ± 0.01

Command	Mean [s]	Min [s]	Max [s]	Relative
`base field_arithmetic_get_square_benchmark`	1.276 ± 0.041	1.245	1.381	1.03 ± 0.04
`head field_arithmetic_get_square_benchmark`	1.241 ± 0.014	1.226	1.263	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base integration_builtins`	7.868 ± 0.212	7.699	8.372	1.00
`head integration_builtins`	7.922 ± 0.271	7.690	8.503	1.01 ± 0.04

Command	Mean [s]	Min [s]	Max [s]	Relative
`base keccak_integration_benchmark`	8.358 ± 0.419	8.009	9.461	1.03 ± 0.06
`head keccak_integration_benchmark`	8.093 ± 0.158	7.894	8.271	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base linear_search`	2.149 ± 0.020	2.123	2.182	1.00 ± 0.01
`head linear_search`	2.149 ± 0.023	2.119	2.191	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base math_cmp_and_pow_integration_benchmark`	1.491 ± 0.018	1.469	1.533	1.00
`head math_cmp_and_pow_integration_benchmark`	1.497 ± 0.021	1.477	1.543	1.00 ± 0.02

Command	Mean [s]	Min [s]	Max [s]	Relative
`base math_integration_benchmark`	1.482 ± 0.033	1.461	1.571	1.01 ± 0.02
`head math_integration_benchmark`	1.472 ± 0.008	1.461	1.485	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base memory_integration_benchmark`	1.229 ± 0.009	1.215	1.248	1.00
`head memory_integration_benchmark`	1.230 ± 0.009	1.220	1.245	1.00 ± 0.01

Command	Mean [s]	Min [s]	Max [s]	Relative
`base operations_with_data_structures_benchmarks`	1.566 ± 0.009	1.555	1.578	1.00
`head operations_with_data_structures_benchmarks`	1.595 ± 0.022	1.566	1.643	1.02 ± 0.02

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base pedersen`	524.6 ± 4.1	519.6	533.8	1.00
`head pedersen`	525.7 ± 4.9	520.8	538.3	1.00 ± 0.01

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base poseidon_integration_benchmark`	766.4 ± 4.9	759.6	773.5	1.00
`head poseidon_integration_benchmark`	769.5 ± 17.1	757.3	816.9	1.00 ± 0.02

Command	Mean [s]	Min [s]	Max [s]	Relative
`base secp_integration_benchmark`	1.855 ± 0.025	1.825	1.901	1.00 ± 0.02
`head secp_integration_benchmark`	1.854 ± 0.018	1.833	1.886	1.00

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base set_integration_benchmark`	677.1 ± 6.7	670.2	692.6	1.00 ± 0.01
`head set_integration_benchmark`	676.6 ± 7.2	669.7	692.0	1.00

Command	Mean [s]	Min [s]	Max [s]	Relative
`base uint256_integration_benchmark`	4.412 ± 0.053	4.330	4.480	1.01 ± 0.02
`head uint256_integration_benchmark`	4.370 ± 0.058	4.296	4.456	1.00

lambdaclass / cairo-vm

[Cairo1] Use a cheatcode to relocate all dicts + Make temporary segment usage configurable #1776