lambdaclass / cairo-vm

cairo-vm is a Rust implementation of the Cairo VM. Cairo (CPU Algebraic Intermediate Representation) is a programming language for writing provable programs, where one party can prove to another that a certain computation was executed correctly without the need for this party to re-execute the same program.
https://lambdaclass.github.io/cairo-vm
Apache License 2.0
485 stars, 132 forks

[DNM] Fix 1720 perf clone #1758

Closed. juanbono closed this pull request 1 month ago.

juanbono commented 1 month ago

TITLE

Description

Description of the pull request changes and motivation.

Checklist

github-actions[bot] commented 1 month ago
**Hyper Threading Benchmark results**

hyperfine -r 2 -n "hyper_threading_main threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_main' -n "hyper_threading_pr threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 1
  Time (mean ± σ):     27.255 s ±  0.075 s    [User: 26.425 s, System: 0.828 s]
  Range (min … max):   27.202 s … 27.308 s    2 runs

Benchmark 2: hyper_threading_pr threads: 1
  Time (mean ± σ):     27.309 s ±  0.056 s    [User: 26.551 s, System: 0.756 s]
  Range (min … max):   27.270 s … 27.349 s    2 runs

Summary
  'hyper_threading_main threads: 1' ran
    1.00 ± 0.00 times faster than 'hyper_threading_pr threads: 1'

hyperfine -r 2 -n "hyper_threading_main threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_main' -n "hyper_threading_pr threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 2
  Time (mean ± σ):     14.740 s ±  0.198 s    [User: 27.001 s, System: 0.808 s]
  Range (min … max):   14.600 s … 14.880 s    2 runs

Benchmark 2: hyper_threading_pr threads: 2
  Time (mean ± σ):     14.728 s ±  0.031 s    [User: 27.339 s, System: 0.873 s]
  Range (min … max):   14.707 s … 14.750 s    2 runs

Summary
  'hyper_threading_pr threads: 2' ran
    1.00 ± 0.01 times faster than 'hyper_threading_main threads: 2'

hyperfine -r 2 -n "hyper_threading_main threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_main' -n "hyper_threading_pr threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 4
  Time (mean ± σ):     11.151 s ±  0.011 s    [User: 38.600 s, System: 1.001 s]
  Range (min … max):   11.143 s … 11.158 s    2 runs

Benchmark 2: hyper_threading_pr threads: 4
  Time (mean ± σ):     10.839 s ±  0.150 s    [User: 39.059 s, System: 1.067 s]
  Range (min … max):   10.733 s … 10.946 s    2 runs

Summary
  'hyper_threading_pr threads: 4' ran
    1.03 ± 0.01 times faster than 'hyper_threading_main threads: 4'

hyperfine -r 2 -n "hyper_threading_main threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_main' -n "hyper_threading_pr threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 6
  Time (mean ± σ):     10.559 s ±  0.166 s    [User: 39.099 s, System: 1.101 s]
  Range (min … max):   10.442 s … 10.677 s    2 runs

Benchmark 2: hyper_threading_pr threads: 6
  Time (mean ± σ):     11.002 s ±  0.054 s    [User: 38.920 s, System: 1.020 s]
  Range (min … max):   10.963 s … 11.040 s    2 runs

Summary
  'hyper_threading_main threads: 6' ran
    1.04 ± 0.02 times faster than 'hyper_threading_pr threads: 6'

hyperfine -r 2 -n "hyper_threading_main threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_main' -n "hyper_threading_pr threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 8
  Time (mean ± σ):     10.592 s ±  0.017 s    [User: 39.246 s, System: 1.061 s]
  Range (min … max):   10.580 s … 10.604 s    2 runs

Benchmark 2: hyper_threading_pr threads: 8
  Time (mean ± σ):     10.718 s ±  0.130 s    [User: 39.116 s, System: 1.122 s]
  Range (min … max):   10.626 s … 10.810 s    2 runs

Summary
  'hyper_threading_main threads: 8' ran
    1.01 ± 0.01 times faster than 'hyper_threading_pr threads: 8'

hyperfine -r 2 -n "hyper_threading_main threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_main' -n "hyper_threading_pr threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 16
  Time (mean ± σ):     10.589 s ±  0.165 s    [User: 39.672 s, System: 1.065 s]
  Range (min … max):   10.472 s … 10.706 s    2 runs

Benchmark 2: hyper_threading_pr threads: 16
  Time (mean ± σ):     10.654 s ±  0.213 s    [User: 39.685 s, System: 1.144 s]
  Range (min … max):   10.504 s … 10.805 s    2 runs

Summary
  'hyper_threading_main threads: 16' ran
    1.01 ± 0.03 times faster than 'hyper_threading_pr threads: 16'
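
The "N ± M times faster" figures in the summaries above can be reproduced from the reported means and standard deviations. The sketch below assumes the uncertainty comes from first-order error propagation for a ratio of two independent measurements; this matches the numbers in the report, but the function name `relative_speed` is hypothetical and this is not hyperfine's own code.

```python
import math


def relative_speed(mean, stddev, ref_mean, ref_stddev):
    """Return (ratio, uncertainty) of `mean` relative to the faster `ref_mean`.

    Assumption: the two runs are independent, so the relative errors
    add in quadrature (first-order error propagation for a quotient).
    """
    ratio = mean / ref_mean
    rel_err = math.sqrt((stddev / mean) ** 2 + (ref_stddev / ref_mean) ** 2)
    return ratio, ratio * rel_err


# "threads: 4" values copied from the report above:
# main = 11.151 s ± 0.011 s, pr = 10.839 s ± 0.150 s (pr is the faster one)
ratio, err = relative_speed(11.151, 0.011, 10.839, 0.150)
print(f"{ratio:.2f} ± {err:.2f}")  # prints "1.03 ± 0.01"
```

With only two runs per configuration (`-r 2`), the standard deviations themselves are rough estimates, so differences of a few percent between `main` and `pr` should be read with caution.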

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 94.76%. Comparing base (b25dae7) to head (1b8840d).

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main    #1758   +/-   ##
=======================================
  Coverage   94.76%   94.76%
=======================================
  Files         101      101
  Lines       38826    38827    +1
=======================================
+ Hits        36795    36796    +1
  Misses       2031     2031
```


github-actions[bot] commented 1 month ago

Benchmark Results for unmodified programs :rocket:

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base big_factorial | 2.053 ± 0.021 | 2.032 | 2.104 | 1.00 |
| head big_factorial | 2.060 ± 0.013 | 2.045 | 2.088 | 1.00 ± 0.01 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base big_fibonacci | 2.019 ± 0.020 | 1.992 | 2.065 | 1.01 ± 0.02 |
| head big_fibonacci | 1.994 ± 0.040 | 1.963 | 2.101 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base blake2s_integration_benchmark | 7.705 ± 0.059 | 7.605 | 7.761 | 1.03 ± 0.01 |
| head blake2s_integration_benchmark | 7.498 ± 0.043 | 7.405 | 7.560 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base compare_arrays_200000 | 2.146 ± 0.049 | 2.101 | 2.277 | 1.00 ± 0.02 |
| head compare_arrays_200000 | 2.136 ± 0.010 | 2.121 | 2.150 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base dict_integration_benchmark | 1.410 ± 0.011 | 1.402 | 1.435 | 1.00 |
| head dict_integration_benchmark | 1.427 ± 0.006 | 1.416 | 1.436 | 1.01 ± 0.01 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base field_arithmetic_get_square_benchmark | 1.314 ± 0.029 | 1.285 | 1.370 | 1.03 ± 0.02 |
| head field_arithmetic_get_square_benchmark | 1.280 ± 0.011 | 1.270 | 1.307 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base integration_builtins | 7.840 ± 0.484 | 7.587 | 9.202 | 1.04 ± 0.07 |
| head integration_builtins | 7.573 ± 0.184 | 7.411 | 7.990 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base keccak_integration_benchmark | 7.983 ± 0.156 | 7.864 | 8.400 | 1.04 ± 0.02 |
| head keccak_integration_benchmark | 7.701 ± 0.028 | 7.655 | 7.743 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base linear_search | 2.097 ± 0.034 | 2.064 | 2.157 | 1.00 |
| head linear_search | 2.110 ± 0.032 | 2.089 | 2.198 | 1.01 ± 0.02 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base math_cmp_and_pow_integration_benchmark | 1.703 ± 0.059 | 1.675 | 1.871 | 1.00 |
| head math_cmp_and_pow_integration_benchmark | 1.713 ± 0.022 | 1.697 | 1.772 | 1.01 ± 0.04 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base math_integration_benchmark | 1.598 ± 0.012 | 1.586 | 1.627 | 1.00 |
| head math_integration_benchmark | 1.615 ± 0.008 | 1.606 | 1.630 | 1.01 ± 0.01 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base memory_integration_benchmark | 1.201 ± 0.019 | 1.187 | 1.249 | 1.00 |
| head memory_integration_benchmark | 1.201 ± 0.009 | 1.191 | 1.219 | 1.00 ± 0.02 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base operations_with_data_structures_benchmarks | 1.815 ± 0.014 | 1.803 | 1.849 | 1.00 ± 0.01 |
| head operations_with_data_structures_benchmarks | 1.813 ± 0.007 | 1.806 | 1.831 | 1.00 |

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| base pedersen | 522.0 ± 10.4 | 514.0 | 541.5 | 1.00 |
| head pedersen | 523.4 ± 3.7 | 520.5 | 533.2 | 1.00 ± 0.02 |

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| base poseidon_integration_benchmark | 957.1 ± 14.3 | 940.9 | 987.0 | 1.00 ± 0.02 |
| head poseidon_integration_benchmark | 954.4 ± 15.2 | 946.7 | 996.5 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base secp_integration_benchmark | 1.851 ± 0.019 | 1.833 | 1.901 | 1.00 |
| head secp_integration_benchmark | 1.861 ± 0.006 | 1.856 | 1.874 | 1.01 ± 0.01 |

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| base set_integration_benchmark | 639.8 ± 3.7 | 635.8 | 648.8 | 1.00 |
| head set_integration_benchmark | 644.8 ± 8.2 | 638.0 | 667.3 | 1.01 ± 0.01 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| base uint256_integration_benchmark | 4.233 ± 0.082 | 4.155 | 4.429 | 1.01 ± 0.05 |
| head uint256_integration_benchmark | 4.195 ± 0.181 | 4.091 | 4.703 | 1.00 |
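
One way to read the base/head pairs above is to convert each into a signed percentage change and compare the gap against the reported spread. The following is a minimal sketch using a few rows copied from the results; the helper names `percent_change` and `exceeds_noise` are hypothetical, and the noise check (gap larger than the sum of both standard deviations) is a crude heuristic chosen for illustration, not a statistical test used by the benchmark job.

```python
def percent_change(base_mean, head_mean):
    """Signed change of head relative to base, in percent (negative = head is faster)."""
    return (head_mean - base_mean) / base_mean * 100.0


def exceeds_noise(base_mean, base_sd, head_mean, head_sd):
    """Crude heuristic: is the gap between the means larger than both spreads combined?"""
    return abs(head_mean - base_mean) > base_sd + head_sd


# (base mean, base sd, head mean, head sd) in seconds, copied from the results above
rows = {
    "blake2s_integration_benchmark": (7.705, 0.059, 7.498, 0.043),
    "keccak_integration_benchmark": (7.983, 0.156, 7.701, 0.028),
    "big_factorial": (2.053, 0.021, 2.060, 0.013),
}

for name, (bm, bs, hm, hs) in rows.items():
    change = percent_change(bm, hm)
    flagged = exceeds_noise(bm, bs, hm, hs)
    print(f"{name}: {change:+.2f}% (beyond noise: {flagged})")
```

Under this heuristic the `blake2s` and `keccak` rows show head ahead by more than the combined spread, while `big_factorial` does not.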