fmoletta closed this 4 months ago
**Hyper Threading Benchmark results**
hyperfine -r 2 -n "hyper_threading_main threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_main' -n "hyper_threading_pr threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 1
Time (mean ± σ): 27.212 s ± 0.002 s [User: 26.331 s, System: 0.879 s]
Range (min … max): 27.211 s … 27.214 s 2 runs
Benchmark 2: hyper_threading_pr threads: 1
Time (mean ± σ): 26.887 s ± 0.077 s [User: 26.107 s, System: 0.778 s]
Range (min … max): 26.833 s … 26.942 s 2 runs
Summary
'hyper_threading_pr threads: 1' ran
1.01 ± 0.00 times faster than 'hyper_threading_main threads: 1'
hyperfine -r 2 -n "hyper_threading_main threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_main' -n "hyper_threading_pr threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 2
Time (mean ± σ): 14.597 s ± 0.013 s [User: 26.937 s, System: 0.829 s]
Range (min … max): 14.587 s … 14.606 s 2 runs
Benchmark 2: hyper_threading_pr threads: 2
Time (mean ± σ): 14.787 s ± 0.017 s [User: 26.764 s, System: 0.793 s]
Range (min … max): 14.776 s … 14.799 s 2 runs
Summary
'hyper_threading_main threads: 2' ran
1.01 ± 0.00 times faster than 'hyper_threading_pr threads: 2'
hyperfine -r 2 -n "hyper_threading_main threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_main' -n "hyper_threading_pr threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 4
Time (mean ± σ): 11.112 s ± 0.007 s [User: 38.620 s, System: 0.992 s]
Range (min … max): 11.107 s … 11.117 s 2 runs
Benchmark 2: hyper_threading_pr threads: 4
Time (mean ± σ): 10.602 s ± 0.413 s [User: 38.032 s, System: 0.933 s]
Range (min … max): 10.310 s … 10.893 s 2 runs
Summary
'hyper_threading_pr threads: 4' ran
1.05 ± 0.04 times faster than 'hyper_threading_main threads: 4'
hyperfine -r 2 -n "hyper_threading_main threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_main' -n "hyper_threading_pr threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 6
Time (mean ± σ): 10.734 s ± 0.229 s [User: 39.018 s, System: 0.998 s]
Range (min … max): 10.572 s … 10.896 s 2 runs
Benchmark 2: hyper_threading_pr threads: 6
Time (mean ± σ): 10.630 s ± 0.346 s [User: 38.248 s, System: 0.970 s]
Range (min … max): 10.385 s … 10.875 s 2 runs
Summary
'hyper_threading_pr threads: 6' ran
1.01 ± 0.04 times faster than 'hyper_threading_main threads: 6'
hyperfine -r 2 -n "hyper_threading_main threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_main' -n "hyper_threading_pr threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 8
Time (mean ± σ): 10.591 s ± 0.118 s [User: 39.393 s, System: 1.006 s]
Range (min … max): 10.508 s … 10.674 s 2 runs
Benchmark 2: hyper_threading_pr threads: 8
Time (mean ± σ): 10.379 s ± 0.040 s [User: 38.488 s, System: 1.041 s]
Range (min … max): 10.351 s … 10.407 s 2 runs
Summary
'hyper_threading_pr threads: 8' ran
1.02 ± 0.01 times faster than 'hyper_threading_main threads: 8'
hyperfine -r 2 -n "hyper_threading_main threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_main' -n "hyper_threading_pr threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 16
Time (mean ± σ): 10.662 s ± 0.041 s [User: 39.583 s, System: 1.014 s]
Range (min … max): 10.633 s … 10.691 s 2 runs
Benchmark 2: hyper_threading_pr threads: 16
Time (mean ± σ): 10.320 s ± 0.090 s [User: 38.833 s, System: 1.086 s]
Range (min … max): 10.257 s … 10.384 s 2 runs
Summary
'hyper_threading_pr threads: 16' ran
1.03 ± 0.01 times faster than 'hyper_threading_main threads: 16'
Benchmark Results for unmodified programs :rocket:
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base big_factorial | 2.041 ± 0.010 | 2.030 | 2.060 | 1.00 |
| head big_factorial | 2.058 ± 0.059 | 2.027 | 2.225 | 1.01 ± 0.03 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base big_fibonacci | 1.993 ± 0.014 | 1.976 | 2.020 | 1.00 |
| head big_fibonacci | 2.000 ± 0.018 | 1.979 | 2.031 | 1.00 ± 0.01 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base blake2s_integration_benchmark | 7.597 ± 0.076 | 7.490 | 7.721 | 1.00 |
| head blake2s_integration_benchmark | 7.619 ± 0.151 | 7.457 | 7.952 | 1.00 ± 0.02 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base compare_arrays_200000 | 2.120 ± 0.029 | 2.094 | 2.175 | 1.01 ± 0.02 |
| head compare_arrays_200000 | 2.107 ± 0.018 | 2.086 | 2.138 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base dict_integration_benchmark | 1.422 ± 0.020 | 1.407 | 1.478 | 1.01 ± 0.02 |
| head dict_integration_benchmark | 1.402 ± 0.006 | 1.394 | 1.414 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base field_arithmetic_get_square_benchmark | 1.291 ± 0.017 | 1.276 | 1.336 | 1.00 ± 0.02 |
| head field_arithmetic_get_square_benchmark | 1.289 ± 0.013 | 1.275 | 1.316 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base integration_builtins | 7.680 ± 0.134 | 7.520 | 7.985 | 1.01 ± 0.02 |
| head integration_builtins | 7.624 ± 0.077 | 7.481 | 7.722 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base keccak_integration_benchmark | 7.873 ± 0.188 | 7.725 | 8.367 | 1.00 |
| head keccak_integration_benchmark | 7.895 ± 0.100 | 7.710 | 8.026 | 1.00 ± 0.03 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base linear_search | 2.065 ± 0.011 | 2.051 | 2.086 | 1.00 |
| head linear_search | 2.087 ± 0.031 | 2.053 | 2.144 | 1.01 ± 0.02 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base math_cmp_and_pow_integration_benchmark | 1.693 ± 0.006 | 1.681 | 1.701 | 1.01 ± 0.01 |
| head math_cmp_and_pow_integration_benchmark | 1.675 ± 0.010 | 1.663 | 1.694 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base math_integration_benchmark | 1.598 ± 0.019 | 1.584 | 1.650 | 1.01 ± 0.02 |
| head math_integration_benchmark | 1.585 ± 0.016 | 1.563 | 1.621 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base memory_integration_benchmark | 1.191 ± 0.004 | 1.184 | 1.200 | 1.00 ± 0.01 |
| head memory_integration_benchmark | 1.188 ± 0.008 | 1.175 | 1.196 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base operations_with_data_structures_benchmarks | 1.828 ± 0.043 | 1.799 | 1.945 | 1.02 ± 0.02 |
| head operations_with_data_structures_benchmarks | 1.798 ± 0.006 | 1.790 | 1.809 | 1.00 |

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| base pedersen | 524.8 ± 4.8 | 519.0 | 535.5 | 1.02 ± 0.01 |
| head pedersen | 514.6 ± 5.7 | 511.8 | 530.7 | 1.00 |

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| base poseidon_integration_benchmark | 964.9 ± 4.3 | 957.5 | 971.3 | 1.00 |
| head poseidon_integration_benchmark | 965.9 ± 6.2 | 959.2 | 979.2 | 1.00 ± 0.01 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base secp_integration_benchmark | 1.857 ± 0.020 | 1.838 | 1.898 | 1.01 ± 0.01 |
| head secp_integration_benchmark | 1.848 ± 0.015 | 1.830 | 1.873 | 1.00 |

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| base set_integration_benchmark | 643.8 ± 5.4 | 639.3 | 657.7 | 1.00 |
| head set_integration_benchmark | 660.0 ± 2.2 | 658.2 | 665.7 | 1.03 ± 0.01 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base uint256_integration_benchmark | 4.206 ± 0.037 | 4.168 | 4.291 | 1.00 |
| head uint256_integration_benchmark | 4.245 ± 0.064 | 4.140 | 4.338 | 1.01 ± 0.02 |
PR #1720 added a small error variant to `CairoRunError`, which brought a huge performance regression. The cause is the `VmException` variant: it is large, and because an enum is as big as its largest variant, every other variant ends up equally big. This PR solves the issue by wrapping the `VmException` contained in its corresponding variant, and adds a test to ensure that the size of `CairoRunError` doesn't surpass 32 bytes.
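
As a rough illustration of the size effect described above, here is a minimal sketch. It assumes that "wrapping" means putting the payload behind a `Box`. The `VmException` struct, its fields, and the two enums are hypothetical stand-ins, not the actual cairo-vm definitions.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a large error payload.
// The real VmException has different fields.
#[allow(dead_code)]
#[derive(Debug)]
struct VmException {
    pc: usize,
    inst_location: [u64; 16],
    message: String,
}

// Payload stored inline: the enum must reserve room for its
// largest variant, so every value is at least that big.
#[allow(dead_code)]
#[derive(Debug)]
enum UnboxedError {
    Program(u32),
    VmException(VmException),
}

// Payload stored on the heap (assumed wrapper: Box): the variant
// only holds a pointer, so the enum stays small.
#[allow(dead_code)]
#[derive(Debug)]
enum BoxedError {
    Program(u32),
    VmException(Box<VmException>),
}

fn main() {
    println!("payload : {} bytes", size_of::<VmException>());
    println!("unboxed : {} bytes", size_of::<UnboxedError>());
    println!("boxed   : {} bytes", size_of::<BoxedError>());

    // Same kind of guard the PR describes: fail if the error type
    // grows past 32 bytes.
    assert!(size_of::<BoxedError>() <= 32);
}
```

A size assertion like the one in `main` can live in a unit test, so that a future variant with a large inline payload fails CI instead of silently slowing down every function that returns the error type.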