Closed fmoletta closed 5 months ago
**Hyper Threading Benchmark results**
hyperfine -r 2 -n "hyper_threading_main threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_main' -n "hyper_threading_pr threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 1
Time (mean ± σ): 30.347 s ± 0.165 s [User: 29.631 s, System: 0.714 s]
Range (min … max): 30.230 s … 30.463 s 2 runs
Benchmark 2: hyper_threading_pr threads: 1
Time (mean ± σ): 30.497 s ± 0.063 s [User: 29.734 s, System: 0.762 s]
Range (min … max): 30.452 s … 30.542 s 2 runs
Summary
'hyper_threading_main threads: 1' ran
1.00 ± 0.01 times faster than 'hyper_threading_pr threads: 1'
hyperfine -r 2 -n "hyper_threading_main threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_main' -n "hyper_threading_pr threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 2
Time (mean ± σ): 16.224 s ± 0.037 s [User: 29.951 s, System: 0.718 s]
Range (min … max): 16.198 s … 16.250 s 2 runs
Benchmark 2: hyper_threading_pr threads: 2
Time (mean ± σ): 16.249 s ± 0.011 s [User: 30.011 s, System: 0.712 s]
Range (min … max): 16.241 s … 16.257 s 2 runs
Summary
'hyper_threading_main threads: 2' ran
1.00 ± 0.00 times faster than 'hyper_threading_pr threads: 2'
hyperfine -r 2 -n "hyper_threading_main threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_main' -n "hyper_threading_pr threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 4
Time (mean ± σ): 12.047 s ± 0.010 s [User: 42.072 s, System: 0.941 s]
Range (min … max): 12.040 s … 12.054 s 2 runs
Benchmark 2: hyper_threading_pr threads: 4
Time (mean ± σ): 11.569 s ± 0.391 s [User: 42.293 s, System: 0.916 s]
Range (min … max): 11.292 s … 11.845 s 2 runs
Summary
'hyper_threading_pr threads: 4' ran
1.04 ± 0.04 times faster than 'hyper_threading_main threads: 4'
hyperfine -r 2 -n "hyper_threading_main threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_main' -n "hyper_threading_pr threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 6
Time (mean ± σ): 11.634 s ± 0.156 s [User: 41.785 s, System: 0.933 s]
Range (min … max): 11.523 s … 11.744 s 2 runs
Benchmark 2: hyper_threading_pr threads: 6
Time (mean ± σ): 11.702 s ± 0.050 s [User: 41.669 s, System: 1.012 s]
Range (min … max): 11.667 s … 11.738 s 2 runs
Summary
'hyper_threading_main threads: 6' ran
1.01 ± 0.01 times faster than 'hyper_threading_pr threads: 6'
hyperfine -r 2 -n "hyper_threading_main threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_main' -n "hyper_threading_pr threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 8
Time (mean ± σ): 11.277 s ± 0.151 s [User: 42.107 s, System: 0.974 s]
Range (min … max): 11.171 s … 11.384 s 2 runs
Benchmark 2: hyper_threading_pr threads: 8
Time (mean ± σ): 11.402 s ± 0.135 s [User: 42.155 s, System: 0.984 s]
Range (min … max): 11.306 s … 11.497 s 2 runs
Summary
'hyper_threading_main threads: 8' ran
1.01 ± 0.02 times faster than 'hyper_threading_pr threads: 8'
hyperfine -r 2 -n "hyper_threading_main threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_main' -n "hyper_threading_pr threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 16
Time (mean ± σ): 11.349 s ± 0.294 s [User: 42.798 s, System: 0.989 s]
Range (min … max): 11.141 s … 11.558 s 2 runs
Benchmark 2: hyper_threading_pr threads: 16
Time (mean ± σ): 11.334 s ± 0.178 s [User: 42.446 s, System: 1.075 s]
Range (min … max): 11.208 s … 11.460 s 2 runs
Summary
'hyper_threading_pr threads: 16' ran
1.00 ± 0.03 times faster than 'hyper_threading_main threads: 16'
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 94.80%. Comparing base (0df3f34) to head (3c1ad7c).
Benchmark Results for unmodified programs :rocket:
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base big_factorial | 2.371 ± 0.021 | 2.355 | 2.425 | 1.00 ± 0.01 |
head big_factorial | 2.365 ± 0.015 | 2.349 | 2.390 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base big_fibonacci | 2.358 ± 0.013 | 2.341 | 2.389 | 1.02 ± 0.01 |
head big_fibonacci | 2.320 ± 0.014 | 2.296 | 2.341 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base blake2s_integration_benchmark | 8.752 ± 0.111 | 8.590 | 8.868 | 1.00 ± 0.02 |
head blake2s_integration_benchmark | 8.737 ± 0.119 | 8.571 | 8.977 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base compare_arrays_200000 | 2.422 ± 0.029 | 2.388 | 2.478 | 1.01 ± 0.02 |
head compare_arrays_200000 | 2.404 ± 0.022 | 2.368 | 2.432 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base dict_integration_benchmark | 1.556 ± 0.003 | 1.549 | 1.560 | 1.01 ± 0.01 |
head dict_integration_benchmark | 1.544 ± 0.016 | 1.528 | 1.583 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base field_arithmetic_get_square_benchmark | 1.432 ± 0.012 | 1.417 | 1.448 | 1.00 |
head field_arithmetic_get_square_benchmark | 1.444 ± 0.032 | 1.413 | 1.505 | 1.01 ± 0.02 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base integration_builtins | 8.683 ± 0.106 | 8.562 | 8.854 | 1.00 ± 0.02 |
head integration_builtins | 8.657 ± 0.089 | 8.552 | 8.796 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base keccak_integration_benchmark | 8.941 ± 0.119 | 8.806 | 9.070 | 1.00 ± 0.02 |
head keccak_integration_benchmark | 8.909 ± 0.096 | 8.779 | 9.038 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base linear_search | 2.437 ± 0.021 | 2.409 | 2.462 | 1.01 ± 0.01 |
head linear_search | 2.417 ± 0.007 | 2.408 | 2.429 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base math_cmp_and_pow_integration_benchmark | 1.914 ± 0.021 | 1.892 | 1.967 | 1.00 ± 0.01 |
head math_cmp_and_pow_integration_benchmark | 1.906 ± 0.014 | 1.891 | 1.936 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base math_integration_benchmark | 1.712 ± 0.009 | 1.703 | 1.727 | 1.01 ± 0.01 |
head math_integration_benchmark | 1.700 ± 0.009 | 1.685 | 1.715 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base memory_integration_benchmark | 1.340 ± 0.007 | 1.329 | 1.352 | 1.00 ± 0.01 |
head memory_integration_benchmark | 1.333 ± 0.007 | 1.325 | 1.346 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base operations_with_data_structures_benchmarks | 1.998 ± 0.012 | 1.982 | 2.024 | 1.01 ± 0.01 |
head operations_with_data_structures_benchmarks | 1.975 ± 0.008 | 1.963 | 1.985 | 1.00 |
Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
---|---|---|---|---|
base pedersen | 562.2 ± 2.4 | 557.4 | 567.1 | 1.00 ± 0.01 |
head pedersen | 562.1 ± 2.9 | 559.4 | 568.3 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base poseidon_integration_benchmark | 1.014 ± 0.023 | 1.005 | 1.079 | 1.01 ± 0.02 |
head poseidon_integration_benchmark | 1.003 ± 0.005 | 0.994 | 1.010 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base secp_integration_benchmark | 2.001 ± 0.020 | 1.980 | 2.044 | 1.00 ± 0.02 |
head secp_integration_benchmark | 1.997 ± 0.023 | 1.976 | 2.055 | 1.00 |
Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
---|---|---|---|---|
base set_integration_benchmark | 751.2 ± 4.9 | 746.7 | 760.0 | 1.01 ± 0.01 |
head set_integration_benchmark | 745.7 ± 2.8 | 741.6 | 749.4 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base uint256_integration_benchmark | 4.844 ± 0.069 | 4.780 | 4.946 | 1.00 |
head uint256_integration_benchmark | 4.847 ± 0.039 | 4.775 | 4.902 | 1.00 ± 0.02 |
After PR #1686 the return values are fetched from the output segment instead of the execution segment when either the `append_return_values` or `proof_mode` flag is enabled. This makes our `check_append_ret_values_to_output_segment` less useful, as it no longer checks that the correct values are being copied to the output segment. One way to fix it would be to compare the values from the execution segment to the ones in the output segment instead, but after PR #1721, when the segment arena is used, the execution segment contains the values produced by the arena validation where the return values used to be found, so we can no longer use it for comparison.

As a result of these two changes, the best way to test that the correct values are copied to the output segment is to (1) test that the output segment indeed has the return values outputted and no more values after them (current behaviour), and (2) test that the return values outputted when running with `--append_return_values` match the ones outputted by a normal run. The second check can be accomplished by adding a third case with `--append_return_values` enabled to our integration tests.

This PR adds this third test case and also refactors the integration tests into one test with multiple cases & values, so adding new checks and argument combinations is easier.
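
The two checks described above can be sketched roughly as follows. This is only an illustration, not the code added by this PR: `RunResult`, `run_case`, `output_ends_with_return_values` and `matches_normal_run` are hypothetical names, and `run_case` is a stand-in with hard-coded values so the snippet compiles on its own instead of calling the real runner.

```rust
/// Hypothetical summary of one run; the field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct RunResult {
    return_values: Vec<u64>,
    output_segment: Vec<u64>,
}

/// Stand-in for the real runner (assumption): returns fixed values so the
/// sketch is self-contained. A real test would execute the program instead.
fn run_case(append_return_values: bool) -> RunResult {
    let return_values = vec![3, 5, 8];
    // Value the program itself wrote to the output segment.
    let mut output_segment = vec![42];
    if append_return_values {
        // With the flag enabled, the return values are copied to the end.
        output_segment.extend(&return_values);
    }
    RunResult { return_values, output_segment }
}

/// Check 1: the output segment ends with the return values,
/// with no further values after them.
fn output_ends_with_return_values(result: &RunResult) -> bool {
    result.output_segment.ends_with(&result.return_values)
}

/// Check 2: a run with the flag enabled reports the same return values
/// as a normal run.
fn matches_normal_run(normal: &RunResult, appended: &RunResult) -> bool {
    normal.return_values == appended.return_values
}
```

Note that check 1 is trivially true when a program has no return values, so it is most useful together with check 2.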