Closed fmoletta closed 2 months ago
**Hyper Thereading Benchmark results**
hyperfine -r 2 -n "hyper_threading_main threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_main' -n "hyper_threading_pr threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 1
Time (mean ± σ): 23.767 s ± 0.001 s [User: 23.054 s, System: 0.712 s]
Range (min … max): 23.766 s … 23.768 s 2 runs
Benchmark 2: hyper_threading_pr threads: 1
Time (mean ± σ): 23.613 s ± 0.023 s [User: 22.903 s, System: 0.709 s]
Range (min … max): 23.597 s … 23.629 s 2 runs
Summary
'hyper_threading_pr threads: 1' ran
1.01 ± 0.00 times faster than 'hyper_threading_main threads: 1'
hyperfine -r 2 -n "hyper_threading_main threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_main' -n "hyper_threading_pr threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 2
Time (mean ± σ): 12.396 s ± 0.005 s [User: 23.600 s, System: 0.652 s]
Range (min … max): 12.393 s … 12.400 s 2 runs
Benchmark 2: hyper_threading_pr threads: 2
Time (mean ± σ): 12.286 s ± 0.022 s [User: 23.188 s, System: 0.650 s]
Range (min … max): 12.270 s … 12.302 s 2 runs
Summary
'hyper_threading_pr threads: 2' ran
1.01 ± 0.00 times faster than 'hyper_threading_main threads: 2'
hyperfine -r 2 -n "hyper_threading_main threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_main' -n "hyper_threading_pr threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 4
Time (mean ± σ): 10.099 s ± 0.279 s [User: 34.968 s, System: 0.909 s]
Range (min … max): 9.902 s … 10.297 s 2 runs
Benchmark 2: hyper_threading_pr threads: 4
Time (mean ± σ): 9.691 s ± 0.168 s [User: 35.073 s, System: 0.910 s]
Range (min … max): 9.572 s … 9.809 s 2 runs
Summary
'hyper_threading_pr threads: 4' ran
1.04 ± 0.03 times faster than 'hyper_threading_main threads: 4'
hyperfine -r 2 -n "hyper_threading_main threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_main' -n "hyper_threading_pr threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 6
Time (mean ± σ): 9.725 s ± 0.305 s [User: 34.944 s, System: 0.945 s]
Range (min … max): 9.509 s … 9.940 s 2 runs
Benchmark 2: hyper_threading_pr threads: 6
Time (mean ± σ): 9.847 s ± 0.108 s [User: 35.200 s, System: 0.984 s]
Range (min … max): 9.771 s … 9.924 s 2 runs
Summary
'hyper_threading_main threads: 6' ran
1.01 ± 0.03 times faster than 'hyper_threading_pr threads: 6'
hyperfine -r 2 -n "hyper_threading_main threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_main' -n "hyper_threading_pr threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 8
Time (mean ± σ): 9.492 s ± 0.129 s [User: 35.301 s, System: 0.924 s]
Range (min … max): 9.400 s … 9.583 s 2 runs
Benchmark 2: hyper_threading_pr threads: 8
Time (mean ± σ): 9.792 s ± 0.216 s [User: 35.314 s, System: 0.933 s]
Range (min … max): 9.639 s … 9.944 s 2 runs
Summary
'hyper_threading_main threads: 8' ran
1.03 ± 0.03 times faster than 'hyper_threading_pr threads: 8'
hyperfine -r 2 -n "hyper_threading_main threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_main' -n "hyper_threading_pr threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 16
Time (mean ± σ): 9.661 s ± 0.189 s [User: 35.211 s, System: 1.055 s]
Range (min … max): 9.527 s … 9.795 s 2 runs
Benchmark 2: hyper_threading_pr threads: 16
Time (mean ± σ): 9.760 s ± 0.055 s [User: 35.719 s, System: 1.048 s]
Range (min … max): 9.721 s … 9.799 s 2 runs
Summary
'hyper_threading_main threads: 16' ran
1.01 ± 0.02 times faster than 'hyper_threading_pr threads: 16'
Benchmark Results for unmodified programs :rocket:
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base big_factorial |
1.921 ± 0.023 | 1.895 | 1.957 | 1.02 ± 0.01 |
head big_factorial |
1.882 ± 0.011 | 1.868 | 1.895 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base big_fibonacci |
1.841 ± 0.017 | 1.820 | 1.869 | 1.00 |
head big_fibonacci |
1.843 ± 0.043 | 1.822 | 1.963 | 1.00 ± 0.03 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base blake2s_integration_benchmark |
6.391 ± 0.085 | 6.256 | 6.545 | 1.00 |
head blake2s_integration_benchmark |
6.411 ± 0.074 | 6.332 | 6.569 | 1.00 ± 0.02 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base compare_arrays_200000 |
1.919 ± 0.021 | 1.895 | 1.946 | 1.00 ± 0.02 |
head compare_arrays_200000 |
1.918 ± 0.034 | 1.897 | 2.006 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base dict_integration_benchmark |
1.284 ± 0.009 | 1.270 | 1.295 | 1.00 |
head dict_integration_benchmark |
1.289 ± 0.024 | 1.274 | 1.356 | 1.00 ± 0.02 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base field_arithmetic_get_square_benchmark |
1.158 ± 0.020 | 1.147 | 1.209 | 1.00 |
head field_arithmetic_get_square_benchmark |
1.164 ± 0.019 | 1.148 | 1.213 | 1.01 ± 0.02 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base integration_builtins |
6.491 ± 0.084 | 6.384 | 6.627 | 1.00 |
head integration_builtins |
6.502 ± 0.143 | 6.426 | 6.894 | 1.00 ± 0.03 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base keccak_integration_benchmark |
6.557 ± 0.061 | 6.501 | 6.669 | 1.00 |
head keccak_integration_benchmark |
6.582 ± 0.033 | 6.551 | 6.643 | 1.00 ± 0.01 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base linear_search |
1.953 ± 0.028 | 1.910 | 1.996 | 1.02 ± 0.02 |
head linear_search |
1.908 ± 0.009 | 1.898 | 1.919 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base math_cmp_and_pow_integration_benchmark |
1.676 ± 0.042 | 1.650 | 1.790 | 1.01 ± 0.03 |
head math_cmp_and_pow_integration_benchmark |
1.660 ± 0.004 | 1.654 | 1.667 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base math_integration_benchmark |
1.459 ± 0.018 | 1.448 | 1.507 | 1.00 ± 0.02 |
head math_integration_benchmark |
1.454 ± 0.021 | 1.441 | 1.511 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base memory_integration_benchmark |
1.117 ± 0.004 | 1.112 | 1.123 | 1.00 |
head memory_integration_benchmark |
1.140 ± 0.041 | 1.118 | 1.247 | 1.02 ± 0.04 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base operations_with_data_structures_benchmarks |
1.751 ± 0.017 | 1.727 | 1.778 | 1.01 ± 0.01 |
head operations_with_data_structures_benchmarks |
1.740 ± 0.013 | 1.728 | 1.769 | 1.00 |
Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
---|---|---|---|---|
base pedersen |
497.7 ± 2.9 | 494.2 | 502.6 | 1.00 |
head pedersen |
502.3 ± 2.3 | 500.5 | 507.9 | 1.01 ± 0.01 |
Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
---|---|---|---|---|
base poseidon_integration_benchmark |
936.2 ± 4.2 | 933.1 | 946.8 | 1.01 ± 0.01 |
head poseidon_integration_benchmark |
930.3 ± 6.9 | 925.9 | 949.6 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base secp_integration_benchmark |
1.728 ± 0.008 | 1.721 | 1.744 | 1.00 |
head secp_integration_benchmark |
1.739 ± 0.023 | 1.724 | 1.804 | 1.01 ± 0.01 |
Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
---|---|---|---|---|
base set_integration_benchmark |
680.8 ± 2.9 | 677.2 | 685.1 | 1.02 ± 0.01 |
head set_integration_benchmark |
669.1 ± 6.2 | 662.8 | 680.1 | 1.00 |
Command | Mean [s] | Min [s] | Max [s] | Relative |
---|---|---|---|---|
base uint256_integration_benchmark |
3.767 ± 0.101 | 3.672 | 4.039 | 1.01 ± 0.03 |
head uint256_integration_benchmark |
3.730 ± 0.063 | 3.688 | 3.884 | 1.00 |
Attention: Patch coverage is 97.60479%
with 4 lines
in your changes are missing coverage. Please review.
Project coverage is 94.89%. Comparing base (
1a46c9f
) to head (2fc0ba2
).
Files | Patch % | Lines |
---|---|---|
vm/src/vm/runners/cairo_pie.rs | 97.60% | 4 Missing :warning: |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
There seems to be a significant performance regression.
There seems to be a significant performance regression.
This is quite strange, the benchmarks don't produce or consume Cairo PIEs so they shouldn't be affected by this PR
This is quite strange, the benchmarks don't produce or consume Cairo PIEs so they shouldn't be affected by this PR
Whatever it was it seems to have been reversed by new commits. Maybe it was throttling in the worker.
Implements custom deserialization for structs using custom serialization. Implement method to deserialize a CairoPie from a zip file Move custom serialization functions into sub-modules along with their added deserialization counterparts to make use of the
serde(with)
attribute and make code cleaner