Open matthewkeil opened 1 month ago
Ran perf tests on several machines. In all instances i stopped the major services checked with top
that nothing substantial was running before starting the tests
ovh-***-***-ksm-0
(x86) hasher
as-sha256
✓ hash 2 Uint8Array 500000 times - as-sha256 2.356940 ops/s 424.2789 ms/op - 3 runs 17.0 s
✓ hashTwoObjects 500000 times - as-sha256 2.534535 ops/s 394.5497 ms/op - 3 runs 15.8 s
✓ executeHashComputations - as-sha256 24.87851 ops/s 40.19533 ms/op - 23 runs 3.32 s
noble
✓ hash 2 Uint8Array 500000 times - noble 1.146827 ops/s 871.9712 ms/op - 3 runs 34.9 s
✓ hashTwoObjects 500000 times - noble 0.7592710 ops/s 1.317053 s/op - 3 runs 52.7 s
✓ executeHashComputations - noble 19.65518 ops/s 50.87717 ms/op - 5 runs 1.61 s
hashtree
✓ hash 2 Uint8Array 500000 times - hashtree 2.456077 ops/s 407.1533 ms/op - 3 runs 16.3 s
✓ hashTwoObjects 500000 times - hashtree 2.523110 ops/s 396.3362 ms/op - 3 runs 16.1 s
✓ executeHashComputations - hashtree 67.66811 ops/s 14.77801 ms/op - 21 runs 3.62 s
matthewkeil-ax41x
(x86) hasher
as-sha256
✓ hash 2 Uint8Array 500000 times - as-sha256 1.996066 ops/s 500.9855 ms/op - 3 runs 20.0 s
✓ hashTwoObjects 500000 times - as-sha256 2.187445 ops/s 457.1543 ms/op - 3 runs 18.3 s
✓ executeHashComputations - as-sha256 19.78581 ops/s 50.54128 ms/op - 16 runs 2.90 s
noble
✓ hash 2 Uint8Array 500000 times - noble 0.8846945 ops/s 1.130334 s/op - 3 runs 45.2 s
✓ hashTwoObjects 500000 times - noble 0.5145042 ops/s 1.943619 s/op - 3 runs 77.8 s
✓ executeHashComputations - noble 16.45166 ops/s 60.78414 ms/op - 20 runs 4.19 s
hashtree
✓ hash 2 Uint8Array 500000 times - hashtree 3.276744 ops/s 305.1810 ms/op - 3 runs 12.2 s
✓ hashTwoObjects 500000 times - hashtree 3.924958 ops/s 254.7798 ms/op - 3 runs 10.2 s
✓ executeHashComputations - hashtree 92.09554 ops/s 10.85829 ms/op - 62 runs 9.26 s
***-novc-***-cax11
(arm64) hasher
as-sha256
✓ hash 2 Uint8Array 500000 times - as-sha256 1.626183 ops/s 614.9369 ms/op - 3 runs 24.6 s
✓ hashTwoObjects 500000 times - as-sha256 1.709126 ops/s 585.0945 ms/op - 3 runs 23.4 s
✓ executeHashComputations - as-sha256 13.50287 ops/s 74.05835 ms/op - 9 runs 2.54 s
noble
✓ hash 2 Uint8Array 500000 times - noble 0.7180529 ops/s 1.392655 s/op - 3 runs 55.6 s
✓ hashTwoObjects 500000 times - noble 0.3339791 ops/s 2.994199 s/op - 3 runs 120 s
✓ executeHashComputations - noble 13.06625 ops/s 76.53303 ms/op - 9 runs 2.67 s
hashtree
✓ hash 2 Uint8Array 500000 times - hashtree 2.472214 ops/s 404.4957 ms/op - 3 runs 16.2 s
✓ hashTwoObjects 500000 times - hashtree 2.888047 ops/s 346.2547 ms/op - 3 runs 13.9 s
✓ executeHashComputations - hashtree 32.63111 ops/s 30.64560 ms/op - 27 runs 5.51 s
hasher
as-sha256
✓ hash 2 Uint8Array 500000 times - as-sha256 2.710482 ops/s 368.9380 ms/op - 3 runs 14.8 s
✓ hashTwoObjects 500000 times - as-sha256 2.882722 ops/s 346.8944 ms/op - 3 runs 13.9 s
✓ executeHashComputations - as-sha256 31.00534 ops/s 32.25251 ms/op - 7 runs 1.36 s
noble
✓ hash 2 Uint8Array 500000 times - noble 1.384059 ops/s 722.5124 ms/op - 3 runs 28.9 s
✓ hashTwoObjects 500000 times - noble 0.8926629 ops/s 1.120244 s/op - 3 runs 45.0 s
✓ executeHashComputations - noble 24.98550 ops/s 40.02322 ms/op - 36 runs 3.75 s
hashtree
✓ hash 2 Uint8Array 500000 times - hashtree 4.368933 ops/s 228.8888 ms/op - 4 runs 11.5 s
✓ hashTwoObjects 500000 times - hashtree 4.910207 ops/s 203.6574 ms/op - 4 runs 10.2 s
✓ executeHashComputations - hashtree 148.0720 ops/s 6.753470 ms/op - 96 runs 6.95 s
✔️ no performance regression detected
🚀🚀 Significant benchmark improvement detected
Benchmark suite | Current: 114ad0c603d9881e8b20aa792dd9fdf2c31b9d54 | Previous: ec123ec3cfcc37ff82635da7a57ad9c74cc9accb | Ratio |
---|---|---|---|
List(Validator-NS) 100000 struct -> binary | 7.4186 ms/op | 26.835 ms/op | 0.28 |
by benchmarkbot/action
Motivation
Created on behalf of @twoeths. Keeping in draft for now but using to store some metrics from different iterations of the changes.
Update system to allow for batch hashing of trees. SIMD instruction set allows for parallel processing of hashes when running on a single core. Add the idea of levels to the hashing process to allow deeper sections of a tree to be processed before higher levels and then computes the hashes according to depth of the tree.
Description
New
hashtree
hasher for batch hashMinimal memory allocation:
For
type.hashTreeRoot()
:type.hashTreeRoot()
almost does not need to allocate any temporary Uint8ArraysmerkleizeInto()
in hasher, this also does not allocate memory, and it uses batch hashFor ViewDU:
HashComputation = {left: Node; right: Node; dest: Node}
For BeaconState, it contains validators as ContainerNodeStructViewDUs
Benchmark result on Mac M1
3x faster on a
BeaconBlock.hashTreeRoot()
call, data comes from a typical mainnet block with 200 transactions7x faster on a BeaconState of 200k validators, 100k are modified
TODOs
TODO - batch
as-sha256
,persistent-merkle-tree
andssz
part of #355 closes #78