ChainSafe / ssz

Typescript implementation of Simple Serialize (SSZ)
https://simpleserialize.com/

fix: improve digest64 for as-sha256 #347

Closed twoeths closed 4 months ago

twoeths commented 4 months ago

Motivation

Improve digest64

Description

Before:

digestTwoHashObjects vs digest64 vs digest
    ✓ digestTwoHashObjects 50023 times                                    27.71683 ops/s    36.07916 ms/op   x1.042        264 runs   10.1 s
    ✓ digest64 50023 times                                                26.85761 ops/s    37.23339 ms/op   x1.020        256 runs   10.1 s
    ✓ digest 50023 times                                                  27.02430 ops/s    37.00373 ms/op   x1.010        259 runs   10.1 s

After:

digestTwoHashObjects vs digest64 vs digest
    ✓ digestTwoHashObjects 50023 times                                    30.54306 ops/s    32.74066 ms/op   x0.946        293 runs   10.1 s
    ✓ digest64 50023 times                                                28.28273 ops/s    35.35727 ms/op   x0.968        271 runs   10.1 s
    ✓ digest 50023 times                                                  27.93469 ops/s    35.79778 ms/op   x0.977        268 runs   10.1 s

digestTwoHashObjects is ~10% faster and digest64 is ~5% faster
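
For reference, the percentages above can be reproduced from the ms/op columns of the two local runs. This is only a small sketch: `percentFaster` is a hypothetical helper name, not something from the repo, and it measures the improvement as the reduction in time per operation.

```typescript
// Hypothetical helper: percentage reduction in time per operation.
function percentFaster(beforeMsPerOp: number, afterMsPerOp: number): number {
  return ((beforeMsPerOp - afterMsPerOp) / beforeMsPerOp) * 100;
}

// ms/op values copied from the "Before" and "After" runs above.
const improvement = {
  digestTwoHashObjects: percentFaster(36.07916, 32.74066), // roughly 9.3%
  digest64: percentFaster(37.23339, 35.35727), // roughly 5.0%
  digest: percentFaster(37.00373, 35.79778), // roughly 3.3%
};

for (const [name, pct] of Object.entries(improvement)) {
  console.log(`${name}: ${pct.toFixed(1)}% less time per op`);
}
```

Measured on ops/s instead, digestTwoHashObjects comes out at about 10.2% (27.71683 to 30.54306), which matches the "~10%" quoted above.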

github-actions[bot] commented 4 months ago

Performance Report

✔️ no performance regression detected

Full benchmark results

| Benchmark suite | Current: c98ffc3b09b0f3517e9749e43c3721161e8b1a9c | Previous: 6220d320b004ea80bd30925487ac0f3299295528 | Ratio |
|-|-|-|-|
| digestTwoHashObjects 50023 times | 50.422 ms/op | 47.451 ms/op | 1.06 |
| digest64 50023 times | 52.434 ms/op | 48.513 ms/op | 1.08 |
| digest 50023 times | 56.875 ms/op | 49.148 ms/op | 1.16 |
| input length 32 | 1.4860 us/op | 1.1400 us/op | 1.30 |
| input length 64 | 1.6190 us/op | 1.2700 us/op | 1.27 |
| input length 128 | 2.6960 us/op | 2.2140 us/op | 1.22 |
| input length 256 | 3.9120 us/op | 3.3030 us/op | 1.18 |
| input length 512 | 6.3690 us/op | 5.4920 us/op | 1.16 |
| input length 1024 | 12.181 us/op | 10.835 us/op | 1.12 |
| digest 1000000 times | 915.35 ms/op | 777.08 ms/op | 1.18 |
| hashObjectToByteArray 50023 times | 1.4298 ms/op | 1.4645 ms/op | 0.98 |
| byteArrayToHashObject 50023 times | 3.5234 ms/op | 1.6627 ms/op | 2.12 |
| getGindicesAtDepth | 4.9400 us/op | 3.9520 us/op | 1.25 |
| iterateAtDepth | 10.341 us/op | 8.4810 us/op | 1.22 |
| getGindexBits | 550.00 ns/op | 423.00 ns/op | 1.30 |
| gindexIterator | 1.2260 us/op | 952.00 ns/op | 1.29 |
| hash 2 Uint8Array 2250026 times - as-sha256 | 2.4520 s/op | 2.2238 s/op | 1.10 |
| hashTwoObjects 2250026 times - as-sha256 | 2.3468 s/op | 2.1564 s/op | 1.09 |
| hash 2 Uint8Array 2250026 times - noble | 5.7821 s/op | 4.6188 s/op | 1.25 |
| hashTwoObjects 2250026 times - noble | 6.7199 s/op | 6.7975 s/op | 0.99 |
| getNodeH() x7812.5 avg hindex | 15.143 us/op | 14.548 us/op | 1.04 |
| getNodeH() x7812.5 index 0 | 5.1000 us/op | 5.1490 us/op | 0.99 |
| getNodeH() x7812.5 index 7 | 5.0850 us/op | 5.1510 us/op | 0.99 |
| getNodeH() x7812.5 index 7 with key array | 5.1250 us/op | 5.0550 us/op | 1.01 |
| new LeafNode() x7812.5 | 185.96 us/op | 113.37 us/op | 1.64 |
| multiproof - depth 15, 1 requested leaves | 10.461 us/op | 9.4680 us/op | 1.10 |
| tree offset multiproof - depth 15, 1 requested leaves | 21.004 us/op | 20.621 us/op | 1.02 |
| compact multiproof - depth 15, 1 requested leaves | 5.4010 us/op | 5.5070 us/op | 0.98 |
| multiproof - depth 15, 2 requested leaves | 13.529 us/op | 12.922 us/op | 1.05 |
| tree offset multiproof - depth 15, 2 requested leaves | 24.425 us/op | 23.147 us/op | 1.06 |
| compact multiproof - depth 15, 2 requested leaves | 3.3670 us/op | 3.3580 us/op | 1.00 |
| multiproof - depth 15, 3 requested leaves | 18.887 us/op | 17.866 us/op | 1.06 |
| tree offset multiproof - depth 15, 3 requested leaves | 32.142 us/op | 30.400 us/op | 1.06 |
| compact multiproof - depth 15, 3 requested leaves | 5.5470 us/op | 4.6320 us/op | 1.20 |
| multiproof - depth 15, 4 requested leaves | 23.653 us/op | 23.828 us/op | 0.99 |
| tree offset multiproof - depth 15, 4 requested leaves | 36.790 us/op | 38.089 us/op | 0.97 |
| compact multiproof - depth 15, 4 requested leaves | 5.3340 us/op | 5.2770 us/op | 1.01 |
| packedRootsBytesToLeafNodes bytes 4000 offset 0 | 1.9700 us/op | 1.9990 us/op | 0.99 |
| packedRootsBytesToLeafNodes bytes 4000 offset 1 | 1.9630 us/op | 2.0010 us/op | 0.98 |
| packedRootsBytesToLeafNodes bytes 4000 offset 2 | 1.9750 us/op | 2.0090 us/op | 0.98 |
| packedRootsBytesToLeafNodes bytes 4000 offset 3 | 1.9600 us/op | 1.9920 us/op | 0.98 |
| subtreeFillToContents depth 40 count 250000 | 44.369 ms/op | 43.686 ms/op | 1.02 |
| setRoot - gindexBitstring | 7.7733 ms/op | 8.7390 ms/op | 0.89 |
| setRoot - gindex | 7.9447 ms/op | 9.4315 ms/op | 0.84 |
| getRoot - gindexBitstring | 2.4620 ms/op | 2.4095 ms/op | 1.02 |
| getRoot - gindex | 3.0494 ms/op | 3.2480 ms/op | 0.94 |
| getHashObject then setHashObject | 8.9442 ms/op | 10.367 ms/op | 0.86 |
| setNodeWithFn | 7.7888 ms/op | 9.1894 ms/op | 0.85 |
| getNodeAtDepth depth 0 x100000 | 1.1449 ms/op | 1.1471 ms/op | 1.00 |
| setNodeAtDepth depth 0 x100000 | 2.3224 ms/op | 2.7974 ms/op | 0.83 |
| getNodesAtDepth depth 0 x100000 | 1.0851 ms/op | 1.0834 ms/op | 1.00 |
| setNodesAtDepth depth 0 x100000 | 1.4850 ms/op | 1.4935 ms/op | 0.99 |
| getNodeAtDepth depth 1 x100000 | 1.2058 ms/op | 1.2074 ms/op | 1.00 |
| setNodeAtDepth depth 1 x100000 | 5.0174 ms/op | 5.7980 ms/op | 0.87 |
| getNodesAtDepth depth 1 x100000 | 1.2103 ms/op | 1.2088 ms/op | 1.00 |
| setNodesAtDepth depth 1 x100000 | 4.2492 ms/op | 4.6802 ms/op | 0.91 |
| getNodeAtDepth depth 2 x100000 | 1.4841 ms/op | 1.4852 ms/op | 1.00 |
| setNodeAtDepth depth 2 x100000 | 8.7215 ms/op | 10.474 ms/op | 0.83 |
| getNodesAtDepth depth 2 x100000 | 17.382 ms/op | 20.510 ms/op | 0.85 |
| setNodesAtDepth depth 2 x100000 | 12.650 ms/op | 13.700 ms/op | 0.92 |
| tree.getNodesAtDepth - gindexes | 5.2689 ms/op | 6.0009 ms/op | 0.88 |
| tree.getNodesAtDepth - push all nodes | 1.7729 ms/op | 2.2066 ms/op | 0.80 |
| tree.getNodesAtDepth - navigation | 155.36 us/op | 157.91 us/op | 0.98 |
| tree.setNodesAtDepth - indexes | 301.87 us/op | 374.22 us/op | 0.81 |
| set at depth 8 | 470.00 ns/op | 514.00 ns/op | 0.91 |
| set at depth 16 | 611.00 ns/op | 664.00 ns/op | 0.92 |
| set at depth 32 | 929.00 ns/op | 1.0370 us/op | 0.90 |
| iterateNodesAtDepth 8 256 | 13.531 us/op | 13.906 us/op | 0.97 |
| getNodesAtDepth 8 256 | 3.3180 us/op | 3.3930 us/op | 0.98 |
| iterateNodesAtDepth 16 65536 | 4.1928 ms/op | 4.2306 ms/op | 0.99 |
| getNodesAtDepth 16 65536 | 1.5660 ms/op | 2.0146 ms/op | 0.78 |
| iterateNodesAtDepth 32 250000 | 15.631 ms/op | 16.975 ms/op | 0.92 |
| getNodesAtDepth 32 250000 | 4.2045 ms/op | 4.2259 ms/op | 0.99 |
| iterateNodesAtDepth 40 250000 | 15.293 ms/op | 14.758 ms/op | 1.04 |
| getNodesAtDepth 40 250000 | 4.2270 ms/op | 4.2287 ms/op | 1.00 |
| 250k validators | 6.8209 s/op | 7.0207 s/op | 0.97 |
| bitlist bytes to struct (120,90) | 603.00 ns/op | 576.00 ns/op | 1.05 |
| bitlist bytes to tree (120,90) | 2.3540 us/op | 2.2300 us/op | 1.06 |
| bitlist bytes to struct (2048,2048) | 1.0210 us/op | 998.00 ns/op | 1.02 |
| bitlist bytes to tree (2048,2048) | 3.6340 us/op | 3.4800 us/op | 1.04 |
| ByteListType - deserialize | 8.1662 ms/op | 8.1587 ms/op | 1.00 |
| BasicListType - deserialize | 7.6215 ms/op | 7.8204 ms/op | 0.97 |
| ByteListType - serialize | 8.2848 ms/op | 7.4434 ms/op | 1.11 |
| BasicListType - serialize | 9.8041 ms/op | 9.9380 ms/op | 0.99 |
| BasicListType - tree_convertToStruct | 21.497 ms/op | 21.339 ms/op | 1.01 |
| List[uint8, 68719476736] len 300000 ViewDU.getAll() + iterate | 4.0829 ms/op | 4.2130 ms/op | 0.97 |
| List[uint8, 68719476736] len 300000 ViewDU.get(i) | 4.1790 ms/op | 4.3100 ms/op | 0.97 |
| Array.push len 300000 empty Array - number | 6.0675 ms/op | 6.2098 ms/op | 0.98 |
| Array.set len 300000 from new Array - number | 1.6174 ms/op | 1.6359 ms/op | 0.99 |
| Array.set len 300000 - number | 5.1398 ms/op | 5.0942 ms/op | 1.01 |
| Uint8Array.set len 300000 | 203.55 us/op | 208.02 us/op | 0.98 |
| Uint32Array.set len 300000 | 273.95 us/op | 291.01 us/op | 0.94 |
| Container({a: uint8, b: uint8}) getViewDU x300000 | 19.551 ms/op | 19.737 ms/op | 0.99 |
| ContainerNodeStruct({a: uint8, b: uint8}) getViewDU x300000 | 9.3055 ms/op | 9.3814 ms/op | 0.99 |
| List(Container) len 300000 ViewDU.getAllReadonly() + iterate | 198.82 ms/op | 246.18 ms/op | 0.81 |
| List(Container) len 300000 ViewDU.getAllReadonlyValues() + iterate | 280.86 ms/op | 293.67 ms/op | 0.96 |
| List(Container) len 300000 ViewDU.get(i) | 6.3686 ms/op | 6.6744 ms/op | 0.95 |
| List(Container) len 300000 ViewDU.getReadonly(i) | 6.1368 ms/op | 6.5862 ms/op | 0.93 |
| List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonly() + iterate | 36.704 ms/op | 38.008 ms/op | 0.97 |
| List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonlyValues() + iterate | 5.1006 ms/op | 5.1821 ms/op | 0.98 |
| List(ContainerNodeStruct) len 300000 ViewDU.get(i) | 6.0304 ms/op | 6.1416 ms/op | 0.98 |
| List(ContainerNodeStruct) len 300000 ViewDU.getReadonly(i) | 5.9073 ms/op | 5.9534 ms/op | 0.99 |
| Array.push len 300000 empty Array - object | 5.7186 ms/op | 5.8513 ms/op | 0.98 |
| Array.set len 300000 from new Array - object | 1.9168 ms/op | 1.9138 ms/op | 1.00 |
| Array.set len 300000 - object | 5.4213 ms/op | 5.7212 ms/op | 0.95 |
| cachePermanentRootStruct no cache | 9.1910 us/op | 8.9470 us/op | 1.03 |
| cachePermanentRootStruct with cache | 218.00 ns/op | 217.00 ns/op | 1.00 |
| epochParticipation len 250000 rws 7813 | 2.1996 ms/op | 2.3121 ms/op | 0.95 |
| deserialize Attestation - tree | 2.9210 us/op | 2.8750 us/op | 1.02 |
| deserialize Attestation - struct | 1.9580 us/op | 1.9240 us/op | 1.02 |
| deserialize SignedAggregateAndProof - tree | 3.6560 us/op | 3.6870 us/op | 0.99 |
| deserialize SignedAggregateAndProof - struct | 2.9860 us/op | 2.9650 us/op | 1.01 |
| deserialize SyncCommitteeMessage - tree | 1.1020 us/op | 1.1820 us/op | 0.93 |
| deserialize SyncCommitteeMessage - struct | 1.1780 us/op | 1.2800 us/op | 0.92 |
| deserialize SignedContributionAndProof - tree | 1.9370 us/op | 1.9150 us/op | 1.01 |
| deserialize SignedContributionAndProof - struct | 2.4380 us/op | 2.4950 us/op | 0.98 |
| deserialize SignedBeaconBlock - tree | 208.59 us/op | 218.40 us/op | 0.96 |
| deserialize SignedBeaconBlock - struct | 124.47 us/op | 125.55 us/op | 0.99 |
| BeaconState vc 300000 - deserialize tree | 541.71 ms/op | 647.97 ms/op | 0.84 |
| BeaconState vc 300000 - serialize tree | 116.93 ms/op | 142.22 ms/op | 0.82 |
| BeaconState.historicalRoots vc 300000 - deserialize tree | 861.00 ns/op | 826.00 ns/op | 1.04 |
| BeaconState.historicalRoots vc 300000 - serialize tree | 828.00 ns/op | 800.00 ns/op | 1.03 |
| BeaconState.validators vc 300000 - deserialize tree | 494.51 ms/op | 646.73 ms/op | 0.76 |
| BeaconState.validators vc 300000 - serialize tree | 120.82 ms/op | 133.13 ms/op | 0.91 |
| BeaconState.balances vc 300000 - deserialize tree | 19.744 ms/op | 23.300 ms/op | 0.85 |
| BeaconState.balances vc 300000 - serialize tree | 3.1007 ms/op | 3.2543 ms/op | 0.95 |
| BeaconState.previousEpochParticipation vc 300000 - deserialize tree | 375.95 us/op | 398.81 us/op | 0.94 |
| BeaconState.previousEpochParticipation vc 300000 - serialize tree | 264.03 us/op | 265.41 us/op | 0.99 |
| BeaconState.currentEpochParticipation vc 300000 - deserialize tree | 369.74 us/op | 403.44 us/op | 0.92 |
| BeaconState.currentEpochParticipation vc 300000 - serialize tree | 261.99 us/op | 267.45 us/op | 0.98 |
| BeaconState.inactivityScores vc 300000 - deserialize tree | 20.539 ms/op | 26.134 ms/op | 0.79 |
| BeaconState.inactivityScores vc 300000 - serialize tree | 3.2488 ms/op | 2.7084 ms/op | 1.20 |
| hashTreeRoot Attestation - struct | 32.321 us/op | 28.230 us/op | 1.14 |
| hashTreeRoot Attestation - tree | 21.854 us/op | 18.223 us/op | 1.20 |
| hashTreeRoot SignedAggregateAndProof - struct | 43.954 us/op | 38.544 us/op | 1.14 |
| hashTreeRoot SignedAggregateAndProof - tree | 29.687 us/op | 27.823 us/op | 1.07 |
| hashTreeRoot SyncCommitteeMessage - struct | 10.439 us/op | 9.1740 us/op | 1.14 |
| hashTreeRoot SyncCommitteeMessage - tree | 6.5740 us/op | 6.2360 us/op | 1.05 |
| hashTreeRoot SignedContributionAndProof - struct | 29.414 us/op | 26.322 us/op | 1.12 |
| hashTreeRoot SignedContributionAndProof - tree | 21.208 us/op | 19.913 us/op | 1.07 |
| hashTreeRoot SignedBeaconBlock - struct | 2.4260 ms/op | 2.4007 ms/op | 1.01 |
| hashTreeRoot SignedBeaconBlock - tree | 1.7714 ms/op | 1.6938 ms/op | 1.05 |
| hashTreeRoot Validator - struct | 13.179 us/op | 13.052 us/op | 1.01 |
| hashTreeRoot Validator - tree | 11.492 us/op | 11.057 us/op | 1.04 |
| BeaconState vc 300000 - hashTreeRoot tree | 3.7534 s/op | 3.6480 s/op | 1.03 |
| BeaconState.historicalRoots vc 300000 - hashTreeRoot tree | 1.5020 us/op | 1.5200 us/op | 0.99 |
| BeaconState.validators vc 300000 - hashTreeRoot tree | 3.5881 s/op | 3.5204 s/op | 1.02 |
| BeaconState.balances vc 300000 - hashTreeRoot tree | 90.885 ms/op | 90.631 ms/op | 1.00 |
| BeaconState.previousEpochParticipation vc 300000 - hashTreeRoot tree | 9.5870 ms/op | 9.3016 ms/op | 1.03 |
| BeaconState.currentEpochParticipation vc 300000 - hashTreeRoot tree | 9.6064 ms/op | 9.0244 ms/op | 1.06 |
| BeaconState.inactivityScores vc 300000 - hashTreeRoot tree | 85.335 ms/op | 86.049 ms/op | 0.99 |
| hash64 x18 | 20.566 us/op | 19.179 us/op | 1.07 |
| hashTwoObjects x18 | 18.860 us/op | 17.936 us/op | 1.05 |
| hash64 x1740 | 1.9647 ms/op | 1.8306 ms/op | 1.07 |
| hashTwoObjects x1740 | 1.8020 ms/op | 1.7038 ms/op | 1.06 |
| hash64 x2700000 | 3.0170 s/op | 2.8221 s/op | 1.07 |
| hashTwoObjects x2700000 | 2.7907 s/op | 2.6427 s/op | 1.06 |
| get_exitEpoch - ContainerType | 224.00 ns/op | 212.00 ns/op | 1.06 |
| get_exitEpoch - ContainerNodeStructType | 212.00 ns/op | 207.00 ns/op | 1.02 |
| set_exitEpoch - ContainerType | 246.00 ns/op | 250.00 ns/op | 0.98 |
| set_exitEpoch - ContainerNodeStructType | 219.00 ns/op | 206.00 ns/op | 1.06 |
| get_pubkey - ContainerType | 1.0230 us/op | 959.00 ns/op | 1.07 |
| get_pubkey - ContainerNodeStructType | 222.00 ns/op | 212.00 ns/op | 1.05 |
| hashTreeRoot - ContainerType | 391.00 ns/op | 355.00 ns/op | 1.10 |
| hashTreeRoot - ContainerNodeStructType | 428.00 ns/op | 402.00 ns/op | 1.06 |
| createProof - ContainerType | 3.8900 us/op | 3.8640 us/op | 1.01 |
| createProof - ContainerNodeStructType | 21.044 us/op | 20.811 us/op | 1.01 |
| serialize - ContainerType | 1.9690 us/op | 1.9330 us/op | 1.02 |
| serialize - ContainerNodeStructType | 1.5290 us/op | 1.4700 us/op | 1.04 |
| set_exitEpoch_and_hashTreeRoot - ContainerType | 4.2110 us/op | 4.2160 us/op | 1.00 |
| set_exitEpoch_and_hashTreeRoot - ContainerNodeStructType | 11.669 us/op | 11.253 us/op | 1.04 |
| Array - for of | 5.0560 us/op | 5.0570 us/op | 1.00 |
| Array - for(;;) | 4.3090 us/op | 4.3590 us/op | 0.99 |
| basicListValue.readonlyValuesArray() | 3.7785 ms/op | 3.6135 ms/op | 1.05 |
| basicListValue.readonlyValuesArray() + loop all | 3.8469 ms/op | 3.6827 ms/op | 1.04 |
| compositeListValue.readonlyValuesArray() | 28.718 ms/op | 27.018 ms/op | 1.06 |
| compositeListValue.readonlyValuesArray() + loop all | 24.882 ms/op | 24.661 ms/op | 1.01 |
| Number64UintType - get balances list | 4.0450 ms/op | 4.2458 ms/op | 0.95 |
| Number64UintType - set balances list | 10.964 ms/op | 10.992 ms/op | 1.00 |
| Number64UintType - get and increase 10 then set | 39.201 ms/op | 36.256 ms/op | 1.08 |
| Number64UintType - increase 10 using applyDelta | 16.489 ms/op | 15.966 ms/op | 1.03 |
| Number64UintType - increase 10 using applyDeltaInBatch | 16.189 ms/op | 16.781 ms/op | 0.96 |
| tree_newTreeFromUint64Deltas | 15.819 ms/op | 17.078 ms/op | 0.93 |
| unsafeUint8ArrayToTree | 28.508 ms/op | 32.144 ms/op | 0.89 |
| bitLength(50) | 232.00 ns/op | 225.00 ns/op | 1.03 |
| bitLengthStr(50) | 241.00 ns/op | 238.00 ns/op | 1.01 |
| bitLength(8000) | 225.00 ns/op | 229.00 ns/op | 0.98 |
| bitLengthStr(8000) | 281.00 ns/op | 284.00 ns/op | 0.99 |
| bitLength(250000) | 222.00 ns/op | 219.00 ns/op | 1.01 |
| bitLengthStr(250000) | 315.00 ns/op | 311.00 ns/op | 1.01 |
| floor - Math.floor (53) | 0.46426 ns/op | 0.46415 ns/op | 1.00 |
| floor - << 0 (53) | 0.47021 ns/op | 0.46692 ns/op | 1.01 |
| floor - Math.floor (512) | 0.46411 ns/op | 0.46665 ns/op | 0.99 |
| floor - << 0 (512) | 0.47051 ns/op | 0.47120 ns/op | 1.00 |
| fnIf(0) | 1.5460 ns/op | 1.5463 ns/op | 1.00 |
| fnSwitch(0) | 2.5146 ns/op | 2.5137 ns/op | 1.00 |
| fnObj(0) | 0.46456 ns/op | 0.46444 ns/op | 1.00 |
| fnArr(0) | 0.47123 ns/op | 0.46435 ns/op | 1.01 |
| fnIf(4) | 2.1634 ns/op | 2.1653 ns/op | 1.00 |
| fnSwitch(4) | 2.4749 ns/op | 2.4724 ns/op | 1.00 |
| fnObj(4) | 0.46528 ns/op | 0.46471 ns/op | 1.00 |
| fnArr(4) | 0.46381 ns/op | 0.46520 ns/op | 1.00 |
| fnIf(9) | 3.0936 ns/op | 3.0949 ns/op | 1.00 |
| fnSwitch(9) | 2.4735 ns/op | 2.4881 ns/op | 0.99 |
| fnObj(9) | 0.46458 ns/op | 0.46454 ns/op | 1.00 |
| fnArr(9) | 0.46460 ns/op | 0.46414 ns/op | 1.00 |
| Container {a,b,vec} - as struct x100000 | 46.589 us/op | 47.184 us/op | 0.99 |
| Container {a,b,vec} - as tree x100000 | 372.67 us/op | 371.42 us/op | 1.00 |
| Container {a,vec,b} - as struct x100000 | 77.566 us/op | 77.488 us/op | 1.00 |
| Container {a,vec,b} - as tree x100000 | 402.12 us/op | 402.42 us/op | 1.00 |
| get 2 props x1000000 - rawObject | 309.45 us/op | 310.21 us/op | 1.00 |
| get 2 props x1000000 - proxy | 72.112 ms/op | 74.359 ms/op | 0.97 |
| get 2 props x1000000 - customObj | 309.62 us/op | 309.78 us/op | 1.00 |
| Simple object binary -> struct | 738.00 ns/op | 613.00 ns/op | 1.20 |
| Simple object binary -> tree_backed | 1.9920 us/op | 1.6940 us/op | 1.18 |
| Simple object struct -> tree_backed | 2.6050 us/op | 2.1200 us/op | 1.23 |
| Simple object tree_backed -> struct | 2.1770 us/op | 1.8700 us/op | 1.16 |
| Simple object struct -> binary | 1.0740 us/op | 913.00 ns/op | 1.18 |
| Simple object tree_backed -> binary | 1.8160 us/op | 1.6560 us/op | 1.10 |
| aggregationBits binary -> struct | 706.00 ns/op | 550.00 ns/op | 1.28 |
| aggregationBits binary -> tree_backed | 2.6600 us/op | 2.0700 us/op | 1.29 |
| aggregationBits struct -> tree_backed | 3.0450 us/op | 2.5230 us/op | 1.21 |
| aggregationBits tree_backed -> struct | 1.3160 us/op | 1.0490 us/op | 1.25 |
| aggregationBits struct -> binary | 917.00 ns/op | 754.00 ns/op | 1.22 |
| aggregationBits tree_backed -> binary | 1.1560 us/op | 973.00 ns/op | 1.19 |
| List(uint8) 100000 binary -> struct | 1.3401 ms/op | 1.2988 ms/op | 1.03 |
| List(uint8) 100000 binary -> tree_backed | 86.537 us/op | 86.831 us/op | 1.00 |
| List(uint8) 100000 struct -> tree_backed | 1.3134 ms/op | 1.3055 ms/op | 1.01 |
| List(uint8) 100000 tree_backed -> struct | 914.10 us/op | 953.73 us/op | 0.96 |
| List(uint8) 100000 struct -> binary | 1.2212 ms/op | 1.2184 ms/op | 1.00 |
| List(uint8) 100000 tree_backed -> binary | 79.928 us/op | 80.666 us/op | 0.99 |
| List(uint64Number) 100000 binary -> struct | 1.1579 ms/op | 1.2325 ms/op | 0.94 |
| List(uint64Number) 100000 binary -> tree_backed | 2.9832 ms/op | 3.7674 ms/op | 0.79 |
| List(uint64Number) 100000 struct -> tree_backed | 4.5223 ms/op | 5.0615 ms/op | 0.89 |
| List(uint64Number) 100000 tree_backed -> struct | 2.0292 ms/op | 2.1579 ms/op | 0.94 |
| List(uint64Number) 100000 struct -> binary | 1.4061 ms/op | 1.3691 ms/op | 1.03 |
| List(uint64Number) 100000 tree_backed -> binary | 764.18 us/op | 764.34 us/op | 1.00 |
| List(Uint64Bigint) 100000 binary -> struct | 3.3298 ms/op | 3.2282 ms/op | 1.03 |
| List(Uint64Bigint) 100000 binary -> tree_backed | 3.0010 ms/op | 3.8412 ms/op | 0.78 |
| List(Uint64Bigint) 100000 struct -> tree_backed | 5.2933 ms/op | 6.0559 ms/op | 0.87 |
| List(Uint64Bigint) 100000 tree_backed -> struct | 4.1589 ms/op | 4.0536 ms/op | 1.03 |
| List(Uint64Bigint) 100000 struct -> binary | 2.0807 ms/op | 2.0217 ms/op | 1.03 |
| List(Uint64Bigint) 100000 tree_backed -> binary | 832.63 us/op | 738.20 us/op | 1.13 |
| Vector(Root) 100000 binary -> struct | 28.447 ms/op | 33.636 ms/op | 0.85 |
| Vector(Root) 100000 binary -> tree_backed | 23.861 ms/op | 31.353 ms/op | 0.76 |
| Vector(Root) 100000 struct -> tree_backed | 33.251 ms/op | 41.874 ms/op | 0.79 |
| Vector(Root) 100000 tree_backed -> struct | 41.762 ms/op | 49.350 ms/op | 0.85 |
| Vector(Root) 100000 struct -> binary | 1.8166 ms/op | 1.9152 ms/op | 0.95 |
| Vector(Root) 100000 tree_backed -> binary | 8.3545 ms/op | 9.5106 ms/op | 0.88 |
| List(Validator) 100000 binary -> struct | 100.14 ms/op | 135.38 ms/op | 0.74 |
| List(Validator) 100000 binary -> tree_backed | 249.34 ms/op | 348.85 ms/op | 0.71 |
| List(Validator) 100000 struct -> tree_backed | 288.63 ms/op | 359.65 ms/op | 0.80 |
| List(Validator) 100000 tree_backed -> struct | 189.17 ms/op | 219.65 ms/op | 0.86 |
| List(Validator) 100000 struct -> binary | 33.099 ms/op | 30.378 ms/op | 1.09 |
| List(Validator) 100000 tree_backed -> binary | 92.889 ms/op | 104.90 ms/op | 0.89 |
| List(Validator-NS) 100000 binary -> struct | 91.057 ms/op | 119.72 ms/op | 0.76 |
| List(Validator-NS) 100000 binary -> tree_backed | 145.08 ms/op | 180.69 ms/op | 0.80 |
| List(Validator-NS) 100000 struct -> tree_backed | 184.59 ms/op | 211.23 ms/op | 0.87 |
| List(Validator-NS) 100000 tree_backed -> struct | 148.78 ms/op | 174.34 ms/op | 0.85 |
| List(Validator-NS) 100000 struct -> binary | 34.233 ms/op | 30.620 ms/op | 1.12 |
| List(Validator-NS) 100000 tree_backed -> binary | 36.547 ms/op | 35.677 ms/op | 1.02 |
| get epochStatuses - MutableVector | 91.102 us/op | 90.831 us/op | 1.00 |
| get epochStatuses - ViewDU | 198.17 us/op | 196.06 us/op | 1.01 |
| set epochStatuses - ListTreeView | 1.3816 ms/op | 1.4870 ms/op | 0.93 |
| set epochStatuses - ListTreeView - set() | 445.26 us/op | 428.31 us/op | 1.04 |
| set epochStatuses - ListTreeView - commit() | 400.52 us/op | 399.22 us/op | 1.00 |
| bitstring | 638.90 ns/op | 645.20 ns/op | 0.99 |
| bit mask | 13.575 ns/op | 13.595 ns/op | 1.00 |
| struct - increase slot to 1000000 | 927.68 us/op | 928.22 us/op | 1.00 |
| UintNumberType - increase slot to 1000000 | 28.494 ms/op | 28.482 ms/op | 1.00 |
| UintBigintType - increase slot to 1000000 | 396.51 ms/op | 436.52 ms/op | 0.91 |
| UintBigint8 x 100000 tree_deserialize | 5.0856 ms/op | 4.7991 ms/op | 1.06 |
| UintBigint8 x 100000 tree_serialize | 1.1865 ms/op | 1.1908 ms/op | 1.00 |
| UintBigint16 x 100000 tree_deserialize | 4.6953 ms/op | 4.8965 ms/op | 0.96 |
| UintBigint16 x 100000 tree_serialize | 1.1271 ms/op | 1.2277 ms/op | 0.92 |
| UintBigint32 x 100000 tree_deserialize | 4.7340 ms/op | 4.9104 ms/op | 0.96 |
| UintBigint32 x 100000 tree_serialize | 1.1904 ms/op | 1.3207 ms/op | 0.90 |
| UintBigint64 x 100000 tree_deserialize | 5.4564 ms/op | 5.5974 ms/op | 0.97 |
| UintBigint64 x 100000 tree_serialize | 1.5585 ms/op | 1.6834 ms/op | 0.93 |
| UintBigint8 x 100000 value_deserialize | 433.67 us/op | 433.14 us/op | 1.00 |
| UintBigint8 x 100000 value_serialize | 566.32 us/op | 624.50 us/op | 0.91 |
| UintBigint16 x 100000 value_deserialize | 464.10 us/op | 469.52 us/op | 0.99 |
| UintBigint16 x 100000 value_serialize | 602.98 us/op | 665.25 us/op | 0.91 |
| UintBigint32 x 100000 value_deserialize | 433.55 us/op | 436.70 us/op | 0.99 |
| UintBigint32 x 100000 value_serialize | 602.85 us/op | 679.50 us/op | 0.89 |
| UintBigint64 x 100000 value_deserialize | 468.03 us/op | 466.75 us/op | 1.00 |
| UintBigint64 x 100000 value_serialize | 780.83 us/op | 873.46 us/op | 0.89 |
| UintBigint8 x 100000 deserialize | 4.6647 ms/op | 5.0061 ms/op | 0.93 |
| UintBigint8 x 100000 serialize | 1.3983 ms/op | 1.5161 ms/op | 0.92 |
| UintBigint16 x 100000 deserialize | 4.5283 ms/op | 4.9129 ms/op | 0.92 |
| UintBigint16 x 100000 serialize | 1.4281 ms/op | 1.5124 ms/op | 0.94 |
| UintBigint32 x 100000 deserialize | 5.2401 ms/op | 5.6090 ms/op | 0.93 |
| UintBigint32 x 100000 serialize | 2.7495 ms/op | 2.9267 ms/op | 0.94 |
| UintBigint64 x 100000 deserialize | 3.6058 ms/op | 3.9061 ms/op | 0.92 |
| UintBigint64 x 100000 serialize | 1.5021 ms/op | 1.4994 ms/op | 1.00 |
| UintBigint128 x 100000 deserialize | 5.6700 ms/op | 5.9063 ms/op | 0.96 |
| UintBigint128 x 100000 serialize | 17.095 ms/op | 18.068 ms/op | 0.95 |
| UintBigint256 x 100000 deserialize | 10.691 ms/op | 11.348 ms/op | 0.94 |
| UintBigint256 x 100000 serialize | 49.981 ms/op | 52.134 ms/op | 0.96 |
| Slice from Uint8Array x25000 | 996.58 us/op | 1.0121 ms/op | 0.98 |
| Slice from ArrayBuffer x25000 | 16.574 ms/op | 16.961 ms/op | 0.98 |
| Slice from ArrayBuffer x25000 + new Uint8Array | 16.708 ms/op | 18.343 ms/op | 0.91 |
| Copy Uint8Array 100000 iterate | 788.05 us/op | 805.99 us/op | 0.98 |
| Copy Uint8Array 100000 slice | 85.438 us/op | 90.656 us/op | 0.94 |
| Copy Uint8Array 100000 Uint8Array.prototype.slice.call | 85.216 us/op | 90.636 us/op | 0.94 |
| Copy Buffer 100000 Uint8Array.prototype.slice.call | 85.889 us/op | 90.549 us/op | 0.95 |
| Copy Uint8Array 100000 slice + set | 139.15 us/op | 155.83 us/op | 0.89 |
| Copy Uint8Array 100000 subarray + set | 85.598 us/op | 90.799 us/op | 0.94 |
| Copy Uint8Array 100000 slice arrayBuffer | 85.959 us/op | 90.931 us/op | 0.95 |
| Uint64 deserialize 100000 - iterate Uint8Array | 1.7066 ms/op | 1.7466 ms/op | 0.98 |
| Uint64 deserialize 100000 - by Uint32A | 1.6871 ms/op | 1.7571 ms/op | 0.96 |
| Uint64 deserialize 100000 - by DataView.getUint32 x2 | 1.6879 ms/op | 1.7582 ms/op | 0.96 |
| Uint64 deserialize 100000 - by DataView.getBigUint64 | 4.8040 ms/op | 5.2691 ms/op | 0.91 |
| Uint64 deserialize 100000 - by byte | 65.006 ms/op | 42.976 ms/op | 1.51 |

by benchmarkbot/action
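
When reading the report, note that the Ratio column appears to be Current divided by Previous, so values above 1 mean the current commit is slower. The cells mix units (ns, us, ms, s), so they have to be normalized before dividing. A minimal sketch of that calculation, where `toNanoseconds` and `ratio` are hypothetical helpers and not part of the benchmark action:

```typescript
// Conversion factors from the units used in the report to nanoseconds.
const UNIT_TO_NS: Record<string, number> = {ns: 1, us: 1e3, ms: 1e6, s: 1e9};

// Hypothetical helper: parse a cell such as "1.2260 us/op" into nanoseconds.
function toNanoseconds(cell: string): number {
  const match = /^([\d.]+) (ns|us|ms|s)\/op$/.exec(cell.trim());
  if (match === null) {
    throw new Error(`Unrecognized benchmark cell: ${cell}`);
  }
  return Number(match[1]) * UNIT_TO_NS[match[2]];
}

// Ratio as it appears in the report: current time divided by previous time.
function ratio(current: string, previous: string): number {
  return toNanoseconds(current) / toNanoseconds(previous);
}

// Two rows copied from the report: one with mixed units, one with matching units.
const gindexIterator = ratio("1.2260 us/op", "952.00 ns/op");
const byteArrayToHashObject = ratio("3.5234 ms/op", "1.6627 ms/op");

console.log(gindexIterator.toFixed(2), byteArrayToHashObject.toFixed(2));
```

Both values round to the ratios listed for those rows (1.29 and 2.12).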

twoeths commented 4 months ago

The benchmark results in CI are not precise because each benchmark runs for only a very short time, so I updated it to run for at least 10 s:

setBenchOpts({
  minMs: 10000,
});
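
To illustrate why a minimum duration helps, here is a rough sketch of a "run for at least `minMs`" loop. It is not how the benchmark runner behind `setBenchOpts` is implemented; `benchAtLeast` is a hypothetical function that only shows the idea that a longer window means more runs and a steadier average.

```typescript
// Hypothetical illustration: repeat fn until at least minMs have elapsed,
// then report how many runs fit in that window and the average time per run.
function benchAtLeast(fn: () => void, minMs: number): {runs: number; msPerOp: number} {
  const start = Date.now();
  let runs = 0;
  let elapsed = 0;
  do {
    fn();
    runs++;
    elapsed = Date.now() - start;
  } while (elapsed < minMs);
  return {runs, msPerOp: elapsed / runs};
}

// Stand-in workload; a real benchmark would call the hashing function here.
function workload(): void {
  let acc = 0;
  for (let i = 0; i < 10000; i++) acc += i % 7;
  if (acc < 0) throw new Error("unreachable");
}

const shortRun = benchAtLeast(workload, 20);
console.log(`${shortRun.runs} runs, ${shortRun.msPerOp} ms/op`);
```

With a short window only a handful of runs are averaged, so a single slow run (GC pause, noisy CI neighbour) shifts the reported ms/op noticeably.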

twoeths commented 4 months ago

Do we need to store build artifacts in the repo?

yes, they are actually input for our typescript function every time we run, that's why they are tracked in git

jeluard commented 4 months ago

Do we need to store build artifacts in the repo?

yes, they are actually input for our typescript function every time we run, that's why they are tracked in git

Is this what you are referring to? Looks like we only care about codegen being called to update https://github.com/ChainSafe/ssz/blob/master/packages/as-sha256/src/wasmCode.ts (as correctly done by this PR), but wasm/wat don't need to be in git?

Not part of this PR anyway, maybe something to be considered to improve the build process.
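
To make the codegen point concrete, a step like this could embed the compiled wasm bytes into a TypeScript module, which is why only the generated `.ts` file would strictly need to be tracked. This is a sketch under assumptions: `renderWasmModule` and the exact shape of the generated source are hypothetical and may differ from what the repo's codegen script actually emits.

```typescript
// Hypothetical codegen helper: turn compiled wasm bytes into TypeScript source
// that exports the same bytes as a Uint8Array literal.
function renderWasmModule(wasmBytes: Uint8Array): string {
  const literal = Array.from(wasmBytes).join(",");
  return [
    "// This file is generated; do not edit by hand.",
    `export const wasmCode = Uint8Array.from([${literal}]);`,
    "",
  ].join("\n");
}

// The 8-byte preamble of any wasm binary: "\0asm" magic followed by version 1.
const wasmHeader = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
const generatedSource = renderWasmModule(wasmHeader);

console.log(generatedSource);
```

In a real build the bytes would come from the compiled `.wasm` file rather than a hard-coded header, and the returned string would be written to the generated source file.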

twoeths commented 4 months ago

Checked the performance on a separate server and got the same result as in my local environment.

master feat4 mainnet

digestTwoHashObjects vs digest64 vs digest
    ✓ digestTwoHashObjects 50023 times                                    19.25844 ops/s    51.92528 ms/op        -         67 runs   3.98 s
    ✓ digest64 50023 times                                                20.23583 ops/s    49.41730 ms/op        -         12 runs   1.11 s
    ✓ digest 50023 times                                                  20.31116 ops/s    49.23402 ms/op        -         13 runs   1.19 s

this branch

digestTwoHashObjects vs digest64 vs digest
    ✓ digestTwoHashObjects 50023 times                                    21.91978 ops/s    45.62089 ms/op        -       1304 runs   60.0 s
    ✓ digest64 50023 times                                                21.11148 ops/s    47.36759 ms/op        -       1256 runs   60.0 s
    ✓ digest 50023 times                                                  20.90464 ops/s    47.83628 ms/op        -       1244 runs   60.0 s

However, the benchmark results in CI consistently show worse numbers, so this needs investigating. I may need to split this into smaller PRs to figure out the issue.