ChainSafe / lodestar

🌟 TypeScript Implementation of Ethereum Consensus
https://lodestar.chainsafe.io
Apache License 2.0

feat: use napi-rs pubkey-index-map #7091

Closed twoeths closed 1 day ago

twoeths commented 1 week ago

Motivation

Description

benchmark on Mac M1 is comparable to the current PubkeyIndexMap:

  get/set
    ✓ get values - 1000                                                    2808989 ops/s    356.0000 ns/op   x0.450    1313217 runs  0.707 s
    ✓ get values - naive - 1000                                            3802281 ops/s    263.0000 ns/op   x0.358    1423603 runs  0.606 s
    ✓ set values - 1000                                                    2932551 ops/s    341.0000 ns/op   x0.432    2900838 runs   1.51 s
    ✓ set values - naive - 1000                                            2136752 ops/s    468.0000 ns/op   x0.497     637433 runs  0.404 s
    ✓ get values - 1000000                                                948766.6 ops/s    1.054000 us/op   x0.568     407133 runs  0.505 s
    ✓ get values - naive - 1000000                                         1254705 ops/s    797.0000 ns/op   x0.680     627761 runs  0.606 s
    ✓ set values - 1000000                                                972762.6 ops/s    1.028000 us/op   x0.544     499266 runs  0.606 s
    ✓ set values - naive - 1000000                                        637755.1 ops/s    1.568000 us/op   x0.805     346993 runs  0.606 s
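For readers comparing the columns above: the `ops/s` figure is the reciprocal of the per-operation latency. A minimal sketch of that conversion follows; the helper name `opsPerSecond` and the `LatencyUnit` type are illustrative only and are not part of the benchmark tooling.

```typescript
// Units used in the latency column of the benchmark output.
type LatencyUnit = "ns" | "us" | "ms" | "s";

// Nanoseconds contained in one of each unit.
const NS_PER_UNIT: Record<LatencyUnit, number> = {ns: 1, us: 1e3, ms: 1e6, s: 1e9};

// Hypothetical helper: convert a latency per operation into operations per second.
function opsPerSecond(value: number, unit: LatencyUnit): number {
  const nsPerOp = value * NS_PER_UNIT[unit];
  return 1e9 / nsPerOp;
}

// Two rows copied from the table above.
const getSmall = opsPerSecond(356, "ns"); // "get values - 1000"
const getLarge = opsPerSecond(1.054, "us"); // "get values - 1000000"
console.log(Math.round(getSmall), getLarge.toFixed(1));
```

Read this way, the 1000-entry rows show `get` at 356 ns versus 263 ns for the naive variant, and `set` at 341 ns versus 468 ns.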

see also https://github.com/ChainSafe/lodestar/pull/7022#issuecomment-2291984293

github-actions[bot] commented 1 week ago

:warning: Performance Alert :warning:

Possible performance regression was detected for some benchmarks. Benchmark result of this commit is worse than the previous benchmark result exceeding threshold.

| Benchmark suite | Current: db5d962f247e4f74bbe31a0e9c713dffa65d5a4e | Previous: cd98c237683456be7959d752bbd7d10f8b02b8ec | Ratio |
|-|-|-|-|
| getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 | 9.0531 ms/op | 2.7366 ms/op | 3.31 |
| Array.fill - length 1000000 | 8.3394 ms/op | 2.4610 ms/op | 3.39 |
| Array push - length 1000000 | 52.419 ms/op | 14.302 ms/op | 3.67 |
| phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 | 7.4445 ms/op | 1.4955 ms/op | 4.98 |
| Buffer.compare 123687377 | 12.803 ms/op | 3.7482 ms/op | 3.42 |
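The Ratio column is the current latency divided by the previous one, after both are converted to the same unit. The sketch below reproduces that arithmetic. The names `Measurement`, `latencyRatio` and `exceedsThreshold` are hypothetical, and the threshold value is an assumption: the bot's actual threshold is not stated in this thread.

```typescript
type TimeUnit = "ns" | "us" | "ms" | "s";

// Nanoseconds contained in one of each unit.
const UNIT_TO_NS: Record<TimeUnit, number> = {ns: 1, us: 1e3, ms: 1e6, s: 1e9};

// One latency cell from the table, e.g. "9.0531 ms/op".
interface Measurement {
  value: number;
  unit: TimeUnit;
}

// Ratio > 1 means the current commit is slower than the previous one.
function latencyRatio(current: Measurement, previous: Measurement): number {
  const currentNs = current.value * UNIT_TO_NS[current.unit];
  const previousNs = previous.value * UNIT_TO_NS[previous.unit];
  return currentNs / previousNs;
}

// Hypothetical alert check; `threshold` is an assumed parameter, not the bot's real setting.
function exceedsThreshold(current: Measurement, previous: Measurement, threshold: number): boolean {
  return latencyRatio(current, previous) > threshold;
}

// First row of the alert table.
const withdrawals = latencyRatio({value: 9.0531, unit: "ms"}, {value: 2.7366, unit: "ms"});
console.log(withdrawals.toFixed(2));
```

Note that the ratios here compare two CI runs on shared runners, so a uniform slowdown across unrelated micro-benchmarks such as `Array.fill` may reflect runner noise rather than the change under review.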
Full benchmark results | Benchmark suite | Current: db5d962f247e4f74bbe31a0e9c713dffa65d5a4e | Previous: cd98c237683456be7959d752bbd7d10f8b02b8ec | Ratio | |-|-|-|-| | getPubkeys - index2pubkey - req 1000 vs - 250000 vc | 2.0463 ms/op | 1.8504 ms/op | 1.11 | | getPubkeys - validatorsArr - req 1000 vs - 250000 vc | 57.846 us/op | 38.162 us/op | 1.52 | | BLS verify - blst | 933.05 us/op | 881.84 us/op | 1.06 | | BLS verifyMultipleSignatures 3 - blst | 1.2229 ms/op | 1.2653 ms/op | 0.97 | | BLS verifyMultipleSignatures 8 - blst | 1.6690 ms/op | 2.2057 ms/op | 0.76 | | BLS verifyMultipleSignatures 32 - blst | 5.0739 ms/op | 4.4707 ms/op | 1.13 | | BLS verifyMultipleSignatures 64 - blst | 9.7137 ms/op | 8.2785 ms/op | 1.17 | | BLS verifyMultipleSignatures 128 - blst | 17.747 ms/op | 15.960 ms/op | 1.11 | | BLS deserializing 10000 signatures | 694.75 ms/op | 628.39 ms/op | 1.11 | | BLS deserializing 100000 signatures | 7.1191 s/op | 6.2640 s/op | 1.14 | | BLS verifyMultipleSignatures - same message - 3 - blst | 1.0087 ms/op | 834.37 us/op | 1.21 | | BLS verifyMultipleSignatures - same message - 8 - blst | 1.2011 ms/op | 1.1042 ms/op | 1.09 | | BLS verifyMultipleSignatures - same message - 32 - blst | 1.8084 ms/op | 1.7002 ms/op | 1.06 | | BLS verifyMultipleSignatures - same message - 64 - blst | 2.9001 ms/op | 2.5742 ms/op | 1.13 | | BLS verifyMultipleSignatures - same message - 128 - blst | 5.0871 ms/op | 4.1876 ms/op | 1.21 | | BLS aggregatePubkeys 32 - blst | 20.911 us/op | 18.278 us/op | 1.14 | | BLS aggregatePubkeys 128 - blst | 73.696 us/op | 63.653 us/op | 1.16 | | notSeenSlots=1 numMissedVotes=1 numBadVotes=10 | 71.317 ms/op | 45.506 ms/op | 1.57 | | notSeenSlots=1 numMissedVotes=0 numBadVotes=4 | 67.771 ms/op | 38.289 ms/op | 1.77 | | notSeenSlots=2 numMissedVotes=1 numBadVotes=10 | 37.244 ms/op | 30.478 ms/op | 1.22 | | getSlashingsAndExits - default max | 112.77 us/op | 69.209 us/op | 1.63 | | getSlashingsAndExits - 2k | 355.98 us/op | 236.00 us/op | 1.51 | | 
proposeBlockBody type=full, size=empty | 6.7436 ms/op | 4.9830 ms/op | 1.35 | | isKnown best case - 1 super set check | 511.00 ns/op | 487.00 ns/op | 1.05 | | isKnown normal case - 2 super set checks | 554.00 ns/op | 468.00 ns/op | 1.18 | | isKnown worse case - 16 super set checks | 330.00 ns/op | 473.00 ns/op | 0.70 | | InMemoryCheckpointStateCache - add get delete | 3.6640 us/op | 2.6330 us/op | 1.39 | | updateUnfinalizedPubkeys - updating 10 pubkeys | 1.4196 ms/op | 568.65 us/op | 2.50 | | updateUnfinalizedPubkeys - updating 100 pubkeys | 4.6168 ms/op | 2.5670 ms/op | 1.80 | | updateUnfinalizedPubkeys - updating 1000 pubkeys | 56.658 ms/op | 38.027 ms/op | 1.49 | | validate api signedAggregateAndProof - struct | 1.6231 ms/op | 1.8025 ms/op | 0.90 | | validate gossip signedAggregateAndProof - struct | 1.6264 ms/op | 1.9103 ms/op | 0.85 | | validate gossip attestation - vc 640000 | 1.0784 ms/op | 985.22 us/op | 1.09 | | batch validate gossip attestation - vc 640000 - chunk 32 | 147.70 us/op | 117.14 us/op | 1.26 | | batch validate gossip attestation - vc 640000 - chunk 64 | 127.75 us/op | 105.41 us/op | 1.21 | | batch validate gossip attestation - vc 640000 - chunk 128 | 120.10 us/op | 100.21 us/op | 1.20 | | batch validate gossip attestation - vc 640000 - chunk 256 | 134.81 us/op | 97.151 us/op | 1.39 | | pickEth1Vote - no votes | 1.1984 ms/op | 930.13 us/op | 1.29 | | pickEth1Vote - max votes | 6.3301 ms/op | 9.2127 ms/op | 0.69 | | pickEth1Vote - Eth1Data hashTreeRoot value x2048 | 16.920 ms/op | 18.737 ms/op | 0.90 | | pickEth1Vote - Eth1Data hashTreeRoot tree x2048 | 29.637 ms/op | 26.059 ms/op | 1.14 | | pickEth1Vote - Eth1Data fastSerialize value x2048 | 646.40 us/op | 368.10 us/op | 1.76 | | pickEth1Vote - Eth1Data fastSerialize tree x2048 | 5.0824 ms/op | 4.2315 ms/op | 1.20 | | bytes32 toHexString | 857.00 ns/op | 556.00 ns/op | 1.54 | | bytes32 Buffer.toString(hex) | 284.00 ns/op | 418.00 ns/op | 0.68 | | bytes32 Buffer.toString(hex) from Uint8Array | 
559.00 ns/op | 522.00 ns/op | 1.07 | | bytes32 Buffer.toString(hex) + 0x | 278.00 ns/op | 418.00 ns/op | 0.67 | | Object access 1 prop | 0.22200 ns/op | 0.30800 ns/op | 0.72 | | Map access 1 prop | 0.14300 ns/op | 0.30900 ns/op | 0.46 | | Object get x1000 | 7.0240 ns/op | 4.8930 ns/op | 1.44 | | Map get x1000 | 6.7550 ns/op | 5.5820 ns/op | 1.21 | | Object set x1000 | 60.605 ns/op | 26.472 ns/op | 2.29 | | Map set x1000 | 40.056 ns/op | 18.790 ns/op | 2.13 | | Return object 10000 times | 0.33480 ns/op | 0.28140 ns/op | 1.19 | | Throw Error 10000 times | 3.7209 us/op | 2.5401 us/op | 1.46 | | toHex | 194.69 ns/op | 103.41 ns/op | 1.88 | | Buffer.from | 181.21 ns/op | 93.817 ns/op | 1.93 | | shared Buffer | 114.59 ns/op | 64.011 ns/op | 1.79 | | fastMsgIdFn sha256 / 200 bytes | 2.5290 us/op | 1.9020 us/op | 1.33 | | fastMsgIdFn h32 xxhash / 200 bytes | 338.00 ns/op | 387.00 ns/op | 0.87 | | fastMsgIdFn h64 xxhash / 200 bytes | 301.00 ns/op | 447.00 ns/op | 0.67 | | fastMsgIdFn sha256 / 1000 bytes | 8.2650 us/op | 5.8110 us/op | 1.42 | | fastMsgIdFn h32 xxhash / 1000 bytes | 454.00 ns/op | 517.00 ns/op | 0.88 | | fastMsgIdFn h64 xxhash / 1000 bytes | 377.00 ns/op | 498.00 ns/op | 0.76 | | fastMsgIdFn sha256 / 10000 bytes | 74.170 us/op | 48.720 us/op | 1.52 | | fastMsgIdFn h32 xxhash / 10000 bytes | 2.1690 us/op | 1.8810 us/op | 1.15 | | fastMsgIdFn h64 xxhash / 10000 bytes | 1.3720 us/op | 1.3070 us/op | 1.05 | | send data - 1000 256B messages | 18.522 ms/op | 10.639 ms/op | 1.74 | | send data - 1000 512B messages | 24.303 ms/op | 12.658 ms/op | 1.92 | | send data - 1000 1024B messages | 40.143 ms/op | 20.930 ms/op | 1.92 | | send data - 1000 1200B messages | 37.909 ms/op | 12.899 ms/op | 2.94 | | send data - 1000 2048B messages | 42.692 ms/op | 27.732 ms/op | 1.54 | | send data - 1000 4096B messages | 38.952 ms/op | 24.031 ms/op | 1.62 | | send data - 1000 16384B messages | 91.603 ms/op | 64.315 ms/op | 1.42 | | send data - 1000 65536B messages | 298.09 ms/op | 
247.31 ms/op | 1.21 | | enrSubnets - fastDeserialize 64 bits | 1.6420 us/op | 1.2150 us/op | 1.35 | | enrSubnets - ssz BitVector 64 bits | 494.00 ns/op | 512.00 ns/op | 0.96 | | enrSubnets - fastDeserialize 4 bits | 235.00 ns/op | 323.00 ns/op | 0.73 | | enrSubnets - ssz BitVector 4 bits | 546.00 ns/op | 499.00 ns/op | 1.09 | | prioritizePeers score -10:0 att 32-0.1 sync 2-0 | 266.42 us/op | 117.19 us/op | 2.27 | | prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 | 263.18 us/op | 149.43 us/op | 1.76 | | prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 | 428.04 us/op | 197.07 us/op | 2.17 | | prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 | 565.93 us/op | 319.34 us/op | 1.77 | | prioritizePeers score 0:0 att 64-1 sync 4-1 | 1.2163 ms/op | 446.28 us/op | 2.73 | | array of 16000 items push then shift | 2.1771 us/op | 1.2813 us/op | 1.70 | | LinkedList of 16000 items push then shift | 15.125 ns/op | 6.5920 ns/op | 2.29 | | array of 16000 items push then pop | 173.61 ns/op | 65.339 ns/op | 2.66 | | LinkedList of 16000 items push then pop | 11.423 ns/op | 6.4510 ns/op | 1.77 | | array of 24000 items push then shift | 2.8511 us/op | 1.8914 us/op | 1.51 | | LinkedList of 24000 items push then shift | 8.9930 ns/op | 6.5360 ns/op | 1.38 | | array of 24000 items push then pop | 203.49 ns/op | 136.73 ns/op | 1.49 | | LinkedList of 24000 items push then pop | 8.1660 ns/op | 6.4370 ns/op | 1.27 | | intersect bitArray bitLen 8 | 7.3370 ns/op | 5.4530 ns/op | 1.35 | | intersect array and set length 8 | 74.823 ns/op | 40.600 ns/op | 1.84 | | intersect bitArray bitLen 128 | 39.225 ns/op | 26.640 ns/op | 1.47 | | intersect array and set length 128 | 1.0329 us/op | 591.97 ns/op | 1.74 | | bitArray.getTrueBitIndexes() bitLen 128 | 2.5560 us/op | 1.5280 us/op | 1.67 | | bitArray.getTrueBitIndexes() bitLen 248 | 4.9060 us/op | 3.0990 us/op | 1.58 | | bitArray.getTrueBitIndexes() bitLen 512 | 9.2320 us/op | 7.1400 us/op | 1.29 | | Buffer.concat 32 items | 1.2150 us/op | 1.0190 us/op | 
1.19 | | Uint8Array.set 32 items | 1.8610 us/op | 2.2020 us/op | 0.85 | | Buffer.copy | 2.1320 us/op | 2.3710 us/op | 0.90 | | Uint8Array.set - with subarray | 4.0420 us/op | 2.9220 us/op | 1.38 | | Uint8Array.set - without subarray | 1.9990 us/op | 2.1660 us/op | 0.92 | | getUint32 - dataview | 347.00 ns/op | 400.00 ns/op | 0.87 | | getUint32 - manual | 234.00 ns/op | 338.00 ns/op | 0.69 | | Set add up to 64 items then delete first | 2.4349 us/op | 1.8062 us/op | 1.35 | | OrderedSet add up to 64 items then delete first | 4.0775 us/op | 2.7996 us/op | 1.46 | | Set add up to 64 items then delete last | 2.8734 us/op | 2.0487 us/op | 1.40 | | OrderedSet add up to 64 items then delete last | 5.4998 us/op | 3.1410 us/op | 1.75 | | Set add up to 64 items then delete middle | 3.8992 us/op | 2.0586 us/op | 1.89 | | OrderedSet add up to 64 items then delete middle | 7.1664 us/op | 4.4819 us/op | 1.60 | | Set add up to 128 items then delete first | 7.4846 us/op | 4.0249 us/op | 1.86 | | OrderedSet add up to 128 items then delete first | 10.967 us/op | 6.3124 us/op | 1.74 | | Set add up to 128 items then delete last | 7.4172 us/op | 3.8805 us/op | 1.91 | | OrderedSet add up to 128 items then delete last | 11.486 us/op | 5.9322 us/op | 1.94 | | Set add up to 128 items then delete middle | 7.3934 us/op | 3.8867 us/op | 1.90 | | OrderedSet add up to 128 items then delete middle | 18.831 us/op | 11.930 us/op | 1.58 | | Set add up to 256 items then delete first | 14.938 us/op | 7.8725 us/op | 1.90 | | OrderedSet add up to 256 items then delete first | 24.918 us/op | 12.624 us/op | 1.97 | | Set add up to 256 items then delete last | 16.327 us/op | 7.6494 us/op | 2.13 | | OrderedSet add up to 256 items then delete last | 23.214 us/op | 11.767 us/op | 1.97 | | Set add up to 256 items then delete middle | 14.756 us/op | 7.6054 us/op | 1.94 | | OrderedSet add up to 256 items then delete middle | 54.529 us/op | 34.698 us/op | 1.57 | | transfer serialized Status (84 B) | 1.5460 us/op | 
1.3660 us/op | 1.13 | | copy serialized Status (84 B) | 1.4070 us/op | 1.2750 us/op | 1.10 | | transfer serialized SignedVoluntaryExit (112 B) | 1.6190 us/op | 1.6670 us/op | 0.97 | | copy serialized SignedVoluntaryExit (112 B) | 1.5960 us/op | 1.3480 us/op | 1.18 | | transfer serialized ProposerSlashing (416 B) | 2.0110 us/op | 2.0090 us/op | 1.00 | | copy serialized ProposerSlashing (416 B) | 2.0600 us/op | 1.9000 us/op | 1.08 | | transfer serialized Attestation (485 B) | 2.0250 us/op | 2.0880 us/op | 0.97 | | copy serialized Attestation (485 B) | 2.0320 us/op | 2.0440 us/op | 0.99 | | transfer serialized AttesterSlashing (33232 B) | 2.0820 us/op | 2.1590 us/op | 0.96 | | copy serialized AttesterSlashing (33232 B) | 10.093 us/op | 4.5730 us/op | 2.21 | | transfer serialized Small SignedBeaconBlock (128000 B) | 3.8900 us/op | 3.1140 us/op | 1.25 | | copy serialized Small SignedBeaconBlock (128000 B) | 30.560 us/op | 11.335 us/op | 2.70 | | transfer serialized Avg SignedBeaconBlock (200000 B) | 4.0280 us/op | 3.8590 us/op | 1.04 | | copy serialized Avg SignedBeaconBlock (200000 B) | 42.904 us/op | 16.335 us/op | 2.63 | | transfer serialized BlobsSidecar (524380 B) | 5.0530 us/op | 3.8800 us/op | 1.30 | | copy serialized BlobsSidecar (524380 B) | 129.11 us/op | 81.961 us/op | 1.58 | | transfer serialized Big SignedBeaconBlock (1000000 B) | 4.2910 us/op | 3.4240 us/op | 1.25 | | copy serialized Big SignedBeaconBlock (1000000 B) | 208.43 us/op | 366.92 us/op | 0.57 | | pass gossip attestations to forkchoice per slot | 2.9794 ms/op | 2.8942 ms/op | 1.03 | | forkChoice updateHead vc 100000 bc 64 eq 0 | 579.69 us/op | 613.26 us/op | 0.95 | | forkChoice updateHead vc 600000 bc 64 eq 0 | 3.8773 ms/op | 2.5123 ms/op | 1.54 | | forkChoice updateHead vc 1000000 bc 64 eq 0 | 5.8805 ms/op | 4.1537 ms/op | 1.42 | | forkChoice updateHead vc 600000 bc 320 eq 0 | 3.3787 ms/op | 2.5116 ms/op | 1.35 | | forkChoice updateHead vc 600000 bc 1200 eq 0 | 3.4155 ms/op | 2.6132 ms/op | 1.31 
| | forkChoice updateHead vc 600000 bc 7200 eq 0 | 4.7502 ms/op | 3.0073 ms/op | 1.58 | | forkChoice updateHead vc 600000 bc 64 eq 1000 | 11.446 ms/op | 9.9439 ms/op | 1.15 | | forkChoice updateHead vc 600000 bc 64 eq 10000 | 12.012 ms/op | 9.5763 ms/op | 1.25 | | forkChoice updateHead vc 600000 bc 64 eq 300000 | 18.114 ms/op | 11.895 ms/op | 1.52 | | computeDeltas 500000 validators 300 proto nodes | 3.9334 ms/op | 2.9949 ms/op | 1.31 | | computeDeltas 500000 validators 1200 proto nodes | 4.1718 ms/op | 2.8998 ms/op | 1.44 | | computeDeltas 500000 validators 7200 proto nodes | 4.7012 ms/op | 2.9031 ms/op | 1.62 | | computeDeltas 750000 validators 300 proto nodes | 6.7033 ms/op | 4.2027 ms/op | 1.59 | | computeDeltas 750000 validators 1200 proto nodes | 7.2961 ms/op | 4.2632 ms/op | 1.71 | | computeDeltas 750000 validators 7200 proto nodes | 6.6749 ms/op | 4.2885 ms/op | 1.56 | | computeDeltas 1400000 validators 300 proto nodes | 13.743 ms/op | 8.3256 ms/op | 1.65 | | computeDeltas 1400000 validators 1200 proto nodes | 11.964 ms/op | 8.3951 ms/op | 1.43 | | computeDeltas 1400000 validators 7200 proto nodes | 11.602 ms/op | 8.2188 ms/op | 1.41 | | computeDeltas 2100000 validators 300 proto nodes | 19.614 ms/op | 12.432 ms/op | 1.58 | | computeDeltas 2100000 validators 1200 proto nodes | 18.219 ms/op | 12.772 ms/op | 1.43 | | computeDeltas 2100000 validators 7200 proto nodes | 18.552 ms/op | 12.687 ms/op | 1.46 | | altair processAttestation - 250000 vs - 7PWei normalcase | 3.0122 ms/op | 2.4128 ms/op | 1.25 | | altair processAttestation - 250000 vs - 7PWei worstcase | 3.8183 ms/op | 2.1688 ms/op | 1.76 | | altair processAttestation - setStatus - 1/6 committees join | 144.70 us/op | 71.702 us/op | 2.02 | | altair processAttestation - setStatus - 1/3 committees join | 259.79 us/op | 295.54 us/op | 0.88 | | altair processAttestation - setStatus - 1/2 committees join | 305.52 us/op | 180.42 us/op | 1.69 | | altair processAttestation - setStatus - 2/3 committees join | 
398.43 us/op | 383.77 us/op | 1.04 | | altair processAttestation - setStatus - 4/5 committees join | 580.21 us/op | 377.16 us/op | 1.54 | | altair processAttestation - setStatus - 100% committees join | 677.12 us/op | 461.75 us/op | 1.47 | | altair processBlock - 250000 vs - 7PWei normalcase | 9.1927 ms/op | 4.6896 ms/op | 1.96 | | altair processBlock - 250000 vs - 7PWei normalcase hashState | 30.536 ms/op | 26.495 ms/op | 1.15 | | altair processBlock - 250000 vs - 7PWei worstcase | 42.396 ms/op | 33.838 ms/op | 1.25 | | altair processBlock - 250000 vs - 7PWei worstcase hashState | 107.53 ms/op | 64.807 ms/op | 1.66 | | phase0 processBlock - 250000 vs - 7PWei normalcase | 3.4799 ms/op | 1.7563 ms/op | 1.98 | | phase0 processBlock - 250000 vs - 7PWei worstcase | 29.460 ms/op | 21.313 ms/op | 1.38 | | altair processEth1Data - 250000 vs - 7PWei normalcase | 641.65 us/op | 233.09 us/op | 2.75 | | getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15 | 9.8280 us/op | 4.4050 us/op | 2.23 | | getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219 | 50.453 us/op | 28.457 us/op | 1.77 | | getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42 | 15.399 us/op | 7.7580 us/op | 1.98 | | getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18 | 12.190 us/op | 5.2310 us/op | 2.33 | | getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020 | 226.31 us/op | 124.94 us/op | 1.81 | | getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777 | 1.6678 ms/op | 854.68 us/op | 1.95 | | getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384 | 2.4897 ms/op | 1.0833 ms/op | 2.30 | | getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384 | 2.5073 ms/op | 1.1467 ms/op | 2.19 | | getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384 | 6.6275 ms/op | 2.7043 ms/op | 2.45 | | getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384 | 3.1488 ms/op | 1.1882 ms/op | 2.65 | | 
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 | 9.0531 ms/op | 2.7366 ms/op | 3.31 | | Tree 40 250000 create | 789.25 ms/op | 176.86 ms/op | 4.46 | | Tree 40 250000 get(125000) | 246.75 ns/op | 106.59 ns/op | 2.31 | | Tree 40 250000 set(125000) | 3.0896 us/op | 512.85 ns/op | 6.02 | | Tree 40 250000 toArray() | 38.438 ms/op | 11.794 ms/op | 3.26 | | Tree 40 250000 iterate all - toArray() + loop | 35.901 ms/op | 12.681 ms/op | 2.83 | | Tree 40 250000 iterate all - get(i) | 87.012 ms/op | 40.093 ms/op | 2.17 | | Array 250000 create | 5.9009 ms/op | 2.5364 ms/op | 2.33 | | Array 250000 clone - spread | 6.0579 ms/op | 1.2515 ms/op | 4.84 | | Array 250000 get(125000) | 0.61100 ns/op | 0.56300 ns/op | 1.09 | | Array 250000 set(125000) | 0.73300 ns/op | 0.60700 ns/op | 1.21 | | Array 250000 iterate all - loop | 107.49 us/op | 77.053 us/op | 1.40 | | phase0 afterProcessEpoch - 250000 vs - 7PWei | 129.24 ms/op | 74.660 ms/op | 1.73 | | Array.fill - length 1000000 | 8.3394 ms/op | 2.4610 ms/op | 3.39 | | Array push - length 1000000 | 52.419 ms/op | 14.302 ms/op | 3.67 | | Array.get | 0.38856 ns/op | 0.25061 ns/op | 1.55 | | Uint8Array.get | 0.49877 ns/op | 0.33016 ns/op | 1.51 | | phase0 beforeProcessEpoch - 250000 vs - 7PWei | 26.913 ms/op | 16.156 ms/op | 1.67 | | altair processEpoch - mainnet_e81889 | 440.49 ms/op | 282.40 ms/op | 1.56 | | mainnet_e81889 - altair beforeProcessEpoch | 27.008 ms/op | 18.445 ms/op | 1.46 | | mainnet_e81889 - altair processJustificationAndFinalization | 20.776 us/op | 9.7680 us/op | 2.13 | | mainnet_e81889 - altair processInactivityUpdates | 9.1003 ms/op | 4.1268 ms/op | 2.21 | | mainnet_e81889 - altair processRewardsAndPenalties | 60.565 ms/op | 46.934 ms/op | 1.29 | | mainnet_e81889 - altair processRegistryUpdates | 3.7280 us/op | 1.9450 us/op | 1.92 | | mainnet_e81889 - altair processSlashings | 1.1940 us/op | 737.00 ns/op | 1.62 | | mainnet_e81889 - altair processEth1DataReset | 634.00 ns/op | 684.00 ns/op | 0.93 
| | mainnet_e81889 - altair processEffectiveBalanceUpdates | 2.0765 ms/op | 992.58 us/op | 2.09 | | mainnet_e81889 - altair processSlashingsReset | 7.8820 us/op | 2.1260 us/op | 3.71 | | mainnet_e81889 - altair processRandaoMixesReset | 8.2580 us/op | 2.5490 us/op | 3.24 | | mainnet_e81889 - altair processHistoricalRootsUpdate | 1.2110 us/op | 664.00 ns/op | 1.82 | | mainnet_e81889 - altair processParticipationFlagUpdates | 6.8340 us/op | 1.5560 us/op | 4.39 | | mainnet_e81889 - altair processSyncCommitteeUpdates | 1.0150 us/op | 651.00 ns/op | 1.56 | | mainnet_e81889 - altair afterProcessEpoch | 111.24 ms/op | 72.806 ms/op | 1.53 | | capella processEpoch - mainnet_e217614 | 1.6894 s/op | 1.1971 s/op | 1.41 | | mainnet_e217614 - capella beforeProcessEpoch | 131.07 ms/op | 64.615 ms/op | 2.03 | | mainnet_e217614 - capella processJustificationAndFinalization | 32.572 us/op | 12.091 us/op | 2.69 | | mainnet_e217614 - capella processInactivityUpdates | 23.018 ms/op | 11.824 ms/op | 1.95 | | mainnet_e217614 - capella processRewardsAndPenalties | 339.30 ms/op | 258.78 ms/op | 1.31 | | mainnet_e217614 - capella processRegistryUpdates | 23.766 us/op | 11.294 us/op | 2.10 | | mainnet_e217614 - capella processSlashings | 1.0620 us/op | 779.00 ns/op | 1.36 | | mainnet_e217614 - capella processEth1DataReset | 794.00 ns/op | 753.00 ns/op | 1.05 | | mainnet_e217614 - capella processEffectiveBalanceUpdates | 22.079 ms/op | 3.2673 ms/op | 6.76 | | mainnet_e217614 - capella processSlashingsReset | 5.9620 us/op | 1.4090 us/op | 4.23 | | mainnet_e217614 - capella processRandaoMixesReset | 8.4240 us/op | 3.0300 us/op | 2.78 | | mainnet_e217614 - capella processHistoricalRootsUpdate | 2.0750 us/op | 698.00 ns/op | 2.97 | | mainnet_e217614 - capella processParticipationFlagUpdates | 4.1860 us/op | 1.7280 us/op | 2.42 | | mainnet_e217614 - capella afterProcessEpoch | 329.58 ms/op | 187.65 ms/op | 1.76 | | phase0 processEpoch - mainnet_e58758 | 513.25 ms/op | 339.02 ms/op | 1.51 | | 
mainnet_e58758 - phase0 beforeProcessEpoch | 138.59 ms/op | 86.057 ms/op | 1.61 | | mainnet_e58758 - phase0 processJustificationAndFinalization | 33.405 us/op | 11.171 us/op | 2.99 | | mainnet_e58758 - phase0 processRewardsAndPenalties | 53.938 ms/op | 36.631 ms/op | 1.47 | | mainnet_e58758 - phase0 processRegistryUpdates | 18.338 us/op | 6.1400 us/op | 2.99 | | mainnet_e58758 - phase0 processSlashings | 1.0240 us/op | 763.00 ns/op | 1.34 | | mainnet_e58758 - phase0 processEth1DataReset | 1.0750 us/op | 718.00 ns/op | 1.50 | | mainnet_e58758 - phase0 processEffectiveBalanceUpdates | 4.9836 ms/op | 1.3628 ms/op | 3.66 | | mainnet_e58758 - phase0 processSlashingsReset | 8.8970 us/op | 2.8190 us/op | 3.16 | | mainnet_e58758 - phase0 processRandaoMixesReset | 10.465 us/op | 2.9030 us/op | 3.60 | | mainnet_e58758 - phase0 processHistoricalRootsUpdate | 814.00 ns/op | 654.00 ns/op | 1.24 | | mainnet_e58758 - phase0 processParticipationRecordUpdates | 7.0880 us/op | 2.7200 us/op | 2.61 | | mainnet_e58758 - phase0 afterProcessEpoch | 92.413 ms/op | 61.266 ms/op | 1.51 | | phase0 processEffectiveBalanceUpdates - 250000 normalcase | 2.2452 ms/op | 937.30 us/op | 2.40 | | phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 | 7.4445 ms/op | 1.4955 ms/op | 4.98 | | altair processInactivityUpdates - 250000 normalcase | 20.285 ms/op | 16.347 ms/op | 1.24 | | altair processInactivityUpdates - 250000 worstcase | 18.877 ms/op | 15.919 ms/op | 1.19 | | phase0 processRegistryUpdates - 250000 normalcase | 12.683 us/op | 5.1620 us/op | 2.46 | | phase0 processRegistryUpdates - 250000 badcase_full_deposits | 314.04 us/op | 289.32 us/op | 1.09 | | phase0 processRegistryUpdates - 250000 worstcase 0.5 | 139.80 ms/op | 107.16 ms/op | 1.30 | | altair processRewardsAndPenalties - 250000 normalcase | 40.815 ms/op | 43.265 ms/op | 0.94 | | altair processRewardsAndPenalties - 250000 worstcase | 43.584 ms/op | 36.257 ms/op | 1.20 | | phase0 getAttestationDeltas - 250000 normalcase | 9.3365 
ms/op | 5.8463 ms/op | 1.60 | | phase0 getAttestationDeltas - 250000 worstcase | 8.0951 ms/op | 6.3152 ms/op | 1.28 | | phase0 processSlashings - 250000 worstcase | 112.07 us/op | 80.421 us/op | 1.39 | | altair processSyncCommitteeUpdates - 250000 | 139.00 ms/op | 100.38 ms/op | 1.38 | | BeaconState.hashTreeRoot - No change | 235.00 ns/op | 448.00 ns/op | 0.52 | | BeaconState.hashTreeRoot - 1 full validator | 108.76 us/op | 143.56 us/op | 0.76 | | BeaconState.hashTreeRoot - 32 full validator | 1.7075 ms/op | 1.4851 ms/op | 1.15 | | BeaconState.hashTreeRoot - 512 full validator | 18.744 ms/op | 15.293 ms/op | 1.23 | | BeaconState.hashTreeRoot - 1 validator.effectiveBalance | 177.73 us/op | 140.58 us/op | 1.26 | | BeaconState.hashTreeRoot - 32 validator.effectiveBalance | 2.2769 ms/op | 1.8417 ms/op | 1.24 | | BeaconState.hashTreeRoot - 512 validator.effectiveBalance | 32.727 ms/op | 24.501 ms/op | 1.34 | | BeaconState.hashTreeRoot - 1 balances | 142.68 us/op | 103.70 us/op | 1.38 | | BeaconState.hashTreeRoot - 32 balances | 1.4604 ms/op | 975.06 us/op | 1.50 | | BeaconState.hashTreeRoot - 512 balances | 10.262 ms/op | 9.9586 ms/op | 1.03 | | BeaconState.hashTreeRoot - 250000 balances | 236.26 ms/op | 170.44 ms/op | 1.39 | | aggregationBits - 2048 els - zipIndexesInBitList | 37.502 us/op | 40.991 us/op | 0.91 | | byteArrayEquals 32 | 58.588 ns/op | 45.537 ns/op | 1.29 | | Buffer.compare 32 | 19.772 ns/op | 14.669 ns/op | 1.35 | | byteArrayEquals 1024 | 1.7401 us/op | 1.1987 us/op | 1.45 | | Buffer.compare 1024 | 27.586 ns/op | 22.981 ns/op | 1.20 | | byteArrayEquals 16384 | 27.053 us/op | 19.059 us/op | 1.42 | | Buffer.compare 16384 | 219.56 ns/op | 186.51 ns/op | 1.18 | | byteArrayEquals 123687377 | 207.05 ms/op | 144.50 ms/op | 1.43 | | Buffer.compare 123687377 | 12.803 ms/op | 3.7482 ms/op | 3.42 | | byteArrayEquals 32 - diff last byte | 59.357 ns/op | 45.659 ns/op | 1.30 | | Buffer.compare 32 - diff last byte | 22.456 ns/op | 14.990 ns/op | 1.50 | | 
byteArrayEquals 1024 - diff last byte | 1.7447 us/op | 1.2008 us/op | 1.45 | | Buffer.compare 1024 - diff last byte | 33.399 ns/op | 21.624 ns/op | 1.54 | | byteArrayEquals 16384 - diff last byte | 27.295 us/op | 19.065 us/op | 1.43 | | Buffer.compare 16384 - diff last byte | 298.87 ns/op | 166.93 ns/op | 1.79 | | byteArrayEquals 123687377 - diff last byte | 202.91 ms/op | 144.16 ms/op | 1.41 | | Buffer.compare 123687377 - diff last byte | 8.8015 ms/op | 5.2920 ms/op | 1.66 | | byteArrayEquals 32 - random bytes | 5.9860 ns/op | 4.7280 ns/op | 1.27 | | Buffer.compare 32 - random bytes | 29.824 ns/op | 15.032 ns/op | 1.98 | | byteArrayEquals 1024 - random bytes | 6.7850 ns/op | 4.7320 ns/op | 1.43 | | Buffer.compare 1024 - random bytes | 24.598 ns/op | 14.850 ns/op | 1.66 | | byteArrayEquals 16384 - random bytes | 6.3050 ns/op | 4.7650 ns/op | 1.32 | | Buffer.compare 16384 - random bytes | 18.801 ns/op | 14.833 ns/op | 1.27 | | byteArrayEquals 123687377 - random bytes | 7.8300 ns/op | 7.5300 ns/op | 1.04 | | Buffer.compare 123687377 - random bytes | 23.560 ns/op | 17.610 ns/op | 1.34 | | regular array get 100000 times | 46.806 us/op | 29.502 us/op | 1.59 | | wrappedArray get 100000 times | 43.426 us/op | 29.508 us/op | 1.47 | | arrayWithProxy get 100000 times | 14.334 ms/op | 10.295 ms/op | 1.39 | | ssz.Root.equals | 48.552 ns/op | 43.730 ns/op | 1.11 | | byteArrayEquals | 48.616 ns/op | 43.341 ns/op | 1.12 | | Buffer.compare | 11.225 ns/op | 9.1030 ns/op | 1.23 | | shuffle list - 16384 els | 7.0266 ms/op | 5.4516 ms/op | 1.29 | | shuffle list - 250000 els | 100.26 ms/op | 80.723 ms/op | 1.24 | | processSlot - 1 slots | 14.354 us/op | 13.893 us/op | 1.03 | | processSlot - 32 slots | 3.4117 ms/op | 2.7640 ms/op | 1.23 | | getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei | 45.131 ms/op | 43.182 ms/op | 1.05 | | getCommitteeAssignments - req 1 vs - 250000 vc | 2.2412 ms/op | 1.8163 ms/op | 1.23 | | getCommitteeAssignments - req 100 vs - 250000 vc | 4.3563 
ms/op | 3.5448 ms/op | 1.23 | | getCommitteeAssignments - req 1000 vs - 250000 vc | 4.7134 ms/op | 3.8255 ms/op | 1.23 | | findModifiedValidators - 10000 modified validators | 313.38 ms/op | 231.90 ms/op | 1.35 | | findModifiedValidators - 1000 modified validators | 223.83 ms/op | 153.61 ms/op | 1.46 | | findModifiedValidators - 100 modified validators | 202.98 ms/op | 144.33 ms/op | 1.41 | | findModifiedValidators - 10 modified validators | 238.33 ms/op | 138.04 ms/op | 1.73 | | findModifiedValidators - 1 modified validators | 298.90 ms/op | 129.35 ms/op | 2.31 | | findModifiedValidators - no difference | 281.21 ms/op | 124.53 ms/op | 2.26 | | compare ViewDUs | 3.8588 s/op | 3.2207 s/op | 1.20 | | compare each validator Uint8Array | 1.8521 s/op | 1.6729 s/op | 1.11 | | compare ViewDU to Uint8Array | 1.5219 s/op | 679.23 ms/op | 2.24 | | migrate state 1000000 validators, 24 modified, 0 new | 1.2387 s/op | 834.81 ms/op | 1.48 | | migrate state 1000000 validators, 1700 modified, 1000 new | 1.3509 s/op | 1.0836 s/op | 1.25 | | migrate state 1000000 validators, 3400 modified, 2000 new | 1.7839 s/op | 1.2836 s/op | 1.39 | | migrate state 1500000 validators, 24 modified, 0 new | 1.2185 s/op | 853.07 ms/op | 1.43 | | migrate state 1500000 validators, 1700 modified, 1000 new | 1.6077 s/op | 1.0885 s/op | 1.48 | | migrate state 1500000 validators, 3400 modified, 2000 new | 2.1227 s/op | 1.2930 s/op | 1.64 | | RootCache.getBlockRootAtSlot - 250000 vs - 7PWei | 5.6000 ns/op | 5.9300 ns/op | 0.94 | | state getBlockRootAtSlot - 250000 vs - 7PWei | 745.66 ns/op | 969.78 ns/op | 0.77 | | computeProposers - vc 250000 | 9.5096 ms/op | 6.1903 ms/op | 1.54 | | computeEpochShuffling - vc 250000 | 130.09 ms/op | 75.646 ms/op | 1.72 | | getNextSyncCommittee - vc 250000 | 177.92 ms/op | 102.17 ms/op | 1.74 | | computeSigningRoot for AttestationData | 30.781 us/op | 22.530 us/op | 1.37 | | hash AttestationData serialized data then Buffer.toString(base64) | 2.1211 us/op | 1.1187 us/op | 
1.90 | | toHexString serialized data | 1.8387 us/op | 713.84 ns/op | 2.58 | | Buffer.toString(base64) | 260.80 ns/op | 131.73 ns/op | 1.98 | | nodejs block root to RootHex using toHex | 231.73 ns/op | 103.20 ns/op | 2.25 | | nodejs block root to RootHex using toRootHex | 150.02 ns/op | 68.267 ns/op | 2.20 | | browser block root to RootHex using the deprecated toHexString | 460.36 ns/op | 194.57 ns/op | 2.37 | | browser block root to RootHex using toHex | 326.74 ns/op | 157.38 ns/op | 2.08 | | browser block root to RootHex using toRootHex | 301.26 ns/op | 137.96 ns/op | 2.18 |

by benchmarkbot/action

codecov[bot] commented 1 week ago

Codecov Report

Attention: Patch coverage is 46.15385% with 7 lines in your changes missing coverage. Please review.

Project coverage is 50.82%. Comparing base (cd98c23) to head (0edc38f). Report is 13 commits behind head on unstable.

Additional details and impacted files

```diff
@@             Coverage Diff              @@
##           unstable    #7091      +/-   ##
============================================
- Coverage     50.84%   50.82%   -0.02%
============================================
  Files           597      597
  Lines         39835    39827       -8
  Branches       2069     2060       -9
============================================
- Hits          20256    20244      -12
- Misses        19579    19583       +4
```

twoeths commented 1 week ago

GC has been reduced with this PR. I guess that's because we don't have to convert to string for the map keys, thanks to the default trait implementation in Rust. This is amazing @wemeetagain

this is on a mainnet node

[Image: Screenshot 2024-09-24 at 14 17 49]

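To illustrate the string conversion mentioned above: a JavaScript `Map` compares `Uint8Array` keys by reference, so a pure-JS index over pubkey bytes typically has to derive a string key on every `get` and `set`, and each derived string is a short-lived allocation for the garbage collector. The sketch below is a hypothetical stand-in, not lodestar's previous implementation; the class name `StringKeyedPubkeyIndex` and its hex encoding are assumptions.

```typescript
// Hypothetical string-keyed index, shown only to illustrate the per-call conversion.
class StringKeyedPubkeyIndex {
  private readonly map = new Map<string, number>();

  // Builds a fresh hex string on every call; this is the allocation a byte-keyed native map avoids.
  private static toHex(bytes: Uint8Array): string {
    let hex = "";
    for (let i = 0; i < bytes.length; i++) {
      const b = bytes[i];
      hex += (b < 16 ? "0" : "") + b.toString(16);
    }
    return hex;
  }

  set(pubkey: Uint8Array, index: number): void {
    this.map.set(StringKeyedPubkeyIndex.toHex(pubkey), index);
  }

  get(pubkey: Uint8Array): number | undefined {
    return this.map.get(StringKeyedPubkeyIndex.toHex(pubkey));
  }
}

// A BLS public key is 48 bytes; the contents here are dummy data.
const pubkey = new Uint8Array(48).fill(1);
const sameBytes = new Uint8Array(48).fill(1);

const stringKeyed = new StringKeyedPubkeyIndex();
stringKeyed.set(pubkey, 0);

// Keying a plain Map by the byte array itself does not work for lookups by content.
const referenceKeyed = new Map<Uint8Array, number>();
referenceKeyed.set(pubkey, 0);

console.log(stringKeyed.get(sameBytes), referenceKeyed.get(sameBytes));
```

A native map can hash and compare the key bytes directly, which would be consistent with the lower GC activity reported here.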
wemeetagain commented 5 days ago

Only thing we need to ensure is that @chainsafe/pubkey-index-map supports all the necessary platforms we want to support

twoeths commented 4 days ago

> Only thing we need to ensure is that @chainsafe/pubkey-index-map supports all the necessary platforms we want to support

My understanding is that @chainsafe/pubkey-index-map does not have native dependencies like @chainsafe/blst or @chainsafe/hashtree do, so perhaps we're safe.

I would like @matthewkeil to confirm this. It is the same situation as the napi-rs work we're going to do for the epoch shuffling computation.

matthewkeil commented 1 day ago

> > Only thing we need to ensure is that @chainsafe/pubkey-index-map supports all the necessary platforms we want to support
>
> My understanding is that @chainsafe/pubkey-index-map does not have native dependencies like @chainsafe/blst or @chainsafe/hashtree do, so perhaps we're safe.
>
> I would like @matthewkeil to confirm this. It is the same situation as the napi-rs work we're going to do for the epoch shuffling computation.

Platforms look ok. I would think we do not want to support musl because of the performance aspect, but I'm not sure how important that is relative to having "support for everything"... 🤷‍♂️ I think it's ok as is though.
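As a rough illustration of the platform question, the sketch below maps a platform, architecture and libc flavor to a target name and checks it against a list of prebuilt binaries. Everything here is an assumption: the function names are hypothetical, the naming scheme only imitates the style commonly seen in napi-rs packages, and `HYPOTHETICAL_PREBUILT_TARGETS` is not the actual list published by @chainsafe/pubkey-index-map.

```typescript
// Assumed example list; the real set of published targets is not stated in this thread.
const HYPOTHETICAL_PREBUILT_TARGETS: ReadonlyArray<string> = [
  "darwin-arm64",
  "darwin-x64",
  "linux-arm64-gnu",
  "linux-x64-gnu",
  "win32-x64-msvc",
];

// Hypothetical helper: build a target name from platform, architecture and libc flavor.
function targetName(platform: string, arch: string, isMusl: boolean): string {
  if (platform === "linux") {
    return "linux-" + arch + "-" + (isMusl ? "musl" : "gnu");
  }
  if (platform === "win32") {
    return "win32-" + arch + "-msvc";
  }
  return platform + "-" + arch;
}

// True when a prebuilt binary exists for the given combination in the assumed list.
function hasPrebuiltBinary(platform: string, arch: string, isMusl: boolean): boolean {
  return HYPOTHETICAL_PREBUILT_TARGETS.indexOf(targetName(platform, arch, isMusl)) !== -1;
}

// With the assumed list, glibc Linux is covered while musl (e.g. Alpine) is not.
console.log(hasPrebuiltBinary("linux", "x64", false), hasPrebuiltBinary("linux", "x64", true));
```

In Node.js the first two inputs would come from `process.platform` and `process.arch`. One common way to detect musl is to check whether `process.report.getReport().header.glibcVersionRuntime` is absent, though that check should be verified against the Node.js version in use.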

twoeths commented 22 hours ago

metrics after ~9h of deployment on the unstable mainnet node

[Image: Screenshot 2024-10-01 at 09 18 35]

both heap and gc are improved a little bit