Closed twoeths closed 1 year ago
Unrelated but important: I noticed that in 0525_lg1k_v1.1.1.cpuprofile.zip, afterProcessEpoch runs 3 times between 250500 ms and 255500 ms (a ~5 s range).
v1.2.0 processes more attestations: the number of aggregateAttestationInto calls is 3.3x that of v1.1.
v1.1.0
Both nodes receive the same number of valid attestations per second, so it could be that v1.2.0 receives more attestations in the last 3 slots, which causes us to do the preaggregation 3x as often.
@tuyennhv I've been looking into how to improve this:
- We cannot cache "de-serialized" signatures since they are sent to the Workers parsed
- The only trade-off we can make, I think, is to aggregate the signatures lazily at aggregation time. WIP at https://github.com/ChainSafe/lodestar/compare/dapplion/aggregate-pool-fromBytes
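To make the lazy-aggregation idea concrete, here is a minimal sketch. It is not the code on the WIP branch: `LazyAggregatePool`, `SignatureOps`, and the toy signature operations are hypothetical names used only for illustration. The pool stores raw signature bytes when an attestation arrives and only pays for `fromBytes` when an aggregate is actually requested.

```typescript
// Hypothetical sketch: none of these names are Lodestar's actual API.
type SigBytes = Uint8Array;

// Injected signature operations, so the sketch does not depend on a BLS library.
interface SignatureOps<Sig> {
  fromBytes(bytes: SigBytes): Sig; // the expensive deserialization step
  aggregate(sigs: Sig[]): Sig;
}

class LazyAggregatePool<Sig> {
  private readonly pending = new Map<string, SigBytes[]>();
  private readonly ops: SignatureOps<Sig>;

  constructor(ops: SignatureOps<Sig>) {
    this.ops = ops;
  }

  // On arrival: keep the serialized signature, do no deserialization.
  add(dataRoot: string, signature: SigBytes): void {
    const list = this.pending.get(dataRoot);
    if (list) list.push(signature);
    else this.pending.set(dataRoot, [signature]);
  }

  // At aggregation time: deserialize everything once, then aggregate.
  getAggregate(dataRoot: string): Sig | null {
    const list = this.pending.get(dataRoot);
    if (!list || list.length === 0) return null;
    return this.ops.aggregate(list.map((b) => this.ops.fromBytes(b)));
  }
}

// Toy stand-in for real signatures: a "signature" is just its first byte.
let fromBytesCalls = 0;
const toyOps: SignatureOps<number> = {
  fromBytes: (bytes) => {
    fromBytesCalls++;
    return bytes[0] ?? 0;
  },
  aggregate: (sigs) => sigs.reduce((a, b) => a + b, 0),
};

const pool = new LazyAggregatePool(toyOps);
pool.add("0xabc", Uint8Array.of(1));
pool.add("0xabc", Uint8Array.of(2));
pool.add("0xabc", Uint8Array.of(3));
const callsBeforeAggregation = fromBytesCalls; // nothing deserialized yet
const aggregate = pool.getAggregate("0xabc");
```

The trade-off is that the deserialization cost is not removed, only deferred and skipped entirely for attestation data that is never aggregated.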
This could also help with the I/O lag issue a bit, since doing the preaggregation at 0 - 1/3 of the slot would make us busier. Let's try doing the aggregation at 2/3 of the slot 👍
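As a rough illustration of the timing, the delay until the next 2/3-of-slot mark could be computed as below. This is a sketch under assumptions: `msUntilAggregation` is a hypothetical helper, not an existing Lodestar function, and the 12-second slot is the Ethereum mainnet value.

```typescript
// Hypothetical helper, not part of Lodestar.
const SECONDS_PER_SLOT = 12; // Ethereum mainnet slot duration

// Milliseconds from `nowMs` until the next point that is 2/3 into a slot.
function msUntilAggregation(genesisTimeSec: number, nowMs: number): number {
  const slotMs = SECONDS_PER_SLOT * 1000;
  const msSinceGenesis = nowMs - genesisTimeSec * 1000;
  // Normalize so the result stays in [0, slotMs) even before genesis.
  const msIntoSlot = ((msSinceGenesis % slotMs) + slotMs) % slotMs;
  const targetMs = (slotMs * 2) / 3;
  return msIntoSlot <= targetMs
    ? targetMs - msIntoSlot
    : slotMs - msIntoSlot + targetMs;
}
```

A caller could pass the result to `setTimeout` so that deserialization and aggregation run once, late in the slot, instead of on every incoming attestation.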
With multistream-select 3.1.1, p2p seems to be improved and we receive so many attestations that fromBytes takes 22% of CPU time.
0111_multi_stream_select_3.1.1_lg1k.cpuprofile.zip
Maybe attestations are aggregated too often, which causes fromBytes to run so frequently; then our peers fall out of the mesh and the aggregation rate drops.
@tuyennhv Closing for now as https://github.com/ChainSafe/lodestar/pull/4838 should make fromBytes calls much cheaper. Once you do another CPU profile in the future, please confirm.
Describe the bug
With v1.2.0, on a node with 1000 keys, fromBytes() takes more than 8% of CPU time.
1025_lg1k_chacha20poly1305_no_mem_alloc.cpuprofile.zip
Expected behavior
In v1.1.1, it takes only around 1% with the same type of node
0525_lg1k_v1.1.1.cpuprofile.zip