perf(continuations): Reuse blobs in segment `RlpRaw`

0xPolygonZero / zk_evm

Apache License 2.0


perf(continuations): Reuse blobs in segment `RlpRaw` #452

Closed hratoanina closed 1 week ago

hratoanina commented 1 month ago

One of the biggest segments in memory is RlpRaw (during MPT hashing), which is used to encode MPT nodes in RLP format. This segment grows quite large, both in terms of Rust size (as a very big sparse vector) and in terms of Memory rows (since previous nodes are kept forever in memory in their blob). By changing the MPT hashing functions in the kernel, we could reuse the same blob by overwriting its content when we can, which would make the segment much smaller. This would make segment generation quicker and make MemBefore and MemAfter smaller.

Nashtare commented 1 month ago

Ah yeah that's nice, I believe we're already applying a similar logic by overwriting the txn portion of the RlpRaw segment when performing a multi-txn batch.

How would the reuse mechanism work though? Are you thinking of only overwriting the RlpRaw segment, or of a more efficient approach that reuses encoded nodes that could, for instance, appear in both the initial and final tries? I'm not sure how the flagging would work for the latter.

hratoanina commented 1 month ago

The goal is, if possible, to get rid of blobs altogether and to only use the beginning of the segment for every encoding. I added "when we can" to keep the option open of having an intermediate approach where we keep some blobs, but not as many as currently (currently: basically one blob per node).
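
To make the size difference concrete, here is a minimal Rust sketch of the two strategies discussed above. It is a toy model, not the kernel code: `Segment`, `encode_in_fresh_blob`, `encode_reusing_start` and `segment_len_after` are hypothetical names, and it assumes each encoding has already been consumed (e.g. hashed) before the next one overwrites it, which the real MPT hashing functions would have to guarantee.

```rust
/// Toy model of a memory segment: a vector that grows up to the highest
/// address ever written. All names here are hypothetical.
struct Segment {
    cells: Vec<u8>,
    /// Bump pointer used by the "one blob per node" strategy.
    next_free: usize,
}

impl Segment {
    fn new() -> Self {
        Segment { cells: Vec::new(), next_free: 0 }
    }

    /// Write `data` at `start`, growing the segment if needed.
    fn write_at(&mut self, start: usize, data: &[u8]) {
        let end = start + data.len();
        if self.cells.len() < end {
            self.cells.resize(end, 0);
        }
        self.cells[start..end].copy_from_slice(data);
    }

    /// One blob per node: every encoding gets fresh space and is never freed.
    /// Returns the start offset of the blob.
    fn encode_in_fresh_blob(&mut self, data: &[u8]) -> usize {
        let start = self.next_free;
        self.write_at(start, data);
        self.next_free += data.len();
        start
    }

    /// Reuse: every encoding overwrites the beginning of the segment.
    /// Assumes the previous encoding is no longer needed.
    fn encode_reusing_start(&mut self, data: &[u8]) -> usize {
        self.write_at(0, data);
        0
    }

    fn len(&self) -> usize {
        self.cells.len()
    }
}

/// Encode all `nodes` with the chosen strategy and return the final segment length.
fn segment_len_after(nodes: &[Vec<u8>], reuse: bool) -> usize {
    let mut segment = Segment::new();
    for node in nodes {
        if reuse {
            segment.encode_reusing_start(node);
        } else {
            segment.encode_in_fresh_blob(node);
        }
    }
    segment.len()
}
```

In this model, the fresh-blob strategy ends with a segment as long as the sum of all encodings, while the reuse strategy ends with a segment as long as the largest single encoding. An intermediate approach that keeps some blobs would land between the two.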