Closed fynnss closed 1 month ago
This idea sounds reasonable, but unfortunately we have to keep the latest 128 states available for serving snap sync.
AFAIK, snapshot data serves snap-sync. Will pathdb also be used? If it will be, maybe the history state can serve it.
Will pathdb also be used?
Yes, we need to construct the range proof for the requested state range via the trie.
maybe the history state can serve it
The state history only records the flat state diffs; the historic trie nodes are not accessible unless we roll the state back using those diffs.
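To illustrate why flat diffs alone cannot serve historic trie nodes, here is a minimal sketch of a rollback over flat state diffs. The `stateHistory` type, its fields, and the `rollback` function are hypothetical simplifications for this thread, not the pathdb data structures. Note that the result is only flat state; the trie nodes for that state would still have to be rebuilt on top of it.

```go
package main

// stateHistory is a hypothetical, simplified history record for one block:
// for every key the block touched, it stores the value the key had BEFORE
// the block was applied. A nil value means the key did not exist.
type stateHistory struct {
	block uint64
	prev  map[string][]byte
}

// rollback reverts state to how it looked right after block target by
// applying the recorded previous values in reverse block order.
// histories must be sorted by ascending block number.
func rollback(state map[string][]byte, histories []stateHistory, target uint64) {
	for i := len(histories) - 1; i >= 0 && histories[i].block > target; i-- {
		for key, old := range histories[i].prev {
			if old == nil {
				delete(state, key) // the key was created by this block
			} else {
				state[key] = old // restore the overwritten value
			}
		}
	}
}
```

Rolling back one block at a time like this is linear in the number of reverted diffs, which is part of why it is not a drop-in replacement for keeping recent trie states in memory.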
Thank you so much. I got it.
The snapshot uses a Bloom filter to shorten the lookup path of getAccount/Storage, but a Bloom filter keyed by node path does not seem very suitable. Have you tried it before?
Yeah, we also thought about using a Bloom filter in a trie database, but we haven't tried it yet.
The reason is that the Bloom filters need to be rebuilt for each layer whenever the dirty cache in "diskLayer" is flushed, so we need to measure the cost against the gain.
Given that the number of trie nodes in the trie database far exceeds the number of states in the state snapshot, we assume the cost of Bloom filter construction is non-trivial.
Besides, we can foresee that the disk layer will be flushed more frequently than the state snapshot (as the dirty cache size is limited), so the overhead of maintaining the Bloom filters could be even higher.
However, I think it would be worthwhile to experiment with it a bit.
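For anyone who wants to experiment, the trade-off can be sketched roughly as follows. This is a toy model, not pathdb or snapshot code: the `bloom` type, its size, the FNV-based hashing, and the names `diffLayer`, `lookup`, `rebloom` and `buildStack` are all assumptions made for illustration. Each diff layer carries an aggregate filter covering itself and every diff layer below it, so a negative answer lets the lookup skip the whole stack; the price is that `rebloom` has to touch every node in every remaining layer after a flush.

```go
package main

import (
	"hash/fnv"
	"strconv"
)

const bloomBits = 1 << 12 // toy size, chosen arbitrarily

type bloom [bloomBits / 64]uint64

// positions derives three bit positions from one FNV-1a hash (double hashing).
func positions(key string) [3]uint32 {
	h := fnv.New64a()
	h.Write([]byte(key))
	sum := h.Sum64()
	h1, h2 := uint32(sum), uint32(sum>>32)
	var out [3]uint32
	for i := range out {
		out[i] = (h1 + uint32(i)*h2) % bloomBits
	}
	return out
}

func (b *bloom) add(key string) {
	for _, p := range positions(key) {
		b[p/64] |= uint64(1) << (p % 64)
	}
}

func (b *bloom) mayContain(key string) bool {
	for _, p := range positions(key) {
		if b[p/64]&(uint64(1)<<(p%64)) == 0 {
			return false
		}
	}
	return true
}

type diffLayer struct {
	nodes  map[string][]byte
	parent *diffLayer // nil means the disk layer comes next
	diffed bloom      // covers this layer and all diff layers below it
}

func newDiffLayer(parent *diffLayer, nodes map[string][]byte) *diffLayer {
	l := &diffLayer{nodes: nodes, parent: parent}
	rebloomOne(l)
	return l
}

// rebloomOne rebuilds one layer's filter from its parent's filter and its own nodes.
func rebloomOne(l *diffLayer) {
	l.diffed = bloom{}
	if l.parent != nil {
		l.diffed = l.parent.diffed // arrays are copied by value
	}
	for path := range l.nodes {
		l.diffed.add(path)
	}
}

// rebloom rebuilds every filter bottom-up and returns how many node paths
// had to be re-inserted, which is the cost paid after each disk flush.
func rebloom(l *diffLayer) int {
	if l == nil {
		return 0
	}
	n := rebloom(l.parent)
	rebloomOne(l)
	return n + len(l.nodes)
}

// lookup reports the node, whether a diff layer held it, and how many diff
// layers were visited. A filter miss returns immediately with zero visits.
func (l *diffLayer) lookup(path string) ([]byte, bool, int) {
	if !l.diffed.mayContain(path) {
		return nil, false, 0
	}
	visited := 0
	for cur := l; cur != nil; cur = cur.parent {
		visited++
		if blob, ok := cur.nodes[path]; ok {
			return blob, true, visited
		}
	}
	return nil, false, visited
}

// buildStack creates n diff layers; layer i (bottom is 0) holds "path-<i>".
func buildStack(n int) *diffLayer {
	var top *diffLayer
	for i := 0; i < n; i++ {
		top = newDiffLayer(top, map[string][]byte{"path-" + strconv.Itoa(i): {byte(i)}})
	}
	return top
}
```

With one node per layer the rebuild cost looks negligible, but it scales with the total number of dirty trie nodes, which is the concern raised above.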
In some test scenarios, the 128 diff layers of PBSS may indeed be a potential bottleneck. Adding a hash cache to the diff layers may be an option. For details, see PR: https://github.com/ethereum/go-ethereum/pull/29991
@will-2012 Sounds more reasonable than a Bloom filter, with less overhead and better overall performance.
Rationale
Why should this feature exist?
In the existing implementation, pathdb keeps 128 diff layers by default, which facilitates fast in-memory rollback. However, the more diff layers there are, the deeper the call depth and the worse the performance of getting a node.
Using the finalized block as the difflayer/disklayer boundary in pathdb can reduce the number of diff layers in memory and improve the efficiency of getting a node.
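The cost described here can be shown with a small model. This is only a sketch under simplifying assumptions: the `layer` type and the `stack` and `node` functions are hypothetical stand-ins, not the pathdb implementation, and the shorter stack of 2 layers is an arbitrary number picked for comparison.

```go
package main

// layer is a hypothetical stand-in for a pathdb layer. A nil parent marks
// the disk layer at the bottom of the stack.
type layer struct {
	nodes  map[string][]byte
	parent *layer
}

// stack puts n empty diff layers on top of disk and returns the topmost one.
func stack(disk *layer, n int) *layer {
	top := disk
	for i := 0; i < n; i++ {
		top = &layer{nodes: map[string][]byte{}, parent: top}
	}
	return top
}

// node walks down from l until some layer holds path. It returns the node
// and the number of layers that had to be skipped before it was found.
func (l *layer) node(path string) ([]byte, int) {
	hops := 0
	for cur := l; cur != nil; cur = cur.parent {
		if blob, ok := cur.nodes[path]; ok {
			return blob, hops
		}
		hops++
	}
	return nil, hops
}
```

A node that only lives in the disk layer costs 128 skipped layers with the default stack, versus 2 with a stack of 2 diff layers, which is the gap the proposal aims to close for nodes that have not been modified recently.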
pprof:
Implementation
Do you have ideas regarding the implementation of this feature?
Are you willing to implement this feature?