ethereum / go-ethereum

Go implementation of the Ethereum protocol
https://geth.ethereum.org
GNU Lesser General Public License v3.0

Get rid of snap sync heal phase #27692

Open · lightclient opened this issue 1 year ago

lightclient commented 1 year ago

Although snap sync is great, there are a few classes of users that I don't think it serves best:

1. Ethereum users with poor internet but good hardware -- for these users it will take a while for the healing phase to complete.
2. Ethereum power-users who need to sync new nodes and have powerful hardware capable of quickly recomputing state -- yes, healing will be relatively fast, but likely not as fast as fully executing the blocks from a recent state checkpoint.
3. Ethereum clones with high gas limits -- some of these churn state so much that most hardware / networks are insufficient to ever complete healing (#25965).

For these reasons, it might be interesting to allow users to opt in to some type of warp-sync-like scheme built on snap.

The general idea would be to have a way to tell geth to stop at certain intervals to "preserve" the snap layers for that block, and also to allow clients to pass a specific node ID to snap sync from, avoiding pivots to a later block (so long as the target continues responding).
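For illustration only, here is a rough sketch of what such opt-in knobs could look like as CLI flags. The flag names (`snap.freeze.interval`, `snap.pin.enode`) and their semantics are hypothetical, not existing geth options; the only assumption taken from reality is that go-ethereum handles flags via urfave/cli.

```go
package main

// Hypothetical opt-in knobs for a warp-sync-like mode built on snap. Neither
// flag exists in geth today; this is only a sketch of the proposed UX.

import (
	"fmt"
	"log"
	"os"

	"github.com/urfave/cli/v2"
)

func main() {
	app := &cli.App{
		Name: "geth-sketch",
		Flags: []cli.Flag{
			&cli.Uint64Flag{
				// Pause every N blocks and keep serving that block's snap layers.
				Name:  "snap.freeze.interval",
				Usage: "blocks between freeze points whose snap layers are preserved (hypothetical)",
				Value: 50000,
			},
			&cli.StringFlag{
				// Pin snap sync to one peer and skip pivot moves while it responds.
				Name:  "snap.pin.enode",
				Usage: "enode URL of a fixed peer to snap sync from (hypothetical)",
			},
		},
		Action: func(ctx *cli.Context) error {
			fmt.Printf("freeze every %d blocks, pinned peer %q\n",
				ctx.Uint64("snap.freeze.interval"), ctx.String("snap.pin.enode"))
			return nil
		},
	}
	if err := app.Run(os.Args); err != nil {
		log.Fatal(err)
	}
}
```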

joohhnnn commented 1 year ago

Like downloading the state at block X, then executing the blocks from X up to the latest block, instead of healing?
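For reference, the flow being asked about, sketched over hypothetical `StateSource` and `Chain` interfaces (these are illustrative stand-ins, not geth internals): snap-download the complete state at a checkpoint X, then re-execute every block from X+1 up to the head instead of healing.

```go
// Package checkpointsync sketches the "sync to a checkpoint, then execute
// forward" idea. StateSource and Chain are illustrative stand-ins, not real
// go-ethereum interfaces.
package checkpointsync

import "fmt"

type StateSource interface {
	DownloadState(block uint64) error // snap-download the full state at `block`
}

type Chain interface {
	Head() uint64               // current chain head known locally
	Execute(block uint64) error // run the block's transactions on local state
}

// CheckpointSync downloads the state at `checkpoint` and then executes all
// blocks up to the head (which keeps moving in reality, hence the re-check
// of Head on every iteration).
func CheckpointSync(src StateSource, chain Chain, checkpoint uint64) error {
	if err := src.DownloadState(checkpoint); err != nil {
		return err
	}
	for n := checkpoint + 1; n <= chain.Head(); n++ {
		if err := chain.Execute(n); err != nil {
			return fmt.Errorf("execute block %d: %w", n, err)
		}
	}
	return nil
}
```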

karalabe commented 1 year ago

It's not really simple to do. The snapshots are constantly moving with the chain. There's no way to freeze the snapshots, you'd need to stop and create a full copy of the entire 50-100GB thing, which can take arbitrarily long. You'd also need a new database "location" to store these frozen snapshots into.

lightclient commented 1 year ago

I think the idea would be more to pause the client at a certain height, so that it can continue to serve that height.

holiman commented 1 year ago

> The general idea would be to have a way to tell geth to stop at certain intervals to "preserve" the snap layers for that block, and also to allow clients to pass a specific node ID to snap sync from, avoiding pivots to a later block (so long as the target continues responding).

Basically, you could have two geth nodes: each would pause at regular intervals and keep serving the snap layers for the block it paused at, taking turns so that one is always available.

And clients would sync against one of them. This would more or less "become" warp sync, with a minimal heal phase after the sync finished. But there are a lot of problems with this approach.

Seems to me like there are a lot of small problems all over that would currently make an attempt like this a UX nightmare. And in the end, it would all culminate in it being possible/easy for someone to set up a centralized service to serve everyone else, which typically doesn't work well in the long term, since most parties would rather earn money than spend it.

I'm not sure what actionable parts exist, but we can keep the discussion open for a bit, I guess.

karalabe commented 1 year ago

We've chatted a bit about this today at stabby and had the following ideas / steps:

First up, it is definitely possible even now to have a number of rotating seed nodes to sync from internally within an org, which pause every 6 hours for a day and then resume. In between, new nodes can be spun up and pointed at those. That kind of works, but it is not useful in a public setting, since the seeding org would need to foot the bandwidth bill and would become a bottleneck for everyone.
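One reading of that schedule, sketched below with illustrative numbers (four seeds, a new one freezing every 6 hours, each staying frozen for a day), just to show how a syncing node would always find a recently frozen seed:

```go
package main

// Sketch of the rotating-seed-nodes idea: seeds take turns freezing (pausing
// chain sync) so that at any moment there is a seed serving a fixed state
// that is at most one rotation period old. Pool size and durations are
// illustrative assumptions, not anything geth implements.

import (
	"fmt"
	"time"
)

const (
	rotationPeriod = 6 * time.Hour  // a new seed freezes every 6 hours
	freezeDuration = 24 * time.Hour // each seed stays frozen for a day
	numSeeds       = int(freezeDuration / rotationPeriod)
)

// frozenSeed returns the index of the seed that froze most recently at time t,
// i.e. the one serving the freshest fixed state.
func frozenSeed(t time.Time) int {
	slot := t.Unix() / int64(rotationPeriod/time.Second)
	return int(slot % int64(numSeeds))
}

func main() {
	now := time.Now()
	fmt.Printf("sync against seed #%d (frozen within the last %s)\n",
		frozenSeed(now), rotationPeriod)
}
```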

An alternative proposal would be to drop the healing phase in favor of a witness request, where upon every pivot change we would fetch a witness to "fix up" the previously downloaded data.

Where this gets complicated is that if multiple requests are needed to serve the witness (due to its size), we need to transmit range proofs to verify the fixed-up subtries; and since account/storage trie segments are downloaded concurrently, we also need range proofs corresponding to those regions.
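Purely as a sketch of the shape such an extension could take (none of these message types exist in the snap protocol, and every name below is an assumption), a chunked witness request/response pair might look roughly like this:

```go
// Package snapwitness sketches hypothetical wire messages for a witness-based
// replacement of the heal phase: on each pivot move, request the trie nodes
// touched between the old and new pivot, restricted to the account range
// already downloaded, together with a range proof so the fixed-up subtrie can
// be verified. Not part of the real snap/1 protocol.
package snapwitness

import "github.com/ethereum/go-ethereum/common"

// GetWitnessPacket asks a peer for the state changes between two pivots,
// limited to a contiguous account range the requester already holds.
type GetWitnessPacket struct {
	ID        uint64      // request id
	FromRoot  common.Hash // state root of the previous pivot
	ToRoot    common.Hash // state root of the new pivot
	Origin    common.Hash // first account hash of the already-synced range
	Limit     common.Hash // last account hash of the already-synced range
	ByteLimit uint64      // soft response cap so large witnesses can be split
}

// WitnessPacket carries one chunk of the witness plus the boundary proof
// needed to verify that the returned nodes really cover [Origin, Limit].
type WitnessPacket struct {
	ID       uint64
	Nodes    [][]byte // RLP-encoded trie nodes touched between the two pivots
	Proof    [][]byte // range proof nodes for the served account segment
	Complete bool     // false if more chunks follow for this request
}
```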

All in all it's complicated, but probably worth it. @rjl493456442 will post some numbers to see how much bandwidth it would entail to fix up a single pivot move.

The open question, however, is what happens if someone stops their node for longer than the state retention period (128 blocks), as then no node in the network would be able to serve a witness from a "very old" pivot to a new one. In that case we might still need to fall back to healing, which would 1) prevent us from getting rid of it and 2) hit exactly those people who have already been hit by healing (low bandwidth / high latency).
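The fallback condition itself would be trivial; a minimal sketch using the 128-block retention figure from above (function and variable names are illustrative, not geth internals):

```go
package main

import "fmt"

// needsHealing sketches the fallback decision: if the gap between the pivot we
// last synced against and the current head exceeds the state retention window,
// no peer can serve a witness spanning it, so trie healing is still required.
const stateRetention = 128 // blocks of recent state that peers keep around

func needsHealing(lastPivot, currentHead uint64) bool {
	return currentHead-lastPivot > stateRetention
}

func main() {
	fmt.Println(needsHealing(17_000_000, 17_000_064)) // false: a witness fix-up suffices
	fmt.Println(needsHealing(17_000_000, 17_000_500)) // true: gap exceeds retention, heal
}
```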

rjl493456442 commented 1 year ago

For a single pivot move (64 blocks), we need to fill in the state changes of those blocks. I dumped them out from a mainnet node; it turns out:

After aggregating them together

yilongli commented 1 year ago

What is wrong with the following strawman solution?

Suppose each node records the state diffs of the last N blocks as a log. The healing process could then be implemented simply as fetching a range of state diffs (the starting position depends on the oldest piece of data received during snap sync) and applying them. This approach seems much simpler and more efficient than interactively exploring the state trie between two peers.

The state diffs can be compressed quite effectively, so each node can afford to keep a long enough state-diff history with just a fraction of the storage taken by the state itself. For example, it takes about 22 hours to download 100GB of state over a 10Mbps connection. OTOH, 48 hours of state diffs are only a couple hundred MB after removing duplicate writes to identical locations (and one could probably compress them further with various tricks).
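A minimal sketch of the strawman, assuming a hypothetical per-block diff record keyed by location (real diffs would carry accounts, storage slots, code and deletions): replaying the retained range while keeping only the last write per location is all the "healing" that would remain, and the last-write-wins property is also why the aggregated log stays small.

```go
package main

import "fmt"

// diff is a hypothetical per-block state diff: location (account/slot key) to
// new value. A real implementation would distinguish accounts, storage slots,
// code and deletions; this is only the shape of the idea.
type diff map[string][]byte

// diffLog retains the diffs of the last N blocks.
type diffLog struct {
	first  uint64 // oldest block whose diff is still retained
	blocks []diff // blocks[i] holds the diff for block first+i
}

// replay flattens all diffs from block `from` to the newest retained block
// into a single map with only the final value per location, ready to be
// written on top of the snap-downloaded state.
func (l *diffLog) replay(from uint64) (map[string][]byte, error) {
	if from < l.first {
		return nil, fmt.Errorf("diff for block %d already pruned (oldest is %d)", from, l.first)
	}
	out := make(map[string][]byte)
	for i := from - l.first; i < uint64(len(l.blocks)); i++ {
		for loc, val := range l.blocks[i] {
			out[loc] = val // last write wins
		}
	}
	return out, nil
}

func main() {
	log := &diffLog{first: 100, blocks: []diff{
		{"acct1.slotA": {1}},
		{"acct1.slotA": {2}, "acct2.slotB": {9}},
	}}
	flat, _ := log.replay(100)
	fmt.Println(flat) // map[acct1.slotA:[2] acct2.slotB:[9]]
}
```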

I am not familiar with the history of snap sync, so please forgive me if I am overlooking something obvious. Thanks!