ethereum / portal-network-specs

Official repository for specifications for the Portal Network

Using epochs to eliminate hot-spots for path-aware content-ids for state storage. #289

Closed pipermerriam closed 5 months ago

pipermerriam commented 7 months ago

Previously we explored the concept of making content-ids in the state network "path aware":

https://ethresear.ch/t/distributing-ethereum-state-over-portal-network/17882

This idea was eventually discarded due to problems with it creating hot spots in the network.

I'm curious now whether it is possible to smooth out these hot spots. The high-level idea is to use an epoch-based solution that continually shifts the storage location in the DHT key-space, which would place firm upper bounds on how much data is located in any given hot spot.

It's not currently clear exactly how this keying solution would work, because at present we key on the hash of the trie node. The new solution would need to take the block number into account. The two keying approaches seem to be incompatible, since a trie node can exist across any number of state roots....
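
To make the idea concrete, here is a minimal sketch of one possible reading of "path aware plus epoch shift". Nothing in it comes from the spec: `EPOCH_LENGTH`, `path_location`, `epoch_offset`, and `content_id` are hypothetical names, and the choice of a SHA-256-derived offset per epoch is an assumption made purely for illustration.

```python
import hashlib

KEY_SPACE = 2**256      # size of the DHT key-space
EPOCH_LENGTH = 8192     # hypothetical epoch size, in blocks


def path_location(nibbles: list[int]) -> int:
    """Left-align a trie path (up to 64 nibbles) in the 256-bit key-space,
    so that paths sharing a prefix land close together."""
    value = 0
    for nibble in nibbles:
        value = (value << 4) | nibble
    return value << (256 - 4 * len(nibbles))


def epoch_offset(epoch: int) -> int:
    """Hypothetical per-epoch shift: a pseudo-random point in the key-space."""
    digest = hashlib.sha256(epoch.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def content_id(nibbles: list[int], block_number: int) -> int:
    """Path-aware location, shifted by an offset that changes every epoch."""
    epoch = block_number // EPOCH_LENGTH
    return (path_location(nibbles) + epoch_offset(epoch)) % KEY_SPACE
```

In this sketch the relative distance between two paths is preserved within an epoch, while the absolute location of a busy region moves each epoch. It also shows the incompatibility mentioned above: the id depends on the block number, so an unchanged trie node no longer has a single key.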

morph-dev commented 7 months ago

I have been thinking about exactly this for quite some time, but I didn't manage to make it work.

Technically, it's possible to anchor content based on its path and "epoch" (not its hash). However, some (most likely many) trie nodes live across many epochs, and we would have to replicate them even if they don't change. In fact, every trie node (except the ones that change exactly at the epoch boundary) would be at least duplicated. This would probably end up needing more space than the current naive approach. The problem can be mitigated to a degree by increasing the "epoch" window, but that makes the "hot-spot" issue worse.

The ideal solution would be to spread them out, not by epoch, but by how frequently they are updated: some kind of dynamic "epoch" that can be different for every trie node (there is still some relationship, e.g. a parent node is updated at least as frequently as any of its children). But I didn't find a way to do it.
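
A rough way to see the replication cost, as a sketch under assumed numbers: if a node is stored once per epoch in which it is part of the state, the number of copies is the number of epochs its lifetime overlaps. The function name `copies_needed` and the epoch lengths used below are hypothetical.

```python
def copies_needed(first_block: int, last_block: int, epoch_length: int) -> int:
    """Number of epochs that the inclusive block range
    [first_block, last_block] overlaps, i.e. how many times an
    unchanged trie node would be stored under per-epoch keying."""
    if last_block < first_block:
        raise ValueError("last_block must not precede first_block")
    return last_block // epoch_length - first_block // epoch_length + 1


# A node that stays unchanged for one million blocks:
short_epochs = copies_needed(0, 999_999, 8_192)      # many copies
long_epochs = copies_needed(0, 999_999, 262_144)     # far fewer copies
```

Longer epochs reduce the number of copies, but each epoch then pins more data to the same region of the key-space, which is the trade-off described above.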

pipermerriam commented 7 months ago

Ok, I think that if we drop the idea of using this for all of history, and instead treat HEAD state and archive state separately, this just works (which is also in line with the original designs for the state network).

The original plan was to store two separate copies of the state.

pipermerriam commented 5 months ago

Closing this as there is nothing actionable at this time.