kdeme opened this issue 2 months ago
Looks like storing all 8192 of the most recent headers is about 5mb.
Seems we could support range queries by having a separate content key that:

- `content-id` as the header hash it is anchored to.
- `ancestor_depth` field that is used to specify how many additional historical headers the client wants.

In a network situation where clients are expected to store most or all of the recent headers, this could be used to quickly acquire all of the most recent headers.
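To make that shape concrete, here is a rough sketch of what such a content key and the lookup it enables could look like; the selector byte, the 32-byte anchor hash, the 2-byte depth encoding, and the header representation are all assumptions for illustration, not proposed encodings:

```python
from dataclasses import dataclass

RECENT_HEADERS_SELECTOR = 0x0E  # hypothetical selector byte, not from the spec


@dataclass
class RecentHeadersKey:
    anchor_hash: bytes   # header hash the request is anchored to
    ancestor_depth: int  # how many additional historical headers are wanted

    def encode(self) -> bytes:
        # selector ++ 32-byte anchor hash ++ 2-byte little-endian depth
        assert len(self.anchor_hash) == 32
        return (bytes([RECENT_HEADERS_SELECTOR])
                + self.anchor_hash
                + self.ancestor_depth.to_bytes(2, "little"))


def collect_headers(key: RecentHeadersKey, by_hash: dict[bytes, dict]) -> list[dict]:
    """Serve the anchor header plus up to ancestor_depth ancestors by walking
    parent hashes through a local hash -> header map."""
    result, cursor = [], key.anchor_hash
    while cursor in by_hash and len(result) <= key.ancestor_depth:
        header = by_hash[cursor]
        result.append(header)
        cursor = header["parent_hash"]
    return result
```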
One way to do the content-id for recent headers might be to have the derived content-id be based on the block number. It takes 13 bits for 8192 values, so we could have a static 13 bits based on `block_height % 8192`, so that blocks at certain heights always had the same most significant 13 bits, and then the remaining bits are random from `sha256(block_hash)`.

If my intuition is correct, this would result in two blocks that are close to each other in height also being close to each other in the network, making it somewhat easy to navigate the network to grab sequential blocks from the recent set.
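A small sketch of that derivation, assuming 256-bit content-ids (13 prefix bits from the height bucket plus 243 bits from the hash); this only illustrates the bucketing idea:

```python
import hashlib


def recent_header_content_id(block_number: int, block_hash: bytes) -> int:
    # The 13 most significant bits come from block_number % 8192,
    # the remaining 243 bits come from sha256(block_hash).
    prefix = block_number % 8192
    digest = int.from_bytes(hashlib.sha256(block_hash).digest(), "big")
    return (prefix << 243) | (digest & ((1 << 243) - 1))


# Blocks at nearby heights get nearby 13-bit prefixes, which is the intuition
# behind being able to walk the network for sequential recent blocks.
id_a = recent_header_content_id(20_000_000, b"\xaa" * 32)
id_b = recent_header_content_id(20_000_001, b"\xbb" * 32)
print(id_a >> 243, id_b >> 243)
```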
If we said that all nodes had to store all headers, it would introduce the first real baseline "requirement" for nodes in our network, meaning that they'd be required to both store ~5mb of data and continually acquire new headers as they entered the network.
My initial gut says that I don't like introducing this as a requirement and that maybe I'd rather it be optional. Here are some ideas:

- A `custom_data` field in our Ping/Pong message which indicates how many recent headers you store.
- Content keys by both `hash` and `number` so that we can fetch them sequentially by hash or in bulk using the block number.

The use cases I'm thinking we want to support are:
> A Portal node that is running the Portal beacon network and has a beacon light client synced will also have access to these headers as they are part of the `LightClientHeader` since Capella (albeit in a different serialization): https://github.com/ethereum/consensus-specs/blob/7cacee6ad64483357a7332be6a11784de1242428/specs/capella/light-client/sync-protocol.md?plain=1#L52
Are we going to store in the db the recent 8192 `ExecutionPayloadHeader`s and provide those on request within the current period? I'm not sure what exactly the difference is between an `ExecutionPayloadHeader` and an EL header, but are we going to miss some important data fields if we provide only `ExecutionPayloadHeader`s for the last ~27 hours?
> My initial gut says that I don't like introducing this as a requirement and that maybe I'd rather it be optional.
We are already doing this by storing all bootstraps and `LightClientUpdate`s for the weak subjectivity period (~4 months), but I agree that it is better to make this new requirement optional.
> We are already doing this by storing all bootstraps and `LightClientUpdate`s for the weak subjectivity period (~4 months) but I agree that it is better to make this new requirement optional.
Is this opt-in? Is it assumed that you can request any of these from any node on the network? Are nodes on the network expected to sync the last 27 hours of these when they come online? I think this is what I mean by making it optional.
It's very different for a client to choose to store the last 27 hours of these versus clients being expected to have the last 27 hours of these, with functionality in the network built on the assumption that they can be requested from any node on the network.
> Is this opt-in? Is it assumed that you can request any of these from any node on the network?
For `LightClientBootstrap`s, we currently expect all clients to store all bootstraps. The idea is to make this content available and provide all trusted block roots in our network for the last ~4 months. The user can then choose any trusted block root as a starting point to sync the light client.

Regarding `LightClientUpdate`s, we push them into the network at the end of every period (~27 hours) and expect every node to store them. This is a requirement for the light client sync protocol to jump from the trusted block root to the current period and start following the chain.
So I think everything depends on what kind of flexibility we want to provide for the end user for choosing their starting point to sync the light client.
> Use some scheme like https://github.com/ethereum/portal-network-specs/issues/336#issuecomment-2358486268 to give recent headers a predictable address in the network.
Doesn't this idea introduce a roving hot spot in the network (on the assumption that recent chain history is the most popular), centered around the nodes "nearest" to the current 13 most significant bits of the content-ids? I get conceptually why it's convenient for retrieval purposes, but it feels like it could end up being a DDoS vector once we get an uptick in the "very light" clients you mentioned that regularly drop in and out and end up hammering the same set of nodes looking for the head of the chain.
Good catch on the hot spots. Possibly we could eliminate the hotspot by expecting all nodes to store the latest 256 headers, and then stripe the rest around the network. Grabbing 256 headers in one request from a close neighbor should be pretty trivial in terms of bandwidth costs.
Still needs to be decided if the striping approach actually fixes a real problem and is necessary...
If we decide that nodes SHOULD store 8192 recent headers, then I think the simple solution that is currently done for the `LightClientUpdate` type in the Portal beacon network could also work here:
- One `LightClientUpdate` is 26813 bytes.
- A request currently allows a range of 128 -> ~3.5MB over 1 request is possible.
- But not all nodes will have the full 4-month range; it depends on when the node was bootstrapped.
- Instead of neighborhood requests, random requests are done -> some might fail, but that should be fine.
- Random requests mean that the `ContentId` is practically not used (not using the DHT's main feature here).
- The content key is by `start_period` + `amount` (see the sketch after this list).
- Within Fluffy this is stored with the `period` as index, as it only needs to be retrieved by period. Pruning is easy as we can just delete anything with a period older than x.
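For reference, a minimal sketch of that range scheme as I understand it (a `start_period` + count key, values indexed by period, pruning by period); the names and types here are illustrative, not the actual beacon network definitions:

```python
from dataclasses import dataclass


@dataclass
class UpdatesByRangeKey:
    start_period: int
    count: int  # currently capped at 128 per request in the beacon network


# period -> SSZ-encoded LightClientUpdate (placeholder for the real storage)
store: dict[int, bytes] = {}


def get_range(key: UpdatesByRangeKey) -> list[bytes]:
    periods = range(key.start_period, key.start_period + key.count)
    return [store[p] for p in periods if p in store]


def prune(oldest_kept_period: int) -> None:
    # pruning is trivial because the period is the index
    for p in [p for p in store if p < oldest_kept_period]:
        del store[p]
```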
This does not mean that every node that comes online needs to store all this data immediately. As long as "ephemeral" nodes are not a massive majority, this should be fine I think.
(Even in the case of localized storage/access this could/would still be an issue, but to a lesser extent)
The idea from @pipermerriam of advertising the amount of headers stored in `ping.custom_data` would also help here.
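Something like the following, where the existing radius payload is extended with a header count; the field names and fixed-width layout are assumptions for illustration only:

```python
from dataclasses import dataclass


@dataclass
class HistoryCustomData:
    data_radius: int          # uint256 radius, as the payload already carries
    recent_header_count: int  # 0..8192, how many recent headers this node stores

    def encode(self) -> bytes:
        return (self.data_radius.to_bytes(32, "little")
                + self.recent_header_count.to_bytes(2, "little"))


payload = HistoryCustomData(data_radius=2**256 - 1, recent_header_count=8192)
print(payload.encode().hex())
```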
When a node stores all ~5MB of recent headers, it could store the headers in a specialized table so that it can easily retrieve them via block number and via block hash (e.g. primary key hash, index on number or similar ways).
The protocol could then provide two different content keys, but they access the same data on the nodes. And the content keys could/would support ranges.
Pruning could work by dropping block numbers older than x. (There is a slight discrepancy between block number and slot, as not every slot necessarily has a block, so perhaps it is a little more complex, but then again it doesn't need to be exactly the last 8192 slots, I think.)
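A minimal sqlite sketch of such a table (all names made up): primary key on the hash, an index on the block number for range reads, and pruning by number:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE ephemeral_headers (
        block_hash   BLOB PRIMARY KEY,
        block_number INTEGER NOT NULL,
        header_rlp   BLOB NOT NULL
    )""")
db.execute("CREATE INDEX idx_headers_number ON ephemeral_headers(block_number)")


def put(block_hash: bytes, number: int, header_rlp: bytes) -> None:
    db.execute("INSERT OR REPLACE INTO ephemeral_headers VALUES (?, ?, ?)",
               (block_hash, number, header_rlp))


def by_hash(block_hash: bytes) -> bytes | None:
    row = db.execute("SELECT header_rlp FROM ephemeral_headers WHERE block_hash = ?",
                     (block_hash,)).fetchone()
    return row[0] if row else None


def by_number_range(start: int, count: int) -> list[bytes]:
    rows = db.execute(
        "SELECT header_rlp FROM ephemeral_headers"
        " WHERE block_number >= ? AND block_number < ? ORDER BY block_number",
        (start, start + count)).fetchall()
    return [r[0] for r in rows]


def prune(oldest_kept_number: int) -> None:
    db.execute("DELETE FROM ephemeral_headers WHERE block_number < ?",
               (oldest_kept_number,))
```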
Now, if for some reason this is not sufficient and too flawed, causing issues retrieving this data, then yes, we will have to resort to something more complicated in terms of content-id derivation and localized access to the data (as has been mentioned in the comments above). I'm however questioning whether this will be needed.
> Are we going to store in the db the recent 8192 `ExecutionPayloadHeader`s and provide those on request within the current period?
I think all the necessary fields to convert to an EL `BlockHeader` are there. What we store in the end probably does not matter that much, but considering it is the EL history network, an EL `BlockHeader` perhaps makes more sense.
> We are already doing this by storing all bootstraps and `LightClientUpdate`s for the weak subjectivity period (~4 months) but I agree that it is better to make this new requirement optional.
Yes, but I would say the `LightClientUpdate`s are a better example, as they are accessible by range and of similar total size; see the comment above.

In the last Portal meetup I actually mentioned that I would be pro moving the `LightClientBootstrap`s to distributed storage over the network, considering their total size if stored for each epoch (I forgot I was going to make a tracking issue for this):

- Stored for `MIN_EPOCHS_FOR_BLOCK_REQUESTS` (~4 months) they result in ~850MB.
- Compared to ~3.5MB for `LightClientUpdate`s.
- The `custom_data` field from the Ping and Pong messages is used to indicate how many recent headers a node stores.
- Requests are anchored to a `block.hash` and can request up to 256 recent headers.
- There is no `content-id`, meaning that clients should generally not gossip this type or accept gossip of this type.

Implications of this are:

- The `None` type will be removed from the `Union` for the current header content type. This means that clients will have to implement a migration path to fix the union selector byte for the types that are still valid, and clients will need to remove any of the currently stored headers that lack a proof.
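As a very rough illustration of that migration step, assuming the proof-less case is the union's 0x00 selector and that the remaining selectors simply shift down once it is removed (the actual selector values and how the union sits inside the stored value are up to the spec):

```python
OLD_NONE_SELECTOR = 0x00  # assumed selector for the proof-less `None` case


def migrate_union_value(encoded: bytes) -> bytes | None:
    """Re-encode a stored union value for a type without `None`.
    Returns None when the entry is a proof-less header and must be dropped."""
    selector, payload = encoded[0], encoded[1:]
    if selector == OLD_NONE_SELECTOR:
        return None
    # the remaining selectors shift down by one once `None` is removed
    return bytes([selector - 1]) + payload
```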
## Recent BlockHeaders / Ephemeral BlockHeaders
EL `BlockHeader`s that are still in the current `period` cannot have a proof against `historical_summaries`, as they are not part of the `state.block_roots` used for the latest `HistoricalSummary`. These can only be injected into the network with their proof once a period transition has occurred. Before this, they could get injected without a proof, as is currently done.

PR https://github.com/ethereum/portal-network-specs/pull/292 was/is a simple take on how to verify these.
However, the actual scenarios of how these headers get stored will probably be different.
A Portal node that is running the Portal beacon network and has a beacon light client synced will also have access to these headers, as they are part of the `LightClientHeader` since Capella (albeit in a different serialization): https://github.com/ethereum/consensus-specs/blob/7cacee6ad64483357a7332be6a11784de1242428/specs/capella/light-client/sync-protocol.md?plain=1#L52

Currently these recent / ephemeral (proof-less) BlockHeaders fall under the same content key as headers with a proof. It has been raised in the past to move the BlockHeader without a proof into a separate type in the history network. I think that is a good idea, as they are conceptually different from headers with a proof:
The effect of this is that:
All this will simply require different storage and access features.
Some example scenarios:

- Portal node that is beacon LC synced
- Portal node that is not yet beacon LC synced
- Client with no Portal beacon network running (e.g. full node with Portal integrated)
### Effect of changing to a new content type
The `None` option in the current `Union` becomes invalid. However, removing the `None` would make all current data invalid. So if we want to clean this up properly, we need a migration path.

### Storing and accessing the data
Storage would be different from the current content databases, as it requires pruning and dealing with re-orgs. It will thus more likely end up in a separate table / persistent cache, but this is up to the implementation.

Access could exist as it does now, i.e. neighborhood-based look-ups, but with the option for nodes to store more than their radius, so nodes could also try to request this data from any node. Or, we could make this explicit and say that each node MUST store all of it. Additionally, to "backfill" faster we could add a range request to this version (similar to what we do now in the beacon network for `LightClientUpdate`s).
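To sketch the "pruning and re-orgs" bookkeeping this implies (purely illustrative, nothing here is prescribed): headers kept by hash, a canonical number -> hash index that gets re-pointed when the head switches branches, and pruning by block number.

```python
class EphemeralHeaderCache:
    """Toy model of a recent-header store that survives re-orgs."""

    def __init__(self, max_headers: int = 8192):
        self.max_headers = max_headers
        self.by_hash: dict[bytes, dict] = {}   # hash -> header (with parent_hash/number)
        self.canonical: dict[int, bytes] = {}  # block number -> canonical hash

    def add_head(self, header: dict) -> None:
        self.by_hash[header["hash"]] = header
        # Re-point the canonical index to the new branch until it converges
        # with what we already had (or we run out of known ancestors).
        cursor = header
        while cursor and self.canonical.get(cursor["number"]) != cursor["hash"]:
            self.canonical[cursor["number"]] = cursor["hash"]
            cursor = self.by_hash.get(cursor["parent_hash"])
        # Prune anything that fell out of the ephemeral window.
        cutoff = header["number"] - self.max_headers
        for number in [n for n in self.canonical if n <= cutoff]:
            self.by_hash.pop(self.canonical.pop(number), None)
```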