After #10099, nodes are aware of the snapshots hosted by peers in the network. A node seeking to obtain a shard part can request it from any node which hosts a snapshot including the shard. We expect several hosts to be available for each shard.
We would like to implement a source selection strategy which distributes requests effectively across the network. Considering that state parts are expensive to generate but can be cached in memory, we generally aim to direct requests by different nodes for the same part to the same host.
We will implement selection based on rendezvous hashing. A priority `P(v,i) = hash(v ++ i)` will be defined for each source `v` and each part `i`. A node interested in obtaining a part will attempt to request it from the possible sources in order of their priority. By doing so we enable network nodes to reach a rough consensus on which nodes should serve which parts, without a need for perfect synchronization.
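As a rough illustration of the intended ordering (not a final implementation), here is a minimal Rust sketch. It assumes peer ids can be hashed as raw bytes and uses the standard library's `DefaultHasher` purely as a placeholder for whatever hash function we settle on:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// P(v, i) = hash(v ++ i): hash the source id together with the part id.
/// DefaultHasher is only a placeholder for the eventual hash choice.
fn priority(peer_id: &[u8], part_id: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    peer_id.hash(&mut hasher);
    part_id.hash(&mut hasher);
    hasher.finish()
}

/// Order the candidate hosts for a given part by descending priority. Nodes
/// with the same view of the host set arrive at the same ordering.
fn order_hosts(mut hosts: Vec<Vec<u8>>, part_id: u64) -> Vec<Vec<u8>> {
    hosts.sort_by_key(|h| std::cmp::Reverse(priority(h, part_id)));
    hosts
}
```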
The `SnapshotHostsCache` should expose an API `select_host(sync_hash: CryptoHash, sid: ShardId, pid: PartId) -> Option<PeerId>`. The arguments identify a specific part, and the function should select a peer from which to request it. In case the same arguments are passed multiple times (likely due to failed attempts to obtain the part from the previously selected hosts), a different host should be selected on each call, going down the list of possibilities in their priority order.
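To make the intended calling pattern concrete, here is a hedged sketch of a caller-side retry loop. The type aliases and the `request_part` / `try_fetch` helpers are stand-ins for illustration, not existing nearcore APIs:

```rust
// Stand-in type aliases for illustration; in nearcore these correspond to
// CryptoHash, ShardId, PartId and PeerId.
type CryptoHash = [u8; 32];
type ShardId = u64;
type PartId = u64;
type PeerId = Vec<u8>;

/// Hypothetical caller-side loop: every failed attempt calls select_host
/// again, which is expected to return the next host in priority order.
fn request_part(
    mut select_host: impl FnMut(CryptoHash, ShardId, PartId) -> Option<PeerId>,
    mut try_fetch: impl FnMut(&PeerId) -> bool,
    sync_hash: CryptoHash,
    sid: ShardId,
    pid: PartId,
) -> bool {
    while let Some(peer) = select_host(sync_hash, sid, pid) {
        if try_fetch(&peer) {
            return true; // part obtained
        }
        // Fetch failed: loop again and try the next candidate host.
    }
    false // all known hosts for this part have been tried
}
```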
Implementation details to consider include:
- The exact choice of hash function. We don't really need any special properties here, just that it should be fast to compute and that, for any fixed set of source nodes, the permutations defined over the nodes by `P(v,i)` should be different for different parts `i`. Any reasonable choice should work here.
- The `SnapshotHostsCache` will need to maintain some kind of internal state tracking which hosts were previously returned by calls to `select_host` (see the sketch after this list). It should be reasonably efficient and should not grow without bound over time. `lru::LruCache` seems like a reasonable candidate.
- The implementation should be robust to cases such as the set of snapshot hosts changing between calls to `select_host`.
- It should be OK to implement something simple and effective for now, but in the future we may consider optimizations such as efficiently maintaining, for each possible part id `i_0`, a traversable ordering over the set of nodes in the network in order of `P(v, i_0)` (for example in a balanced binary search tree).
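Putting the pieces together, the following is one possible shape for the internal state, assuming the `lru` crate and the same stand-in type aliases as above. The key point is that the cache only stores a counter per requested part, and the rendezvous ordering is recomputed on each call:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::num::NonZeroUsize;

use lru::LruCache;

// Stand-in type aliases, as in the earlier sketches.
type CryptoHash = [u8; 32];
type ShardId = u64;
type PartId = u64;
type PeerId = Vec<u8>;

/// One possible shape for the selection state: for each requested part we
/// remember how many hosts have already been handed out, so repeated calls
/// keep walking down the rendezvous ordering. The LRU bound (4096 here, an
/// arbitrary illustrative value) keeps the state from growing forever.
struct PartSelectionState {
    handed_out: LruCache<(CryptoHash, ShardId, PartId), usize>,
}

impl PartSelectionState {
    fn new() -> Self {
        Self { handed_out: LruCache::new(NonZeroUsize::new(4096).unwrap()) }
    }

    /// `hosts` is the current set of peers known to hold a snapshot covering
    /// the shard. The ordering is recomputed on every call, so the selection
    /// remains well defined even if the host set changed between calls.
    fn select_host(
        &mut self,
        hosts: &[PeerId],
        sync_hash: CryptoHash,
        sid: ShardId,
        pid: PartId,
    ) -> Option<PeerId> {
        let mut ordered: Vec<PeerId> = hosts.to_vec();
        ordered.sort_by_key(|h| std::cmp::Reverse(priority(h, pid)));

        let key = (sync_hash, sid, pid);
        let already_tried = *self.handed_out.get(&key).unwrap_or(&0);
        let selected = ordered.get(already_tried).cloned();
        self.handed_out.put(key, already_tried + 1);
        selected
    }
}

/// P(v, i) = hash(v ++ i), with DefaultHasher as a placeholder hash.
fn priority(peer_id: &PeerId, part_id: PartId) -> u64 {
    let mut hasher = DefaultHasher::new();
    peer_id.hash(&mut hasher);
    part_id.hash(&mut hasher);
    hasher.finish()
}
```

Tracking only a counter is the simplest option; if the host set changes between calls a node may skip or repeat a host, which seems acceptable for best-effort selection. Recomputing the ordering is O(n log n) in the number of hosts per call, which leaves room for the tree-based optimization mentioned above if it ever becomes a bottleneck.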