The current master (5738cd8245c16be587f344dafb3150dd266b8f15) fully demonstrates how the network is intended to operate:
Nodes broadcast advertisements for content they have stored locally.
Nodes maintain a database of advertisements for content that is "near" them.
Nodes gather and serve content that is "near" them.
Nodes can look up content and retrieve it from the network.
However, the network does not yet operate efficiently enough to host the entirety of the mainnet chain history.
Mainnet Numbers
The mainnet has about 12 million each of headers, blocks, and receipts, or ~36 million things. We can optionally add the canonical chain index to this list, which adds another 12 million. That gives us ~48 million things that need to be stored today. To be future-proof, we can scale this up to 100 million things, which should be adequate for the foreseeable future.
These things average about 1 KB in size. Block headers are about 550 bytes. Early blocks and receipts are very small, while later blocks and receipts are in the 50 KB range. In total, this data is about 100 GB.
For the network to operate healthily it needs the data to be replicated across a few nodes. We'll pick 10 as our desired minimum replication factor.
So, with 100 million things and a 10x replication factor, we need to be able to store 1 billion things in the network. The current mainnet is made up of about 10,000 nodes, which we can use as a baseline, meaning that each node on the network would need to store 100,000 things.
Under the current network architecture, which uses advertisements to locate data, each node needs to be able to fully advertise its content in a reasonable amount of time. 1 hour seems a reasonable bound, which gives us a minimum advertisement rate that must be reached: 100,000 things / 1 hour = ~28 things / second.
At present the client is running at about 1 thing/second.
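The arithmetic above can be reproduced with a short script, which makes it easy to see how the required rate moves when any input changes. This is only a sketch: the constant and function names are made up for illustration, and the values are the rough estimates from this post.

```python
# Rough, order-of-magnitude inputs taken from the estimate above.
TOTAL_ITEMS = 100_000_000         # future-proofed count of things to store
REPLICATION_FACTOR = 10           # desired minimum number of copies of each thing
NODE_COUNT = 10_000               # baseline network size
ADVERTISE_WINDOW_SECONDS = 3600   # one hour to advertise everything a node holds


def required_advertisement_rate(total_items, replication_factor, node_count, window_seconds):
    """Return (things stored per node, advertisements per second per node)."""
    items_per_node = total_items * replication_factor / node_count
    return items_per_node, items_per_node / window_seconds


items_per_node, rate = required_advertisement_rate(
    TOTAL_ITEMS, REPLICATION_FACTOR, NODE_COUNT, ADVERTISE_WINDOW_SECONDS
)
print(f"{items_per_node:,.0f} things per node")
print(f"{rate:.1f} advertisements/second required")
```

With these inputs the required rate comes out just under 28 per second, so the client needs to get roughly 28x faster than it is today.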
There is plenty of room to close this gap. Currently, each advertisement consists of:
1x Network.broadcast
  1x Network.explore to source nodes for broadcast
    1x Network.recursive_find_nodes to seed the exploration
    1x Network.ping/pong exchange with each node that we find to check liveness
    1+ Network.find_nodes for each node to further seed the explore
  1x Network.advertise to send the advertisement
Places where we can gain efficiency:
The Network.explore results can be cached by continually maintaining a slightly out-of-date snapshot of the network and returning data from the snapshot.
The Network.ping/pong liveness check can be skipped if we know the node was recently reachable.
These two changes should give us a significant boost in advertisements/second.
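A minimal sketch of both ideas follows. None of these names exist in the client: ExploreSnapshot, LivenessCache, the max_age thresholds, and the explore callable are all hypothetical. For simplicity the snapshot here is refreshed lazily when it gets too old, rather than continually maintained in the background as proposed above.

```python
import time
from typing import Callable, Dict, List, Optional


class LivenessCache:
    """Remembers when each node last responded so recent responders can skip a ping."""

    def __init__(self, max_age: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._max_age = max_age
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def mark_seen(self, node_id: str) -> None:
        # Call this whenever any message is received from the node.
        self._last_seen[node_id] = self._clock()

    def needs_ping(self, node_id: str) -> bool:
        seen_at = self._last_seen.get(node_id)
        return seen_at is None or self._clock() - seen_at > self._max_age


class ExploreSnapshot:
    """Serves a slightly stale node list instead of exploring for every advertisement."""

    def __init__(
        self,
        explore: Callable[[], List[str]],  # stand-in for the expensive explore step
        max_age: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._explore = explore
        self._max_age = max_age
        self._clock = clock
        self._nodes: Optional[List[str]] = None
        self._taken_at = 0.0

    def nodes(self) -> List[str]:
        now = self._clock()
        if self._nodes is None or now - self._taken_at > self._max_age:
            # Snapshot is missing or too old: pay for one real exploration.
            self._nodes = list(self._explore())
            self._taken_at = now
        return self._nodes


def fake_explore() -> List[str]:
    # Placeholder for a real network exploration.
    return ["node-a", "node-b", "node-c"]


snapshot = ExploreSnapshot(fake_explore, max_age=60.0)
liveness = LivenessCache(max_age=30.0)
liveness.mark_seen("node-a")
to_ping = [node for node in snapshot.nodes() if liveness.needs_ping(node)]
print(to_ping)
```

The max_age values trade freshness against cost: a longer window saves more messages per advertisement but raises the chance of advertising to a node that has since gone offline.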
I think the first step is to measure our maximum messages/second throughput, first in the core protocol and then in the Alexandria sub-protocol, so that we know the theoretical limit.
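One possible shape for such a measurement is sketched below, assuming a request/response round trip can be wrapped in a coroutine. The measure_throughput helper and the fake_round_trip stand-in are hypothetical, and the sketch uses the standard library's asyncio purely to stay self-contained; the client's actual async framework and message APIs may differ.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def measure_throughput(
    round_trip: Callable[[], Awaitable[None]],
    duration: float,
    concurrency: int = 16,
) -> float:
    """Run round trips from several workers for `duration` seconds; return completions/second."""
    completed = 0
    deadline = time.monotonic() + duration

    async def worker() -> None:
        nonlocal completed
        while time.monotonic() < deadline:
            await round_trip()
            completed += 1

    start = time.monotonic()
    await asyncio.gather(*(worker() for _ in range(concurrency)))
    return completed / (time.monotonic() - start)


async def fake_round_trip() -> None:
    # Stand-in for a real ping/pong or advertise/ack exchange.
    await asyncio.sleep(0.001)


rate = asyncio.run(measure_throughput(fake_round_trip, duration=0.2, concurrency=8))
print(f"{rate:.0f} messages/second")
```

Running the same harness against a core-protocol exchange and then against a sub-protocol exchange would show how much overhead each layer adds.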