ethereum / ddht

Python implementation of Discovery V5 Protocol
MIT License

Efficiency goals #339

Open pipermerriam opened 3 years ago

pipermerriam commented 3 years ago

The current master (5738cd8245c16be587f344dafb3150dd266b8f15) fully demonstrates how the network is intended to operate.

The network, however, does not operate at an efficiency level suitable for hosting the entirety of the mainnet chain history.

Mainnet Numbers

The mainnet has roughly 12 million blocks, meaning 12 million each of headers, block bodies, and receipts. We can optionally add the canonical chain index to this list, which adds another 12 million entries. That gives us ~48 million things that need to be stored today. To be future proof, we can scale this up to 100 million things, which should be adequate for the foreseeable future.

These things average about 1kb in size: block headers are roughly 550 bytes, early blocks and receipts are very small, and recent blocks and receipts are in the 50kb range. In total this data is about 100GB.

For the network to operate healthily, the data needs to be replicated across several nodes. We'll pick 10 as our desired minimum replication factor.

So, with 100 million things and a 10x replication factor, we need to be able to store 1 billion things in the network. The current mainnet comprises about 10,000 nodes, which we can use as a baseline, meaning that each node on the network would need to store 100,000 things.

Under the current network architecture, which uses advertisements to locate data, each node needs to be able to fully advertise its content in a reasonable amount of time. One hour seems like a reasonable bound, which gives us the minimum advertisement rate that needs to be reached: 100,000 things / 1 hour ≈ 28 things/second.
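The back-of-envelope arithmetic above can be checked with a few lines of Python (the counts, replication factor, and one-hour window are the figures from this issue, not values read from the network):

```python
# Capacity math from the issue text.
HEADERS = BODIES = RECEIPTS = INDEX = 12_000_000   # per-type counts on mainnet today
items_today = HEADERS + BODIES + RECEIPTS + INDEX  # ~48 million things
items_future = 100_000_000                         # future-proofed target

REPLICATION = 10                                   # desired minimum replication factor
NODES = 10_000                                     # baseline mainnet node count

network_items = items_future * REPLICATION         # 1 billion things network-wide
per_node = network_items // NODES                  # 100,000 things per node

ADVERTISE_WINDOW_SECONDS = 3600                    # fully re-advertise within 1 hour
required_rate = per_node / ADVERTISE_WINDOW_SECONDS

print(items_today)            # 48000000
print(per_node)               # 100000
print(round(required_rate))   # 28
```

At the current ~1 thing/second, each node would need roughly 28 hours to advertise its share once, hence the ~28x efficiency gap.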

At present the client is running at about 1 thing/second.

There is plenty of room to improve this efficiency. Currently, each advertisement comprises:

Places where we can gain efficiency:

These things should easily give us a significant boost in advertisements/second.

pipermerriam commented 3 years ago

I think the first step here is to measure our maximum messages/second throughput, both in the core protocol and in the alexandria sub-protocol, so that we know the theoretical limit.
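A minimal harness for this kind of measurement could look like the sketch below. It is not ddht's API; `send_message` stands in for whatever coroutine performs one protocol round trip (e.g. a PING/PONG in the core protocol or an advertisement in alexandria), and the stub used here only demonstrates the timing logic:

```python
import asyncio
import time


async def measure_throughput(send_message, num_messages: int = 1000) -> float:
    """Time `num_messages` sequential calls and return messages/second."""
    start = time.perf_counter()
    for _ in range(num_messages):
        await send_message()
    elapsed = time.perf_counter() - start
    return num_messages / elapsed


# Stand-in for a real protocol round trip; replace with an actual
# client call when benchmarking against a live node.
async def stub_round_trip() -> None:
    await asyncio.sleep(0)


if __name__ == "__main__":
    rate = asyncio.run(measure_throughput(stub_round_trip))
    print(f"{rate:.0f} messages/second")
```

Running the same harness against both protocols would make the overhead of the alexandria layer directly visible as the difference between the two rates.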