paritytech / smoldot

Alternative client for Substrate-based chains.
GNU General Public License v3.0

Internet connectivity and finality gap #2382

Open tomaka opened 2 years ago

tomaka commented 2 years ago

Many design choices in smoldot are based upon the assumption that the list of non-finalized blocks is not very large.

This is normally the case: blocks are verified before they enter the list, and Substrate has a mechanism that slows down block production if finality stops working.

However, there's a problem in smoldot: it detects finality by waiting for a finality proof from the nodes it is connected to. If nodes don't send this message, the list of non-finalized blocks grows progressively larger.

In practice, to limit their bandwidth usage, nodes send this message only randomly at each block height. Realistically, this means the message might fail to arrive for at most 2 or 3 blocks in a row. However, there are two situations that are problematic:

tomaka commented 2 years ago

> Many design choices in smoldot are based upon the assumption that the list of non-finalized blocks is not very large.

To clarify this sentence: if everything is coded properly (the APIs certainly are, but there might be implementation bugs), the memory usage of a smoldot chain is capped at aN + b, where a and b are compile-time constants and N is the number of non-finalized blocks currently floating around the chain's network.

If we allow N to grow very large, then the memory usage is no longer capped. Importantly, it doesn't mean that the memory usage will explode, just that it is no longer capped. However, this means that an attacker could potentially find ways to make smoldot consume a lot of memory and crash due to OOM.

It is also for this reason that smoldot doesn't support "no consensus mechanism". If anyone is allowed to generate blocks, then N is effectively unbounded.
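
To make the aN + b bound concrete, here is a minimal Rust sketch. The constant names and their values are hypothetical and chosen only for illustration; the thread does not state smoldot's actual constants. The point is that the cap is only meaningful while N itself is bounded.

```rust
/// Hypothetical per-block overhead `a`, in bytes (illustrative value only).
const A_BYTES_PER_BLOCK: u64 = 2_048;

/// Hypothetical fixed overhead `b`, in bytes (illustrative value only).
const B_FIXED_BYTES: u64 = 16 * 1024 * 1024;

/// Upper bound on memory usage for `n` non-finalized blocks: a * n + b.
/// Returns `None` if the arithmetic overflows, which is a reminder that
/// the bound stops being useful once `n` is allowed to grow without limit.
fn memory_cap(n: u64) -> Option<u64> {
    A_BYTES_PER_BLOCK
        .checked_mul(n)?
        .checked_add(B_FIXED_BYTES)
}
```

Calling `memory_cap` with a small N returns a value dominated by b, while a very large N makes the a * N term dominate.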

tomaka commented 1 year ago

Maybe a simple work-around is to introduce, in smoldot, a maximum gap between the finalized block and a new block. If a block needs to be verified and it is too far away from the finalized block, then we don't verify it yet. There is already a queue of non-verifiable-yet blocks that is designed to be DoS-resilient, and we could simply leave the block in that queue.

This maximum gap could be very large, for example 200000 blocks (about two weeks, which should be roughly 500 MiB), so that it is normally never reached in practice.
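
A rough sketch of what such a gap check could look like, in Rust. Everything here is hypothetical: `MAX_FINALITY_GAP`, `Decision`, `decide`, and `PendingQueue` are illustrative names and not smoldot's actual API, and the queue is a plain `VecDeque` of block heights standing in for the existing DoS-resilient queue. The 6-second block time is also an assumption, used only to check the "about two weeks" estimate.

```rust
use std::collections::VecDeque;

/// Hypothetical maximum distance between the finalized block and a block
/// that may be verified. 200000 is the example value from the proposal.
const MAX_FINALITY_GAP: u64 = 200_000;

/// Assumed block time in seconds (an assumption, not stated in the thread).
const ASSUMED_BLOCK_TIME_SECS: u64 = 6;

#[derive(Debug, PartialEq, Eq)]
enum Decision {
    /// The block is close enough to the finalized block to be verified now.
    Verify,
    /// The block is too far ahead and stays in the pending queue.
    Defer,
}

/// Decides whether a block at `block_height` may be verified, given the
/// height of the latest finalized block.
fn decide(finalized_height: u64, block_height: u64) -> Decision {
    // `saturating_sub` yields a gap of 0 for blocks at or below the
    // finalized height instead of underflowing.
    if block_height.saturating_sub(finalized_height) > MAX_FINALITY_GAP {
        Decision::Defer
    } else {
        Decision::Verify
    }
}

/// Simplified stand-in for the queue of non-verifiable-yet blocks.
struct PendingQueue {
    heights: VecDeque<u64>,
}

impl PendingQueue {
    /// Removes and returns every queued height that is now within the gap,
    /// leaving the others in the queue in their original order.
    fn drain_verifiable(&mut self, finalized_height: u64) -> Vec<u64> {
        let mut ready = Vec::new();
        self.heights.retain(|&h| {
            if decide(finalized_height, h) == Decision::Verify {
                ready.push(h);
                false // remove from the queue
            } else {
                true // keep waiting
            }
        });
        ready
    }
}

/// Whole days covered by the maximum gap at the assumed block time.
fn gap_duration_days() -> u64 {
    MAX_FINALITY_GAP * ASSUMED_BLOCK_TIME_SECS / 86_400
}
```

Under the assumed 6-second block time, 200000 blocks span 1,200,000 seconds, or just under 14 days, which matches the two-week estimate. Taking the 500 MiB figure at face value, that works out to roughly 2.6 KiB per non-finalized block.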