celestiaorg / celestia-core

A fork of CometBFT
Apache License 2.0
488 stars 267 forks

Sampling and data extraction #35

Closed liamsi closed 3 years ago

liamsi commented 4 years ago

Motivation

After specifying all public APIs for the different kinds of nodes (ref: https://github.com/lazyledger/lazyledger-specs/issues/22), we should start thinking about how to implement the different networking / p2p related aspects of the system.

This issue is for discussing different approaches, e.g. which libraries and existing protocols to re-use and how to integrate them into lazyledger-core. Currently, this issue just aims to give a complete overview of what is missing from a high-level perspective. Where it makes sense, the discussion will be broken up into dedicated issues (and, further down the line, PRs).

Summary

We need to add (at least) two p2p related features to vanilla tendermint/lazyledger-core: one is the random sampling LazyLedger light clients do, the other is a p2p file-sharing network from which full nodes can also recover the rows and columns of the extended erasure-coded matrix M.

Copying and pasting the relevant pieces from the two academic papers here for now (TODO add refs):

(Light client) Random Sampling and Network Block Recovery

tl;dr LL light clients randomly sample parts of the erasure coded block

The protocol between a light client and the full nodes that it is connected to works as follows:

  1. The light client receives a new block header h_i from one of the full nodes it is connected to, and a set of row and column roots R = (rowRoot_i^1, rowRoot_i^2, ...,rowRoot_i^{2k}, columnRoot_i^1, columnRoot_i^2, ...,columnRoot_i^{2k}). If the check root(R) = dataRoot_i is false, then the light client rejects the header.
  2. The light client randomly chooses a set of unique (x, y) coordinates S = {(x_0, y_0), (x_1, y_1), ..., (x_n, y_n)} where 0 < x <= matrixWidth_i and 0 < y <= matrixWidth_i, corresponding to points on the extended matrix, and sends them to one or more of the full nodes it is connected to.
  3. If a full node has all the shares corresponding to the coordinates in S and their associated Merkle proofs, then for each coordinate (x_a, y_b) the full node responds with M_i^{x_a,y_b}, {M_i^{x_a,y_b} → rowRoot_i^a} or M_i^{x_a,y_b}, {M_i^{x_a,y_b} → columnRoot_i^b}. Note that there are two possible Merkle proofs for each share; one from the row roots, and one from the column roots, and thus the full node must also specify for each Merkle proof if it is associated with a row or column root.
  4. For each share M_i^{x_a,y_b} that the light client has received, the light client checks that VerifyMerkleProof(M_i^{x_a,y_b}, {M_i^{x_a,y_b} → rowRoot_i^a}, rowRoot_i^a) is true if the proof is from a row root; otherwise, if the proof is from a column root, that VerifyMerkleProof(M_i^{x_a,y_b}, {M_i^{x_a,y_b} → columnRoot_i^b}, columnRoot_i^b) is true.
  5. Each share and valid Merkle proof that is received by the light client is gossiped to all the full nodes that the light client is connected to if the full nodes do not have them, and those full nodes gossip it to all of the full nodes that they are connected to.
  6. If all the proofs in step 4 succeeded, and no shares are missing from the sample made in step 2, then the block is accepted as available if no fraud proofs for the block's erasure code are received within 2 * δ.

Data extraction

tl;dr nodes need to recover erasure coded blocks and we need a DHT-like structure/protocol to request shares from the network in a decentralized way.

Given a full node that wants to recover a matrix M_i associated with block i, the extraction process proceeds as follows:

  1. The full node picks a set of random shares that it does not have, and samples them from one or more of the full nodes it is connected to, using the same random sampling protocol above.
  2. If, as a result of downloading any new share, the row or column that the share is in has at least k recovered shares, then recover the whole row and/or column with recover (see spec / todo).
  3. If, as a result of the previous step, any incomplete row or column in M_i has at least k recovered shares, then recover the whole row or column with recover. Repeat this step until M_i does not change and no new rows or columns are recoverable.
  4. Repeat from step 1. until the whole matrix is recovered.
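The fixed-point loop in steps 2-3 can be sketched over a simple presence matrix. This is an illustrative sketch only: the real recover step decodes share data via Reed-Solomon, whereas here we just track which shares are available, using the fact that any k of the 2k shares in a row or column suffice to recover it:

```go
package main

import "fmt"

// recoverAll runs steps 2-3 of the extraction process on a 2k x 2k presence
// matrix: have[i][j] marks shares the node already holds. Any incomplete row
// or column with at least k shares is filled in completely, and this repeats
// until the matrix stops changing.
func recoverAll(have [][]bool, k int) {
	n := len(have) // n == 2k
	for changed := true; changed; {
		changed = false
		for i := 0; i < n; i++ {
			row, col := 0, 0
			for j := 0; j < n; j++ {
				if have[i][j] {
					row++
				}
				if have[j][i] {
					col++
				}
			}
			if row >= k && row < n { // incomplete but recoverable row
				for j := 0; j < n; j++ {
					have[i][j] = true
				}
				changed = true
			}
			if col >= k && col < n { // incomplete but recoverable column
				for j := 0; j < n; j++ {
					have[j][i] = true
				}
				changed = true
			}
		}
	}
}

func main() {
	// k = 2: a full first row plus two shares of the second row cascade
	// into full recovery of the 4x4 extended matrix.
	have := [][]bool{
		{true, true, true, true},
		{true, true, false, false},
		{false, false, false, false},
		{false, false, false, false},
	}
	recoverAll(have, 2)
	fmt.Println(have)
}
```

The cascade is the point of step 3: recovering one row completes columns, which complete further rows, until the fixed point is reached.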

Note that in step 5 above, it is also mentioned that light clients gossip shares to full nodes that do not have them. We want to generalize the share-gossiping mechanism among full nodes into a peer-to-peer file-sharing network (such as BitTorrent or IPFS) to enable full nodes to download shares that they do not have, as an alternative to step 1.

musalbas commented 4 years ago

I believe both of the features mentioned above are in fact the same feature: the ability to query specific chunks of blocks from the P2P network (e.g. using a DHT).

The actual second feature we need is for nodes to be able to query messages from blocks based on their namespace, rather than their index in the block. The former is needed for clients to be able to download the messages for their application, and the latter is needed for clients to be able to do data availability checks and recover full blocks.
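The namespace query described here can be sketched as a range lookup over namespace-ordered shares, since a Namespaced Merkle Tree keeps leaves sorted by namespace. Types and names below are illustrative, not the lazyledger-core API:

```go
package main

import (
	"fmt"
	"sort"
)

// Share pairs a namespace ID with its data. In a Namespaced Merkle Tree the
// leaves are ordered by namespace, so all messages of one namespace form a
// contiguous range of leaves.
type Share struct {
	Namespace string
	Data      string
}

// byNamespace returns all shares with the given namespace from a
// namespace-ordered slice, locating the range with two binary searches.
func byNamespace(shares []Share, ns string) []Share {
	lo := sort.Search(len(shares), func(i int) bool { return shares[i].Namespace >= ns })
	hi := sort.Search(len(shares), func(i int) bool { return shares[i].Namespace > ns })
	return shares[lo:hi]
}

func main() {
	shares := []Share{
		{"app-a", "msg1"},
		{"app-b", "msg2"},
		{"app-b", "msg3"},
		{"app-c", "msg4"},
	}
	fmt.Println(byNamespace(shares, "app-b")) // [{app-b msg2} {app-b msg3}]
}
```

A query-by-index service would instead address shares by their (x, y) position in the extended matrix, which is what the availability sampling above needs.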

liamsi commented 4 years ago

The actual second feature we need is for nodes to be able to query messages from blocks based on their namespace

I think a simple way to implement this would be an RPC service for the Namespaced Merkle Tree (which already offers this feature), backed by a DB. The p2p layer could be used to find nodes that offer this capability; on the p2p layer this would only require some form of "service discovery". Or are there any reasons this should work similarly to the other feature (querying chunks of a block directly from the p2p network)?

musalbas commented 4 years ago

It can be part of the same service/DHT, but my point is that the "two features" in your original post are actually the same feature, and the actual second feature we need is querying by namespace. The only difference is that you need to query chunks by namespace, not by index.

liamsi commented 3 years ago

With https://github.com/lazyledger/lazyledger-core/issues/85#issuecomment-746192395, the sampling and data extraction can be achieved via querying the IPFS DAG:

The coordinates only need to be translated into a path that the ipld plugin can resolve: e.g. the coordinate (0,0) could correspond to CID/0/...0/0. A dag get request would return the underlying share M_i. This can be used both by light clients (or validator nodes that choose not to download messages to validate the erasure coding) to validate the block (i.e. verify its data availability), and by nodes that want to recover the whole block (which would need to traverse the whole tree).
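The coordinate-to-path translation could look like the sketch below: a coordinate is first flattened to a leaf index (e.g. by row-major order), and the index then becomes a sequence of left(0)/right(1) segments down the Merkle tree. The exact path scheme the ipld plugin resolves is an assumption here, not the actual implementation:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// leafPath translates a leaf index into an IPLD path of left(0)/right(1)
// segments under a tree root CID: leaf 0 in a tree of the given depth maps
// to cid/0/0/.../0, matching the example in the comment above. depth is
// log2 of the number of leaves.
func leafPath(cid string, index, depth int) string {
	segs := make([]string, depth)
	for i := depth - 1; i >= 0; i-- {
		segs[i] = strconv.Itoa(index & 1) // least-significant bit is the deepest step
		index >>= 1
	}
	return cid + "/" + strings.Join(segs, "/")
}

func main() {
	fmt.Println(leafPath("cid", 0, 3)) // cid/0/0/0
	fmt.Println(leafPath("cid", 5, 3)) // cid/1/0/1
}
```

Resolving such a path with a dag get walks one root-to-leaf branch, so sampling a single share touches only O(log n) nodes rather than the whole tree.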

We can later optimize this by using graphsync, which aims to reduce the number of round trips when resolving a path, instead of bitswap.

liamsi commented 3 years ago

I think this issue is rather confusing, and the ADR introduced in #170 as well as issue #221 better capture the work that needs to happen.