Open staheri14 opened 1 year ago
just leaving a thought here: I think we'll have the functionality to verify an out of order nmt inclusion proof, but also have some way of notifying the user that one or more nodes was out of lexicographical order. This way, we could prove that an invalid order was committed to.
Per conversation with @evan-forbes, I'd like to provide more detail on the attack scenario that prompted the opening of this issue:
Malicious consensus nodes could produce and commit a block, denoted as B
, in which the shares are not ordered lexicographically based on their namespace IDs. A DA (data availability) full node obtains the block header, including the data root denoted as dataRoot
, and the underlying shares, and will attempt to construct the block. However, if the node detects that the shares used to construct dataRoot
are unordered, it must provide a proof of bad encoding to the light clients. Specifically, it must indicate that there are out-of-order leaves in the tree represented by dataRoot
.
The problem is that the current NMT (namespace Merkle tree) implementation does not support this kind of bad-encoding proof. Furthermore, the library does not even allow for the construction of a tree with unordered leaves, since it is protected through the Push
method, which checks the namespace ID of the inserted data items. Therefore, the issue at hand is to implement new logic in the NMT library that would allow a full node to provide such bad-encoding proof to the light clients.
It might be already covered by the existing bad encoding fraud proof https://github.com/celestiaorg/celestia-node/blob/main/share/eds/byzantine/bad_encoding.go, we need to check.
Update: The current bad encoding fraud proof logic does not capture this issue, the NMT construction errors out on a set of unordered leaves, disallowing the full node from calculating the row and column root. This means the full node cannot generate a bad encoding proof if shares are not ordered in a row or column.
During my call with @evan-forbes, we discussed the potential benefits of enabling light nodes to identify out-of-order leaves when sampling and verifying their inclusion proof, in addition to the full nodes. The adversarial setting is similar to the one described in this comment, but with the added possibility of the full node being malicious and not providing fraud proofs for unordered leaves.
During sampling, a light client may encounter proof of inclusion of shares that are out-of-order (for which the proof verification results in false). While light clients reject such a proof, we would like to enable them to craft a bad encoding proof and propagate it in the network. This may require some changes in the verification methods currently available in NMT, such as the ability to calculate the root even if a violation is detected in the proof.
Problem
The issue at hand pertains to the generation of NMTs from sets of leaves that are not ordered lexicographically based on their namespaces. The behavior of NMT proofs generated from such malformed trees is unclear and needs to be addressed. The main objectives of this issue are to: