ethereum / portal-network-specs

Official repository for specifications for the Portal Network
286 stars 79 forks source link

Verkle - forging fake inner-root / extension-root nodes #297

Open morph-dev opened 1 month ago

morph-dev commented 1 month ago

I will focus on inner-root node as it's simpler, but the same thing would apply to extension-root as well. I will refer to inner-root as root and inner-sub as sub for simplicity reasons.

Design reminder

Instead of storing entire Verkle's inner node (with all 256 children), we split it into 17 smaller nodes: 1 root and 16 sub nodes, each with 16 children.

verkle_inner_node

The commitment of the sub nodes is a Pedersen Commitment, using appropriated indices:

C'0  =   c0 B0   +   c1 B1   + ... +  c15 B15
C'1  =  c16 B16  +  c17 B17  + ... +  c31 B31
...
C'15 = c240 B240 + c241 B241 + ... + c255 B255

And the commitment of the root node is the sum of commitments of sub nodes, and is identical to the commitment of the original Verkle node.

C = C'0 + C'1 + ... + C'15

Problem

The issue is that somebody can easily forge values for the root node, which would add up to C, i.e.:

C = C"0 + C"1 + ... + C"15

They can go step further and create 15 sub tries however they want and only make sure that the 16th commitment is what needs to be in order to have correct sum.

However, if they picked any of the 16 commitments to be some precise value (which they need to in order to make the correct sum), they can't prove that it is the correct vector commitment. This is what solution is aiming at.

Solution

Together with 16 commitments (C'0 ... C'15), root node can also store proof that those commitments are valid. There are two main approaches for creating such proof.

Approach 1 - Use IPA and multiproof

The simplest approach would be to create IPA (Inner Product Argument) and multiproof in the same way that they work for Verkle trie. The proof would prove that:

C'0  is commitment of some vector [_, _, ..., _,  0, 0, ..., 0,  ...,  0, 0, .... 0]
C'1  is commitment of some vector [0, 0, ..., 0,  _, _, ..., _,  ...,  0, 0, ..., 0]
...
C'15 is commitment of some vector [0, 0, ..., 0,  0, 0, ..., 0,  ...,  _, _, ..., _]
                                   --first 16--   --second 16--        --last 16 --

Where _ is omitted from the proof, while 0 is present to the proof. Meaning, it would prove that each C'x is vector commitment to the same basis (B0 ... B255) and 240 values are 0 (at appropriate indices). Using multiproof, this can be compressed to 18x32 = 576 bytes.

Sidenote 1 - multiproof with different basis

One might ask why not make a proof that each C'x is vector commitment to its own basis of length 16, instead of having 240 openings of value 0. The answer is that I don't know if multiproof can work with different basis.

Sidenote 2 - why do we need to prove that values are 0

If we use the same basis (B0 ... B255), we have to prove that values are 0. Otherwise, somebody can create proof for following vectors:

C"0 is commitment to [_, _, ..., _,  0, 0, ..., 0,  ...,  0, 0, ...,  42]
C"1 is commitment to [0, 0, ..., 0,  _, _, ..., _,  ...,  0, 0, ..., -42]
                      --first 16--   --second 16--        --- last 16 ---
Meaning:
C"0  =  c0 B0  +  c1 B1  + ... + c15 B15   + 42 B255 = C'0 + 42 B255
C"1  = c16 B16 + c17 B17 + ... + c31 B31   - 42 B255 = C'1 - 42 B255

Because C'0 + C'1 = C"0 + C"1 and because proof is valid, we wouldn't notice any wrong doing (C'0 ≠ C"0 and C'1 ≠ C"1). So somebody can pollute our network and we wouldn't be able to notice it.

Approach 2 - Different proving scheme

If we consider that 16 commitments are commitments to 16 different vectors (different basis), and that we don't need opening for any particular value, it might be possible to use different type of proofs. Ignacio said that this might be possible with SNARK/STARK proofs. This would only be beneficial if such proof is smaller than the one from above.

However, I'm not familiar with SNARK?STARK proofs, so more research is needed in this direction.