vyperlang / vyper

Pythonic Smart Contract Language for the EVM
https://vyperlang.org
Other
4.91k stars 804 forks source link

VIP: Merkle Proof Function #1391

Open fubuloubu opened 5 years ago

fubuloubu commented 5 years ago

Simple Summary

Built in function to generate and validate merkle proofs for vanilla Sparse Merkle Tree implementations.

Abstract

Implement a built-in function for computing merkle roots from uncompressed merkle proofs for the sparse merkle tree (SMT) data structure, such as those commonly employed for Plasma and Plasma Cash constructions.

Motivation

Merkle proofs are useful and awesome, and can be difficult to implement due to many considerations and different possible optimizations. Having a built in function similar to ecrecover would greatly simplify in generating these proofs in a gas-efficient way.

Specification

def calc_smt_root(key: uint256,            # key determines order of siblings at each height
                  leaf_hash: bytes32,      # hash_function(abi.encode(leaf))
                  proof: bytes32[N],       # proof array where 'N' is the SMT depth (static)
                  function=hash_function,  # configures the function used
             ) -> bytes32                  # Returns calculated root for above

where

N:

  1. A constant corresponding to the size of the trie.
  2. The maximum number is 256. The minimum is 1 (although this would not make much sense in practice).
  3. The choice of depth determines the number of members that the trie can track.

key:

  1. An integer where each bit corresponds to the decision to traverse the tree left (bitand(key, 2**k) == 0) or right (bitand(key, 2**k) > 0) at the k-th node in the tree.
  2. Must be at least N bits in size (uint256 works by default, but future datatypes may be added, e.g. uint32 for 32-depth trie)
  3. The LSB (least significant bit) corresponds to the leaf node decision (whether it's sibling is left or right in the tree hierarchy recursing up). This is bit 0 of key.
  4. The MSB (most significant bit) corresponds to the root node decision (whether it's sibling is left or right in the final recursion that produces the root node). This is bit N of key (the depth of the trie).

leaf_hash:

  1. A bytes32 hash corresponding to the hash of the leaf node, typically with the same hashing function that the proof generation process uses.
  2. leaf can be any arbitrary data structure. By convention, the data structure would be hashed according to it's ABI encoding, although that is not strictly required.

proof[N]:

  1. A static array (of size N) where each array element corresponds to the sibling of the node at depth k in the trie.
  2. proof is in root to leaf order. Element 0 corresponds to the root level sibling (root_hash := hash(left_node, right_node)), and Element N corresponds to the leaf level sibling.
  3. The algorithm would thus require traversing the provided proof in reverse order.

Additional considerations:

  1. The function only calculates the root hash from the provided key, proof, and leaf hash. Verification is left to the end user (Typically self.root = calc_smt_root(...) or assert self.root == calc_smt_root(...), "Proof does not validate")
  2. The user specifies which hash function to use (e.g. calc_smt_root(..., function=hash_function) where hash_function is one of keccak256, sha256, or any other built-in hash function that produces at 32-byte output)
  3. There is a built-in helper function for generating empty proofs (e.g. EMPTY_BRANCH: bytes32[N] = generate_empty_proof(empty_leaf_hash, N, function=hash_function) where)
  4. This specific implementation structure is already implemented in py-trie (by the author)

An example of using this would be

root_hash: bytes32
N: constant(uint256) = 32

def validate_root(root_hash: bytes32, key: uint256, leaf: int128, proof: bytes32[N]):
    leaf_hash: bytes32 = keccak256(leaf)
    assert self.root_hash == calc_smt_root(key, leaf_hash, proof, function=keccak256)
    # ... do actions!

Backwards Compatibility

New feature

Copyright

Copyright and related rights waived via CC0

fubuloubu commented 5 years ago

@jacqueswww @pipermerriam @davesque If you all agree with this proposed Specification, we can move this VIP to Approved and start implementation, since previous (#806) was already approved (but was under-specified)

jacqueswww commented 5 years ago

:+1: from me.

pipermerriam commented 5 years ago

I abstain, haven't spent time understanding this and am focused elsewhere at the moment.

davesque commented 5 years ago

I'll also abstain from this for the same reasons as @pipermerriam .

fubuloubu commented 5 years ago

Notes:

fubuloubu commented 5 years ago

Not necessary to add documentation since this algorithm does not allow one to control N (the depth of the tree), which is a necessary condition to perpetuate this attack (since we control how many iterations are done).