proto-kit / framework

Apache License 2.0
28 stars 10 forks source link

Add linked merkle tree as the primary state tree for the protocol #191

Open rpanic opened 3 months ago

rpanic commented 3 months ago

Linked Merkle Tree

The purpose of this spec is to walk through a detailed description of the state tree primitive we call "Linked Merkle Tree".

Heavy inspiration from: https://docs.aztec.network/learn/concepts/storage/trees/indexed_merkle_tree

A similar design to this spec exists for o1js: https://github.com/o1-labs/o1js/issues/1655

Definitions:

Index / Leaf Index: Number used to identify a unique leaf in a merkle tree Tree domain: The maximum number of leafs a particular tree can store (e.g. 2^256) Path: The identifier used to describe a state entry (computed from the state defintion)

Goal

The goal is to reduce

  1. Number of in-circuit constraints, therefore increasing the Batchsize of the StateTransitionProver

Considerations / Tradeoffs:

Time for the sequencer to finish sequencing and tracing:

  1. Sequencing only needs to apply all state changes to the tree without creating individual witnesses or such, because it only need to arrive at the resulting state root to feed into the next block
  2. Tracing needs to generate individual witnesses for each state transition

Implementation

This is based on the aztec description of the indexed merkle tree.

2 Operations exist

Reading

Reading involves two steps:

Witness fetching (out-of-circuit)

This part looks into the database and does the following 2 steps sequentially:

  1. Finding the leaf with a certain path If the tree doesn't already contain a entry with certain path, we have to find the leaf with the next lowest path

  2. Retrieving the merkle witness for that node

Witness verification (in-circuit)

We also consider a two step process here:

  1. Asserting the value of the leaf $v$

    For already existing entry:
    v.path == input.path
    For a non-existing entry:
    v.path < input.path
    v.nextPath > input.path
    
    If true, we can safely assume the result value to be 0
  2. Asserting the inclusion of that leaf $v$ in a tree root $r$

We compute the root of the tree given a merkle witness and the leaf of step 1. (checkMembership) The result of that has to equal $r$.

Inserting

To insert, the intuition is that we need to find the leaf with the next lower path ($prev$) and then insert ourselfes ($this$) in between that and it's next leaf ($next$). This can be done by asserting $prev.nextPath > this.path$

Then we insert the new leaf ($this$) at the next empty leaf index with ${ next = previous.nextPath, value, path }$

At last, we need to update the previous leaf to point to our new leaf index. So we set $previous.nextPath = this.path$

Therefore the rough pseudocode algorithm is:

We intialize the tree with one leaf at index 0 { path = 0, nextPath = max_path, value = 0 }

Insert:

Trusted inputs: `root, nextUsableIndex`
Untrusted inputs: `witnessValue, witnessPrevious` 
(witnesses here are assumed to be pairs of (MerkleTreeWitness, LinkedLeaf))

// Update previous leaf
check membership of previous witness
assert previous.nextPath > this.path
compute root `root1` of previous with { next = witnessValue.leaf.path, path, value }

// Insert new node
check membership of value witness on `root1`
check index of value witness to be nextUsableIndex
nextUsableIndex++
nextPathValue = witnessPrevious.next
compute root `root2` of value with { next = nextPathValue, value: this.value, path: this.path }

return root2

Updating

Updating is trivial, we need to only check the witness and compute the new root with the updated value. But since we are bounded by the maximum number of operations in circuit anyways, we will consider inserting as a more general case, where updating is included. But we can make some optimizations regarding updates, which we will cover later.

So what we can do additionally in the future, since the computeRoot and checkMembership calls are the most expensive generally, and both exist in both parts of the program, is that we can reuse any unused call for cases where we only update and therefore save ourselves these calls also in provable code.