powdr-labs / powdr

A modular stack for zkVMs, with a focus on productivity, security and performance.
Apache License 2.0
382 stars 78 forks source link

Sound Continuations #814

Open georgwiese opened 9 months ago

georgwiese commented 9 months ago

We started implementing non-sound continuations in #789. In this Issue, I'd like to collect all steps necessary to make it sound.

Background:

The main idea is outlined here and highly influenced by RISC Zero's approach.

In summary, at the beginning of the program, we run a bootloader, which lets the prover set all registers and initialize the fresh memory with the content needed for this chunk. For this to be sound, (commitments of) the memory and register values have to be exposed as public outputs for the initial values of the chunk and the final values. With these commitments, the verifier can verify that a chunk starts from where the previous left off. The existing constraints of the zk-VM make sure that the transition from the initial to the final state was correct.

Memory content is organized in pages (e.g. of size 1KB, or 256 32-Bit words). The commitment of memory is a Merkle root hash of the list of pages. This way, the prover can sparsely page in the memory content needed for this chunk while still being able to prove that it is part of the committed memory.

Making it sound:

This takes the following steps:

Details:

This section makes some of the later items more concrete.

Merkle-proof the containment of each page in the memory commitment

This can be implemented in assembly as part of the bootloader. Once for each page, it would:

Hash pages at the end of the execution

This can be implemented by having a "shutdown routine" assembly program analogous to the bootloader that runs until the end of the execution. The tricky part is that we don't know at compile time when we need to jump there.

One solution would be to:

With these constraints, the only thing the prover can do is to jump to the shutdown routine just in the right time.

Update the Merkle root

Because we're currently implementing a 32-Bit RISC-V architecture, we can use a dense Merkle tree of $2^{22}$ leafs to represent the entire memory. In a dense Merkle tree, a single leaf can be updated by doing two Merkle proofs using the same siblings, for the old and new leaf value.

For this, it will be convenient to use the write-once memory introduced in #753, which lets the prover set free values while forcing the prover to provide the same value in different rows of the execution (because the program would read from the same address).

For example, the algorithm (run either in the bootloader or shutown routine) could be this:

  1. Prove membership of the first page in the initial memory Merkle tree (whose root hash is publicly exposed)
  2. Using the same siblings and the page hash at the end of the execution, compute the updated Merkle root (after updating just one page)
  3. Repeat steps 1-2 for all pages, continuously updating the Merkle root from step to step.
  4. Expose the last root hash as public output

To make this works, all page hashes (before and after the chunk execution) and all siblings need to be in write-once memory. This way, we can enforce that siblings are re-used. Also, the bootloader can already run this algorithm and access the page hashes after the execution, and the shutdown routine can just hash the pages and assert that the page hashes in write-once memory are indeed correct.

Make sure only paged-in memory are accessed

The algorithm in the previous section ensures that the same set of pages is written in the bootloader and used to update the memory hash. What's left to do is to prove that no other page is accessed during execution.

This could be implemented by adding a special mstore instruction only used by the bootloader, which sets a special flag to 1 in the memory machine (while the "normal" mstore instruction sets it to 0). Then, we can add a constraint in the memory machine that the first row of each address needs to have the flag set to 1.

georgwiese commented 8 months ago

Thinking about how to do the Merkle update proofs in the Bootloader. I see two possible approaches.

Setting

We have a Merkle tree of memory pages. The prover can specify the values of an arbitrary subset of those pages (which have to be proved to be in the tree) and hashes of the same pages at the end of the execution (which is validated by the shutdown routine). We want to show that updating the pages leads to a specific new Merkle hash.

Approach 1: Multi-proofs

If we adapt multi-proofs for the initial subset, we can use the same proof to compute the updated Merkle root, by plugging in the updated leaf hashes. This is an implementation of the verification algorithm.

Advantages:

Disadvantages:

Approach 2: Vanilla Merkle Proofs + chain of root hashes

The algorithm would be as follows:

This algorithm would be a thin wrapper around standard Merkle tree verification, which is already implemented in assembly. The prover would have to provide Merkle proofs for each page, but using the Merkle tree that has all previous pages updated already.

Conclusion

Approach 2 sounds simpler to me, so I think I'll try implementing that first.