Closed gligneul closed 2 months ago
@diegonehab @GCdePaula @guidanoli
Nice writeup! I think I just disagree with the last part. In any case, I'll list all changes, let me know if I got anything wrong.
I believe there are four main changes compared with the current implementation after unifying the outputs:
@gligneul I think (3) is the only disagreement. Did I get the rest correctly?
@augustoteixeira I think you may find this interesting.
@GCdePaula it is possible for the claim to be just the machine hash. However, if we do that, when proving an output, we also need the proof that the output tree root is inside the machine. I don't think this is a problem.
@gligneul Yeah, it would need an extra proof.
Like, I guess with the current epoch also kinda needs a proof, which is just two hashes (output trees hash and machine state hash).
With this change, you'd need a much larger merkle proof. This would suffer of the "32-bytes hash split in four words and merkleized" thing. Suppose the location of the output hash is well aligned. The function would receive the outputs tree hash (TH) and a merkle proof of a memory range of four words. We first merkleize TH in the granularity of words (MTH), then use the merkle proof to prove that MTH is inside the machine.
As an optimization, since the proof and outputs tree is constant, when a claim is made, we could pass the proof along with the claim and have the smart contract validate the proof and save the extracted outputs tree.
Great, @gligneul!
I think that this will indeed simplify a lot all the pieces of the architecture. From the nodes, to the disputes.
Great job! Super interesting proposal. This would significantly enhance the current system.
However, I'd like to offer a different perspective on the part where you mentioned, "Once a claim is accepted after the dispute period, the previous one can be discarded because validating any previous output with the latest claim is possible."
My main concern is that this approach would require users to continuously update their proofs for every output they wish to execute. For instance, if a user intends to execute an old output for which they already have a proof, they would need to resync their entire dapp to the most updated state. This could be cumbersome and might deter seamless interactions. Moreover, even if your node is already up to date, there's a potential risk of getting frontrunned by a fresh claim, leading to a transaction getting reverted.
A potential remedy I'd suggest is to retain all claims rather than overwriting them. In this way, proofs could target any valid historical claim, offering more flexibility and less dependency on the latest claims. The executeOutput
function signature could be something like: executeOutput(claim, proof, output)
, where the system would merely verify if the claim
exists within the storedall_claims
.
This might seem like a slight optimization on the surface, but the added convenience of executing old outputs without necessitating a full dapp sync could be invaluable for end-users.
Again, great work on the proposal! I'm eager to see how this evolves.
I agree with @felipeargento that we should allow users to prove the validity of outputs with any previously submitted claim, for usability reasons.
By the way, for anyone who might be interested in the details of the discussion and wants to see more material.
The idea detailed by @gligneul has inspiration in one from @GCdePaula, described in this Research Meeting video:
https://drive.google.com/file/d/1jG-o8I8vwDw3ka6wYrISj4SuQIpX6Sbj/view?usp=sharing
More precisely at 26:40
📚 Context
The output proofs in the Cartesi rollups are too complex to understand, so I'm proposing a significant simplification.
A single tree to rule them all
The Cartesi rollups should have a single Merkle tree that contains all the DApp outputs. The leaves of this tree will have 32 bytes and will be the output hash. This tree starts empty when the DApp starts. As the DApp advances, the rollups will add the outputs to the tree, one after another. There will be no separation per input or epoch. Instead, the tree will hold all outputs since genesis.
Inside the Cartesi machine, the Rollups C library will maintain an optimized version of this tree. This optimized version can compute the tree root hash as new outputs are added. It will only require log n space and time complexity over the size of the tree to compute the root hash. The Cartesi machine should provide a way to obtain the tree's root hash.
The rollups reader node will maintain a full version of the tree. The reader will update this tree alongside the one in the Cartesi Machine, which should yield the same root hash. Every time a new output is added to the tree, the root hash changes, so keeping the full tree to compute the proof of a given output is necessary. We can optimize this tree by assuming that all empty leaves are zero. However, storing the tree will take O(n) space if it is full.
No more epochs
The rollups claim should be the hash of the Cartesi Machine plus the hash of the outputs Merkle tree. The claim can be done at any input (I think?). Once a claim is accepted after the dispute period, the previous one can be discarded because validating any previous output with the latest claim is possible. Remember, the output Merkle tree contains all the outputs. The reader node should be aware of the current claim in the blockchain to provide the correct proof for the outputs. So, we need to identify the claim by the input index.