cartesi / rollups-contracts

Smart Contracts for Cartesi Rollups
https://cartesi.github.io/rollups-contracts/
Apache License 2.0

Merkelize inputs #114

Closed by guidanoli 6 months ago

guidanoli commented 1 year ago

📚 Context

Currently, we use the blockchain as a data availability layer for Cartesi DApps. This is achieved through a shared smart contract called InputBox, which maintains an append-only list of inputs for each DApp.

When an input is added, this smart contract emits an event with all the information needed to reconstruct the input off-chain. On-chain, only the hash of the input is kept in storage to save gas.
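The pattern described above can be sketched roughly as follows. This is an illustrative sketch only; the names, event fields, and hash encoding are assumptions, not the actual InputBox interface:

```solidity
// SPDX-License-Identifier: Apache-2.0
pragma solidity ^0.8.0;

// Rough sketch of the InputBox pattern; hypothetical names and
// signatures, not the real contract.
contract InputBoxSketch {
    // One append-only list of input hashes per DApp.
    mapping(address => bytes32[]) internal inputHashes;

    // The full input is emitted so it can be reconstructed off-chain.
    event InputAdded(
        address indexed dapp,
        uint256 indexed index,
        address sender,
        bytes input
    );

    function addInput(address dapp, bytes calldata input)
        external
        returns (bytes32)
    {
        bytes32 inputHash = keccak256(abi.encode(msg.sender, input));
        // Only the hash is kept in storage, to save gas.
        inputHashes[dapp].push(inputHash);
        emit InputAdded(dapp, inputHashes[dapp].length - 1, msg.sender, input);
        return inputHash;
    }
}
```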

On October 26th, @GCdePaula mentioned that Dave would much benefit if the InputBox stored the Merkle root of the input instead of the hash.

✔️ Solution

Modify the InputBox so that the input hash stored on-chain is the Merkle root of the input. We should use the MerkleV2 library for this purpose.

augustoteixeira commented 1 year ago

As I mentioned on Discord, this would probably increase the cost of inputs quite a bit.

Why don't we just hash it and require that its length is not too large, so that we are sure that we can merkelize it later?

We could even store on-chain a map like this:

  mapping(bytes32 => bytes32) merkelized;

This could be permissionless, so that anyone who wanted to could call the following function:

function storeMerkle(bytes calldata input) external {
  require(input.length <= MAX_LEN, "Input too large");
  bytes32 k = keccak256(input);
  bytes32 m = merkle(input); // compute the Merkle root of the input
  merkelized[k] = m;
}

Then, the Solidity step will expect this mapping to have been filled beforehand, and it will simply ask for the Merkle version of the input.
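On the consuming side, the idea is that a dispute contract looks the root up instead of merkelizing on the fly. A rough sketch, assuming the merkelized mapping proposed above (hypothetical names):

```solidity
// Hypothetical lookup used by the Solidity step: it assumes someone
// has already called storeMerkle for this input.
function getMerkleRoot(bytes32 inputHash) internal view returns (bytes32) {
    bytes32 root = merkelized[inputHash];
    // A zero root means storeMerkle was never called for this input.
    require(root != bytes32(0), "Merkle root not stored");
    return root;
}
```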

guidanoli commented 1 year ago

As I mentioned on Discord, this would probably increase the cost of inputs quite a bit.

From a quick benchmark, it seems that this change would increase the gas cost of adding inputs by 190% to 840%. The larger the input, the more expensive it is to calculate its Merkle root, which is expected.

| Input size (bytes) | Hash (gas) | Merkle (gas) | Gas cost increase |
|-------------------:|-----------:|-------------:|------------------:|
| 1 | 53032 | 152960 | 188.43% |
| 2 | 53044 | 152972 | 188.39% |
| 4 | 53068 | 152996 | 188.30% |
| 8 | 53116 | 153044 | 188.13% |
| 16 | 53212 | 153140 | 187.79% |
| 32 | 53407 | 153332 | 187.10% |
| 64 | 54193 | 166573 | 207.37% |
| 128 | 55766 | 177880 | 218.98% |
| 256 | 58911 | 203708 | 245.79% |
| 512 | 65201 | 261481 | 301.04% |
| 1024 | 77785 | 377171 | 384.89% |
| 2048 | 102965 | 610069 | 492.50% |
| 4096 | 153373 | 1078533 | 603.21% |
| 8192 | 254381 | 2029784 | 697.93% |
| 16384 | 457165 | 3988481 | 772.44% |
| 32768 | 865805 | 8131380 | 839.17% |

Note: This would be greatly amortized with input aggregation.

augustoteixeira commented 1 year ago

I don't understand why input aggregation would amortize this.

I agree that aggregation amortizes the fixed costs of adding inputs, but for merkelization it is the opposite, no? As we aggregate more inputs, their total size increases and we start paying up to 800% more gas.

One could, for example, add automatic merkelization in the first version if it simplifies the disputes. But in the long run, I see this as something to be changed, especially because of the following argument.

If an application is launched on an L2, it pays for data availability on L1 and for computation on L2. If EIP-4844 or sharding makes DA cheaper, the computational L2 cost will account for the majority of the fees to run the application. And this merkelization is one such overhead.

guidanoli commented 1 year ago

@augustoteixeira I agree with you, and this input-hash-to-Merkle-root oracle seems like a nice workaround indeed. In this case, we would still need to limit the size of the input, right? I can run some benchmarks and figure out the maximum input size for which this storeMerkle function consumes a manageable amount of gas.

augustoteixeira commented 1 year ago

Such a limit can be modified very easily in the future. One could start with 16k, which looks pretty healthy from your tests (comfortably below the 15M gas target on L1).

What do you think?

Again, this separation of hashing and merkelization can be delayed to the future (and one could start with auto-merkelization) if it simplifies things.

guidanoli commented 1 year ago

@augustoteixeira I agree with limiting the input size while still storing only its keccak256 hash, and having a separate contract that computes the Merkle root of blobs, which can be used by Dave.

But maybe @GCdePaula has a different position on this, I don't know.

guidanoli commented 7 months ago

@GCdePaula Would this contract that translates keccak256 hashes into Merkle root hashes be part of Dave or rollups-contracts? In the former case, we can close this issue.

GCdePaula commented 6 months ago

I'd say part of Dave!