Closed: guidanoli closed this issue 8 months ago
Isn't it strange to have an index pair with each element being resolved in a different way? We have four options, right?

- `output index`, bitmask `input index`
- `output index`, mapping `input index`

Do we have an idea about the gas cost differences?
My idea was the following:

```solidity
mapping(uint256 => BitMaps.BitMap) internal voucherBitmaps;
```

Then, the `wasVoucherExecuted` function would look like this:

```solidity
using BitMaps for BitMaps.BitMap;

function wasVoucherExecuted(
    uint256 _inputIndex,
    uint256 _outputIndexWithinInput
) external view override returns (bool) {
    return voucherBitmaps[_outputIndexWithinInput].get(_inputIndex);
}
```
This also saves us some gas:
| Contract | Function | Min | Avg | Max |
|---|---|---|---|---|
| CartesiDApp | | -39840 (-3.03%) | | |
| CartesiDApp | executeVoucher | -1826 (-3.13%) | -3882 (-3.69%) | |
| CartesiDApp | wasVoucherExecuted | -795 (-56.30%) | -1629 (-55.94%) | -3295 (-55.73%) |
IMO, this solution is as complex as the current implementation, since the current implementation is simply one whole bitmask. But it is definitely much safer and more gas-efficient. So I support this issue. The even more gas-efficient path would be to unify the output index, which means making it global rather than indexing it under inputs. Do I remember correctly that someone proposed this "unified output index" before? If that were to come true, then it would again be simply one bitmask, indexed by the global output index.
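For concreteness, here is a minimal Python model (names are illustrative, not actual contract code) of what the "unified output index" idea would look like: a single bitmask keyed by a global output index, with one 256-bit word per bucket modeling one storage slot.

```python
# Hypothetical model of a single bitmask keyed by a *global* output index,
# as things might look after output unification: one bit per output.
class GlobalBitmask:
    def __init__(self):
        self.buckets = {}  # bucket index -> 256-bit word (models a storage slot)

    def set(self, output_index):
        bucket, bit = divmod(output_index, 256)
        self.buckets[bucket] = self.buckets.get(bucket, 0) | (1 << bit)

    def get(self, output_index):
        bucket, bit = divmod(output_index, 256)
        return bool(self.buckets.get(bucket, 0) & (1 << bit))

mask = GlobalBitmask()
mask.set(300)
print(mask.get(300), mask.get(301))  # True False
print(len(mask.buckets))             # 1 (outputs 256..511 share one slot)
```

Consecutive global output indices pack 256 to a slot, which is why this layout is efficient regardless of how many outputs each input produces.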
@ZzzzHui Yes, things will get even better with Output Unification v2!
The current implementation is good for inputs that have lots of outputs. The proposed solution in this issue is good for inputs that have only a few outputs. The future solution with Output Unification v2 is the best in both scenarios. If Output Unification v2 isn't too far in the future, we could also consider doing the optimization then.
> The current implementation is good for inputs that have lots of outputs.

Why?
> If Output Unification v2 isn't too far in the future, we could also consider doing the optimization then.

I don't see why we should wait until Output Unification v2 to do this optimization. It will probably take some time to be implemented across the whole Cartesi Rollups stack.
> The current implementation is good for inputs that have lots of outputs.

Why?
Because that scenario makes use of the storage to the fullest extent, no? For example, input 0 with 256 outputs occupies only 1 storage slot, while the proposed solution occupies 256 storage slots. On top of that, input 1 with 512 outputs occupies 2 slots, while the proposed occupies another 256 empty storage slots. And so on. Of course, this is just an extreme example to explain what I meant.
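To make the slot count concrete: the packing described in this comment assumes keys of the form $2^{128} \cdot inputIndex + outputIndex$ (an assumption that gets revisited further down the thread). Under that layout, a quick Python check (illustrative names) confirms the "1 slot for 256 outputs" figure:

```python
# Slot count for "input 0 with 256 outputs" under the key layout
# position = 2**128 * input_index + output_index, with 256-bit slots
# (bucket = position >> 8). This is the layout assumed in the comment above.
def bucket(input_index, output_index):
    position = (input_index << 128) + output_index
    return position >> 8  # one bucket = one 256-bit storage slot

slots = {bucket(0, o) for o in range(256)}
print(len(slots))  # 1: all 256 bits fall into a single slot
```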
> I don't see why we should wait until Output Unification v2 to do this optimization. It will probably take some time to be implemented across the whole Cartesi Rollups stack.
Ok. We should get it done then :)
> input 0 with 256 outputs occupies only 1 storage slot
I don't think so. Currently, the position of a voucher in the bitmap is calculated like so... (source)
$$ position = 2^{128} \cdot outputIndex + inputIndex $$
Now, in the implementation of `Bitmask`, you can see that bit positions can be broken down like so... (source)
$$ position = 2^{8} \cdot bucketIndex + bitIndex $$
So, if $inputIndex = 0$, you'll have...
$$ position = 2^{128} \cdot outputIndex $$
Breaking it down between $bucketIndex$ and $bitIndex$, we have...
$$ 2^{128} \cdot outputIndex = 2^{248} \cdot bucketIndex + bitIndex $$
Since $bitIndex < 2^8$, we can safely say $bitIndex = 0$. With this we have...
$$ 2^{128} \cdot outputIndex = 2^{8} \cdot bucketIndex $$
Simplifying even further, we have...
$$ bucketIndex = 2^{120} \cdot outputIndex $$
So, the `Bitmask` implementation will use a different bucket for each of the 256 outputs of input 0, and, therefore, 256 different storage slots. So there is no reuse of buckets between outputs of the same input.
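The conclusion can also be checked numerically. This Python sketch models the actual key layout, $position = 2^{128} \cdot outputIndex + inputIndex$, together with the bucket split `position >> 8`:

```python
# With the layout position = 2**128 * output_index + input_index,
# each output of input 0 lands in its own bucket (bucket = position >> 8),
# so no two outputs of the same input ever share a storage slot.
def bucket(output_index, input_index):
    position = (output_index << 128) + input_index
    return position >> 8

buckets = {bucket(o, 0) for o in range(256)}
print(len(buckets))  # 256 distinct buckets -> 256 storage slots
assert all(bucket(o, 0) == o << 120 for o in range(256))
```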
> Currently, the position of a voucher in the bitmap is calculated like so... (source)
>
> $position = 2^{128} \cdot outputIndex + inputIndex$
My mistake. I thought it was $position = 2^{128} \cdot inputIndex + outputIndex$. Sorry about that. That might be the impression I got from the context of this issue, which says
> The 256-bit key to this bit mask is a concatenation of the 128-bit input index and 128-bit output index (within input).

So I thought the input index came first and then the output index.

> Solution ... We index first by output index because inputs normally generate only a few vouchers.
I mistakenly thought this was the main source of gas savings. But this is the same as the current implementation. So the gas was saved because there's no need to do the calculation `(voucher << 128) | input`, right?

> So the gas was saved because there's no need to do this calculation `(voucher << 128) | input`, right?
OpenZeppelin's implementation could also be more gas efficient.
I see. The gas savings shown above are the combination of this issue and the other issue.

> I see. The gas savings shown above are the combination of this issue and the other issue.
Yes, sorry for the lack of clarity.
📚 Context
Currently, we are storing whether a voucher was executed or not in a bit mask. The 256-bit key to this bit mask is a concatenation of the 128-bit input index and 128-bit output index (within input). This method is not only potentially unsafe but also unnecessarily complex.
✔️ Solution
We can maintain the locality of bits among vouchers of the same output index, while simplifying the on-chain logic. So, instead of having a single bit mask for all vouchers, we can have a mapping of bit masks, indexed by the output index. We index first by output index because inputs normally generate only a few vouchers. With this, there is no need to calculate voucher "positions" in the bit mask.
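As a rough Python model (names illustrative, not the actual contract code), the proposed layout keeps an independent bitmap per output index, so the call site does no position arithmetic at all:

```python
# Illustrative model of the proposed storage layout:
# voucher_bitmaps[output_index_within_input] is a bitmap over input indices,
# so no 256-bit "position" key needs to be computed.
from collections import defaultdict

voucher_bitmaps = defaultdict(lambda: defaultdict(int))  # output idx -> bucket -> word

def set_executed(input_index, output_index_within_input):
    bucket, bit = divmod(input_index, 256)
    voucher_bitmaps[output_index_within_input][bucket] |= 1 << bit

def was_voucher_executed(input_index, output_index_within_input):
    bucket, bit = divmod(input_index, 256)
    return bool(voucher_bitmaps[output_index_within_input][bucket] & (1 << bit))

set_executed(7, 0)
print(was_voucher_executed(7, 0), was_voucher_executed(8, 0))  # True False
```

Because inputs normally generate only a few vouchers, most lookups hit the low output indices, where the bitmap over input indices stays densely packed.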