rooch-network / rooch

VApp Container with Move Language
https://rooch.network
Apache License 2.0

L1 to L2 messaging retry mechanism #829

Open popcnt1 opened 8 months ago

popcnt1 commented 8 months ago

L1 to L2 messaging can fail due to:

  1. not enough gas (main reason)
  2. unusual state on L2

We need a mechanism that avoids losing assets (locked on L1, but failed to mint on L2).

approach:

  1. Add replay checking to the validate process. We need to record failed_relay_msg in the contract: if failed_relay_msg[hash(tx)] = true, the message can be replayed. What confuses me is that Optimism includes the gas_amount in the hash. If we do that too, how could we replay a message with more gas? After the gas amount is increased, the hash changes, so we would no longer find failed_relay_msg[hash(tx)] = true.
  2. Check it outside the contract (the rooch_node may cheat, and other nodes may have inconsistent state once we have a decentralized deployment).
  3. Check it in the user's contract. We could provide a module and functions, but we cannot control the user's behavior (we might use the compiler/MoveVM to enforce such a promise).
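
One possible answer to the question in approach 1 is to key the failure record on a hash that leaves the gas amount out, so a retry with more gas maps to the same entry. The Rust sketch below only illustrates that idea. It is not Rooch's or Optimism's implementation: `L1ToL2Msg`, its fields, and `FailedRelayRegistry` are hypothetical names, and `DefaultHasher` stands in for the cryptographic hash a real contract would use.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical L1 -> L2 message; the field names are illustrative only.
#[derive(Clone, Debug)]
struct L1ToL2Msg {
    nonce: u64,
    sender: String,
    target: String,
    payload: Vec<u8>,
    /// Deliberately excluded from the replay key.
    gas_amount: u64,
}

impl L1ToL2Msg {
    /// Key over every field except `gas_amount`.
    /// Assumption: DefaultHasher is a placeholder for a cryptographic hash.
    fn replay_key(&self) -> u64 {
        let mut h = DefaultHasher::new();
        self.nonce.hash(&mut h);
        self.sender.hash(&mut h);
        self.target.hash(&mut h);
        self.payload.hash(&mut h);
        h.finish()
    }
}

/// Hypothetical stand-in for the `failed_relay_msg` table.
#[derive(Default)]
struct FailedRelayRegistry {
    failed: HashSet<u64>,
}

impl FailedRelayRegistry {
    /// Remember that relaying this message failed.
    fn record_failure(&mut self, msg: &L1ToL2Msg) {
        self.failed.insert(msg.replay_key());
    }

    /// A replay is allowed only for a message previously recorded as failed.
    fn can_replay(&self, msg: &L1ToL2Msg) -> bool {
        self.failed.contains(&msg.replay_key())
    }

    /// Drop the entry once a replay succeeds; returns whether it existed.
    fn clear_on_success(&mut self, msg: &L1ToL2Msg) -> bool {
        self.failed.remove(&msg.replay_key())
    }
}
```

With this keying, a message that failed with `gas_amount: 10` is still found when it is resubmitted with `gas_amount: 1_000`, while a message with a different nonce is not.
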
jolestar commented 8 months ago
  1. An L1ToL2 tx does not need to pay gas because the gas has already been charged on L1. An attacker cannot use L1ToL2 txs to DDoS.
  2. If the tx aborts for other reasons, there is a bug in the contract. We need to fix the bug and upgrade the contract.
  3. We need to record the hash of every successfully processed L1ToL2 tx in the contract to prevent replay attacks.
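
Point 3 can be sketched as a set of processed tx hashes that rejects a second submission. This is a minimal Rust illustration under assumptions, not the actual contract: `ProcessedL1ToL2`, `RelayError`, and the 32-byte `TxHash` type are hypothetical names.

```rust
use std::collections::HashSet;

/// Assumed 32-byte hash of an L1ToL2 tx.
type TxHash = [u8; 32];

#[derive(Debug, PartialEq)]
enum RelayError {
    /// The same tx hash was already processed once.
    AlreadyProcessed,
}

/// Hypothetical store of processed L1ToL2 tx hashes.
#[derive(Default)]
struct ProcessedL1ToL2 {
    seen: HashSet<TxHash>,
}

impl ProcessedL1ToL2 {
    /// Record the hash, or reject it if it was recorded before.
    fn check_and_record(&mut self, hash: TxHash) -> Result<(), RelayError> {
        // HashSet::insert returns false when the value was already present.
        if self.seen.insert(hash) {
            Ok(())
        } else {
            Err(RelayError::AlreadyProcessed)
        }
    }
}
```
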
templexxx commented 8 months ago

Gas being burned on L1 doesn't mean that no gas is needed on L2. The gas burned on L1 is based on an estimate made on L1. But what if a user gives a very small min_gas_amount (the amount budgeted for the L2 cost) but calls a very expensive function on L2? L2 needs to check that the user has enough gas and ask the user to replay the message with higher gas (which also means burning more on L1). @jolestar

templexxx commented 8 months ago
>   1. An L1ToL2 tx does not need to pay gas because the gas has already been charged on L1. An attacker cannot use L1ToL2 txs to DDoS.
>
>   2. If the tx aborts for other reasons, there is a bug in the contract. We need to fix the bug and upgrade the contract.
>
>   3. We need to record the hash of every successfully processed L1ToL2 tx in the contract to prevent replay attacks.

For 3: the main issue is that execution begins right after the event is validated and the move_action (user contract) is obtained. So where should we put the contract logic that stores successful txs?

jolestar commented 8 months ago

> Gas being burned on L1 doesn't mean that no gas is needed on L2. The gas burned on L1 is based on an estimate made on L1. But what if a user gives a very small min_gas_amount (the amount budgeted for the L2 cost) but calls a very expensive function on L2? L2 needs to check that the user has enough gas and ask the user to replay the message with higher gas (which also means burning more on L1). @jolestar

In that case, a tx with OutOfGas status should be treated as successfully processed. The user needs to change the max_gas_amount (I still think the min_gas_amount should be max_gas_amount in the L2 tx) and send it again.

jolestar commented 8 months ago

> For 3: the main issue is that execution begins right after the event is validated and the move_action (user contract) is obtained. So where should we put the contract logic that stores successful txs?

In the post_execute function?
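
The two suggestions in this thread (treat OutOfGas as processed, and record the hash in a post-execution hook) could fit together as in the Rust sketch below. Everything here is an assumption made for illustration: `ExecStatus`, `ProcessedStore`, and the `post_execute` signature are hypothetical, and leaving other aborts unrecorded is only one possible reading of "fix the bug and upgrade the contract".

```rust
use std::collections::HashSet;

/// Assumed 32-byte hash of an L1ToL2 tx.
type TxHash = [u8; 32];

/// Hypothetical execution outcomes for an L1ToL2 tx.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ExecStatus {
    Executed,
    OutOfGas,
    /// Abort for any other reason, assumed here to indicate a contract bug.
    Aborted,
}

/// Hypothetical store updated after execution.
#[derive(Default)]
struct ProcessedStore {
    processed: HashSet<TxHash>,
}

impl ProcessedStore {
    /// Runs after execution; returns true if the hash was recorded.
    fn post_execute(&mut self, hash: TxHash, status: ExecStatus) -> bool {
        match status {
            // OutOfGas counts as processed: the user must send a new tx
            // with a higher gas amount instead of replaying this one.
            ExecStatus::Executed | ExecStatus::OutOfGas => {
                self.processed.insert(hash);
                true
            }
            // Assumption: other aborts are left unrecorded.
            ExecStatus::Aborted => false,
        }
    }

    /// Whether a tx hash has been recorded as processed.
    fn is_processed(&self, hash: &TxHash) -> bool {
        self.processed.contains(hash)
    }
}
```
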

templexxx commented 8 months ago

> > Gas being burned on L1 doesn't mean that no gas is needed on L2. The gas burned on L1 is based on an estimate made on L1. But what if a user gives a very small min_gas_amount (the amount budgeted for the L2 cost) but calls a very expensive function on L2? L2 needs to check that the user has enough gas and ask the user to replay the message with higher gas (which also means burning more on L1). @jolestar
>
> In that case, a tx with OutOfGas status should be treated as successfully processed. The user needs to change the max_gas_amount (I still think the min_gas_amount should be max_gas_amount in the L2 tx) and send it again.

It will be easier to estimate the gas cost on L1 from a min gas amount. The min gas amount is just the target for the L2 cost, so all we need to do on L1 is add the extra processing gas cost. If we use a max amount, how do we estimate the L2 cost on L1?

templexxx commented 8 months ago

In theory, retryable messaging is important when an asset is attached to the message: L1 has locked the asset, but the mint failed on L2 (because the gas was too low).

In practice, the default gas amount is far more than enough for an asset transfer. We could add this mechanism after the whole data flow is finished. Because it's an edge case, it doesn't affect the main path. @emptinesssubodhi