aurora-is-near / aurora-engine

⚙️ Aurora Engine implements an Ethereum Virtual Machine (EVM) on the NEAR Protocol.
https://doc.aurora.dev/develop/compat/evm
325 stars 78 forks source link

Pause Contract and Resume Contract #779

Closed Casuso closed 1 year ago

Casuso commented 1 year ago

Description

Feature to pause and resume the engine contract.

We added two public methods to pause and resume the engine contract. Only DAO can use these functionalities. We use an Engine State field as a flag to indicate paused state. All existing contract public methods check this flat before proceeding, and fail if contract is paused.

Performance / NEAR gas cost considerations

All public methods would need to read the Engine State now, but no impact on performance was visible.

Testing

Added new tests to cover the DAO requirement and the feature effect.

How to review

Please, take a look at the public methods on the contract: -All set methods should be affected by pausing. -All get methods should NOT be affected by pausing.

Additional information

This feature came as a necessity for the Aurora Hashchain integration. The decided strategy is to:

  1. upgrade the engine with the Hashchain inactive.
  2. pause the contract.
  3. call a special contract method added by the Hashchain upgrade, to pass the seed computed Hashchain value for the block, and to activate the Hashchain.
  4. resume the contract.
  5. upgrade the engine to remove unnecessary code and the special method.

There are more details about this but this is the general strategy.

Casuso commented 1 year ago

It seems that procedural macro #[required_running] might be a better solution rather than writing in every function: require_running(&state::get_state(&io).sdk_unwrap());. Also, I have a question about view methods. Why do we need to stop them if they don't modify the state?

The procedural macro is definitively a nice idea. The drawback is that multiple methods already need to use the Engine State, so we will end it up reading it two times. That is why in this PR I use two mechanisms:

  1. For methods that are not already using the Engine State: require_running(&state::get_state(&io).sdk_unwrap());

  2. For methods that are already using the Engine State: let state = state::get_state(&io).sdk_unwrap(); require_running(&state);

About the view methods, I agree that it is not necessary to pause them. I did it in all methods for the sake of leaving the contract in a concise tight "state", but we can change that for sure. I will prefer that @joshuajbouw and others share their thoughts on this too.

Casuso commented 1 year ago

Pausing view methods is overkill and might be actually harmful. Imagine the situation where the engine needs to be paused, after that, it makes it hard to observe the state of the engine even for internal purposes and to prepare for the unpausing workflow. Also, this might make an RPC fully down, not being able to read any information. I think the purpose of pausing is to make the contract read-only but not suspend all the operations.

Thanks for the feedback. Got also the same feedback from @joshuajbouw so I will change this behave.

Casuso commented 1 year ago

I made the changes so get methods are not affected by pausing. Please @aleksuss, @sept-en and @joshuajbouw, take a look at the list and let me know if you see an incorrect behave:

Yes(Y)/No(N) affected by pausing - method name - possible comment:

N - new - not a get method but seams reasonable not to be affected by pausing. N - get_version N - get_owner Y - set_owner N - get_bridge_prover N - get_chain_id N - get_upgrade_delay_blocks Y - set_upgrade_delay_blocks N - get_upgrade_index Y - stage_upgrade Y - deploy_upgrade N - state_migration - this method has no current implementation. Y - resume_precompiles Y - pause_precompiles N - paused_precompiles Y - pause_contract - this is the added method for pausing, requires running Y - resume_contract - this is the added method for resuming, requires paused Y - deploy_code Y - call Y - submit Y - submit_with_args Y - register_relayer Y - factory_update Y - factory_update_address_version Y - factory_set_wnear_address Y - fund_xcc_sub_account Y - ft_on_transfer Y - deploy_erc20_token Y - refund_on_error N - view N - get_block_hash N - get_code N - get_balance N - get_nonce N - get_storage_at N - begin_chain - this is behind #[cfg(feature = "evm_bully")] N - begin_block - this is behind #[cfg(feature = "evm_bully")] Y - new_eth_connector Y - set_eth_connector_contract_data Y - withdraw Y - deposit Y - finish_deposit N - is_used_proof N - ft_total_supply N - ft_total_eth_supply_on_near N - ft_total_eth_supply_on_aurora N - ft_balance_of N - ft_balance_of_eth Y - ft_transfer Y - ft_resolve_transfer Y - ft_transfer_call Y - storage_deposit Y - storage_unregister Y - storage_withdraw N - storage_balance_of N - get_paused_flags Y - set_paused_flags N - get_accounts_counter N - get_erc20_from_nep141 N - get_nep141_from_erc20 N - ft_metadata N - verify_log_entry - this is behind #[cfg(feature = "integration-test")] N - mint_account - this is behind #[cfg(feature = "integration-test")]

Casuso commented 1 year ago

New types of transactions should be added to the standalone. Take a look at this enum.

Thanks for highlighting this @aleksuss! Please take a look.

I'm not enforcing the pause/resume rules on the standalone. I think this is correct comparing to other requirements that are not enforced there. Let me know if this is not right.

aleksuss commented 1 year ago

As I know, we should add all kinds of transactions to the standalone for having the same state on the chain and in the borealis infrastructure. cc @birchmd @joshuajbouw

Casuso commented 1 year ago

As I know, we should add all kinds of transactions to the standalone for having the same state on the chain and in the borealis infrastructure. cc @birchmd @joshuajbouw

Maybe I didn't explain myself @aleksuss. I did add the Pause and Resume txs to the standalone so the state is coherent. What I didn't add was the rules that comes with pausing/resuming. E.g, if you call pause and then call submit, we will not check for the pausing state. This is only in the standalone, in the contract side, we are enforcing such rules.

I did not enforce these rules based on examples of other rules that are not enforced on the standalone. E.g, calling set_owner does not require any specific current account owner.

Is this correct?

aleksuss commented 1 year ago

@Casuso Take a look, please, at this method.

Casuso commented 1 year ago

@Casuso Take a look, please, at this method.

Talked with @aleksuss about this. It was not necessary for the standalone or pause tests but it is definitely nice to have. This way we are ready in case future tests need some other pausing flow.

Thanks for all the reviews @aleksuss!!! PR is better now for sure!