Near-One / near-plugins

Implementation of common patterns used for NEAR smart contracts.
Creative Commons Zero v1.0 Universal
27 stars 12 forks source link

`Upgradable` parameter and result serialization of code #77

Open mooori opened 1 year ago

mooori commented 1 year ago

Upgradable::up_stage_code uses borsh parameter serialization:

https://github.com/aurora-is-near/near-plugins/blob/c043add4c2a0810872c4326a55c5162bccd3f4ee/near-plugins-derive/src/upgradable.rs#L33

However, this isn’t documented:

https://github.com/aurora-is-near/near-plugins/blob/1cf0aa61e38d3b02803642983795c927598375ba/near-plugins/src/upgradable.rs#L56-L58

So most likely users will assume default json parameter serialization and then wonder why deserialization fails when calling up_stage_code in a transaction.


Either we should document this or change up_stage_code to use json serialization. If I remember correctly, this is the only method provided by near-plugins which does not use json parameter serialization. Even if less efficient, perhaps we should switch to json here for a consistent and intuitive API?

Edit: Another advantage of json is that it allows using near-cli to stage code, which is problematic with borsh as described here (section up_stage_code).

cc @birchmd

mooori commented 1 year ago

Also the borsh serialization of up_staged_code isn’t documented:

https://github.com/aurora-is-near/near-plugins/blob/1cf0aa61e38d3b02803642983795c927598375ba/near-plugins/src/upgradable.rs#L60-L61

https://github.com/aurora-is-near/near-plugins/blob/c043add4c2a0810872c4326a55c5162bccd3f4ee/near-plugins-derive/src/upgradable.rs#L41-L44

Since it’s related to up_stage_code, I think we can include it in this issue and handle it in the same way (document or switch to json).

birchmd commented 1 year ago

Good catch @mooori . I think it makes sense to change the serialization to use JSON (bytes encoded as base64) to be consistent with everything else. If the efficiency is really important to some use-case they can always override the default implementation correct?

mooori commented 1 year ago

It surfaced in integration tests and surprised me there.

If someone implements the Upgradable trait themselves, they can choose another serialization protocol. Using #[derive(Upgradable)] and overriding just a single trait method might be tricky or even impossible.

Another simple way to use borsh instead of json would be adding a wrapper like this to the contract:

pub fn borsh_up_stage_code(&mut self, #[serializer(borsh)] code: Vec<u8>) {
    self.up_stage_code(code);
}

Given these options I think for now it’s fine to switch to json. If efficiency of these methods becomes a frequent issue, we could let users configure the serialization protocol via a macro attribute:

#[derive(Upgradable)]
#[upgradable(code_serialization(borsh))
pub struct Contract { /* ... */ }
karim-en commented 8 months ago

@birchmd @mooori We encountered with bad user experience when using the DAO to upgrade our contract with the plugin due to the borsh.

Some background:

  1. To upgrade the contract with DAO the developer needs to store the blob of binary in the DAO, but the binary should be encoded in the borsh format so (4 bytes vec len + binary bytes)

    go install github.com/JonathanLogan/nearcall/cmd/nearcall@0.0.3
    $HOME/go/bin/nearcall -argfile res/bridge_token_factory.wasm -method store_blob -network mainnet -receiver rainbowbridge.sputnik-dao.near -signer SOME_SIGNER.near -out borsh -deposit 10000000000000000000000000
  2. Then a proposal should be created

    NEAR_ENV=mainnet near call rainbowbridge.sputnik-dao.near add_proposal '{"proposal":{"description":"Bridge factory 0.2.2","kind":{"UpgradeRemote":{"receiver_id":"factory.bridge.near","method_name":"up_stage_code","hash":"ELCgdcNckX3ZjrjRdfKXrX8duGTL1S1LSgbDCWv2RzA"}}}}' --gas 300000000000000 --accountId SOME_SIGNER.near --deposit 1 --useLedgerKey

The problem here is that the hash is not of the binary wasm file, but it should be the hash of the serialized borsh bytes. So The DAO signers will need to use some extra tool to validate that the hash of the binary is the same as the one that stored in the repo or release links. To fix this, I would recommend to read the bytes directly from the env::input let code = env::input().unwrap_or_else(|| panic!("ERR_NO_INPUT"));.

But there is another solution, we can completely avoid the dao upgrade logic, to be able to do that, we need to add an additional Option argument hash to the fn up_deploy_code(&mut self) -> Promise; so the DAO will be able to execute the deploy proposal only if the hashes are much.

mooori commented 8 months ago

@karim-en thanks for reporting the issue and proposing solutions!

read the bytes directly from the env::input

When env::input() is used internally, extending up_stage_code to have other arguments besides code would become tricky. Since then we would need to find a way to distinguish which bytes from env::input() are code and which bytes are other arguments.

add an additional Option argument hash to the fn up_deploy_code

Just to double check that I understood correctly, the proposal is to modify up_deploy_code as follows?

fn up_deploy_code(&mut self, function_call_args: Option<FunctionCallArgs>, hash: Option<String>) {
    //...
    if let Some(hash) = hash {
        // Panic if `hash` does not correspond to the hash of the staged code
    }
    //...
}

That would sound like a reasonable approach to me.

karim-en commented 8 months ago

@mooori Yes, exactly. Also, another topic related to damaging the contract due to storage layout changes, I haven't tested this scenario yet, but I think that the upgradeable plugin will panic if the storage layout is accidentally changed, I propose finding a way to ignore the state when interacting with upgradable methods, eg use #[init(ignore_state)] or something else to ignore the storage state.

mooori commented 8 months ago

My understanding of #[init] is that it is supposed to be used on initialization and migration methods only. Though documentation on #[init] is sparse, so maybe I’m missing something.

The upgradable plugin provides a mechanism to handle state migrations: The function up_deploy_code takes an optional function_call_args parameter that allows to specify a function call to be batched with the deployment of the new code. If that function call fails, the deployment is rolled back and the old code remains active.

The attached call can be to a migration function which uses #[init(ignore_state)]. These two tests show it in action, migrating state once successfully and once with failure. The called migration functions are located here.

@karim-en I noticed you mentioned the function signature fn up_deploy_code(&mut self) without the parameter function_call_args. Perhaps you are using a previous version of near-plugins that did not yet have this migration and rollback mechanism.

karim-en commented 8 months ago

@mooori Yes, that is true that the #[init] is related to the initialization, but I guess that the second part of this macro ignore_state is what we need, I think we can expand the contract to see how #[init(ignore_state)] is implemented to ignore the state.

Yeah, I am aware of the deploy & migrate functionally, Astro dao also has a similar approach, but this doesn't protect us from deploying a problematic contract by mistake, if the full access keys were removed and there is no correct migration function then there is no way to recover the contract.

karim-en commented 8 months ago

Here is the expand of this function:

    #[private]
    #[init(ignore_state)]
    pub fn migrate() -> Self {
    }
#[cfg(target_arch = "wasm32")]
#[no_mangle]
pub extern "C" fn migrate() {
    near_sdk::env::setup_panic_hook();
    if near_sdk::env::current_account_id() != near_sdk::env::predecessor_account_id() {
        near_sdk::env::panic_str("Method migrate is private");
    }
    if near_sdk::env::attached_deposit() != 0 {
        near_sdk::env::panic_str("Method migrate doesn't accept deposit");
    }
    let contract = FastBridge::migrate();
    near_sdk::env::state_write(&contract);
}

As you can see the generated code doesn't contain near_sdk::env::state_read().unwrap_or_default(); so the near-sdk just skip generating this line of code. So we can use this kind of raw function like in the Aurora engine pub extern "C" fn ....

Btw, if it is problematic to do that in the plugins themselves, then we could add this requirement just to the attach_full_access_key. But the problem here is that attach_full_access_key itself uses the access_control_any so it could fail on access check step.

mooori commented 8 months ago

Yeah, I am aware of the deploy & migrate functionally, Astro dao also has a similar approach, but this doesn't protect us from deploying a problematic contract by mistake, if the full access keys were removed and there is no correct migration function then there is no way to recover the contract.

A failure of the migration function causes the upgrade to roll back if the migration function is called via the function_call_args of up_deploy_code.

As you can see the generated code doesn't contain near_sdk::env::state_read().unwrap_or_default(); so the near-sdk just skip generating this line of code.

Such an approach in near-plugins would be tricky to maintain: To know what ignore_state does we would need to monitor the source code of near-sdk or compare macro expansions of #[init] and #[init(ignore_state)]. Applying the ignore_state behavior would also be complicated by some plugins building on top of others, as you mentioned.

So we can use this kind of raw function like in the Aurora engine pub extern "C" fn ....

Currently we expect plugins to be used within implementation blocks with #[near-bindgen]. However using pub extern "C" fn ... inside near-bindgen blocks is not allowed:

error: Contract API is not allowed to have binary interface.
  --> src/lib.rs:61:9
   |
61 |     pub extern "C" fn foobar() {}
   |         ^^^^^^

With raw functions like that, using near-plugins would become more difficult, as sometimes it needs to be within implementation blocks with near-bindgen, sometimes outside of them.