near / nearcore

Reference client for NEAR Protocol
https://near.org
GNU General Public License v3.0
2.31k stars 618 forks source link

test_utils::upgrade_protocol bumps to the latest protocol version instead of the provided one #8590

Open Ekleog-NEAR opened 1 year ago

Ekleog-NEAR commented 1 year ago

The code at https://github.com/ekleog/nearcore/commit/33bf68080dd25bbb50c68ba1068b9f217b88cdca fails with log lines:

Testing protocol upgrade between 47 and 48
Producing block with version 48
thread 'tests::client::features::wasmer2::test_wasmer2_upgrade' panicked at 'Expected to find 'vm_kind=Wasmer2' in logs, occurences of vm_kind are ["run code.len=99650 method_name=log_something vm_kind=NearVm current_protocol_version=58"]', integration-tests/src/tests/client/features/wasmer2.rs:84:5

Note in particular the current_protocol_version=58 here.

Analysis suggested by @jakmeier is as follows:

My theory why it doesn't work: Could it be that when we set the old version in genesis config, then block producers still set the latest protocol version to the latest version they support? That's at least what I would expect to happen. And then they already schedule a protocol upgrade right up to the latest version before the call to upgrade_protocol happens. The loop that fast-forwards through 2 epochs inside upgrade_protocol will then make it happen, regardless of what that one block that has latest version 48.

Longarithm commented 1 year ago

We may need to disable voting for latest protocol version in tests, because we do want to check only +1 protocol version switches. With flat storage features, TTN costs of same txns will change, which may break some existing tests using upgrade_protocol which is the right thing to use.

Ekleog-NEAR commented 1 year ago

Alternatively, if we go forward with limited-to-three replayability for the outside world, we can probably just remove all the tests that pertain to versions older than the latest we officially support, and that’ll both simplify our code and help us work around this issue :) (though it’d still be nice to actually fix it someday)

Longarithm commented 8 months ago

Got hit by this while adding CI for chain with custom protocol version https://github.com/near/nearcore/pull/10589

akhi3030 commented 6 months ago

Looks like we are hitting this issue again in #10979. If we are not planning on fixing this issue any time soon, I propose that we change https://github.com/near/nearcore/blob/4c0aa98cb7ecd5fd2a22c87788aa56d9223b7abb/chain/client/src/test_utils/test_env.rs#L438 to stop taking a protocol_version as an argument. We will then be forced to rewrite our tests to accept this limitation.

wacban commented 3 months ago

JFYI I'm going to disable the test_function_limit_change test when stabilizing the cc and sv features.

If the issue is uncontrolled protocol upgrades this can now be done more precisely with NEAR_TESTS_PROTOCOL_UPGRADE_OVERRIDE.