cosmos / cosmos-sdk

A Framework for Building High Value Public Blockchains
https://cosmos.network/
Apache License 2.0

Failure to migrate auth module when upgrading to cosmos-sdk 0.46 #13314

Closed: yihuang closed this issue 1 year ago

yihuang commented 1 year ago

Summary of Bug

When upgrading our testnet to cosmos-sdk 0.46, two users reported the following error, while the other nodes upgraded without problems.

The auth migration hits a nil value while iterating over accounts. I'm not sure whether this is accidental db corruption or a real issue.

A quick and dirty solution could be to ignore the nil value during migration, but that might cause an app hash mismatch instead.

Sep 14 17:50:50 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF applying upgrade "v0.9.0" at height: 5138880
Sep 14 17:50:50 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF migrating module authz from version 1 to version 2
Sep 14 17:50:50 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF migrating module bank from version 2 to version 3
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF migrating module evm from version 1 to version 2
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF migrating module evm from version 2 to version 3
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF migrating module feegrant from version 1 to version 2
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF adding a new module: feeibc
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF migrating module feemarket from version 1 to version 2
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF migrating module feemarket from version 2 to version 3
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF migrating module gov from version 2 to version 3
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF migrating module staking from version 2 to version 3
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF migrating module transfer from version 1 to version 2
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF migrating module upgrade from version 1 to version 2
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: 5:50PM INF migrating module auth from version 2 to version 3
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: panic: value is nil
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: goroutine 1 [running]:
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/cosmos/cosmos-sdk/store/types.AssertValidValue(...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/cosmos/cosmos-sdk/store/types/validity.go:13
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/cosmos/cosmos-sdk/store/gaskv.(*Store).Set(0x4001462cb0?, {0x40369937c0?, 0x99?, 0x99?}, {0x0?, 0x40396d95f0?, 0x40264e42d0?})
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/cosmos/cosmos-sdk/store/gaskv/store.go:49 +0x118
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/cosmos/cosmos-sdk/x/auth/migrations/v046.mapAccountAddressToAccountID({{0x345b060, 0x40001a8000}, {0x346bca8, 0x4026521f80}, {{0xb, 0x0}, {0x40264e42d0, 0x13}, 0x4e69c0, {0x1c42fc6f, ...}, ...}, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/cosmos/cosmos-sdk/x/auth/migrations/v046/store.go:20 +0x134
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/cosmos/cosmos-sdk/x/auth/migrations/v046.MigrateStore(...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/cosmos/cosmos-sdk/x/auth/migrations/v046/store.go:30
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/cosmos/cosmos-sdk/x/auth/keeper.Migrator.Migrate2to3({{{0x3440548, 0x400644f3b0}, {0x3469f68, 0x4001462cb0}, {{0x3469f68, 0x4001462cb0}, 0x4000010f88, {0x3440548, 0x400644f420}, {0x3440598, ...}, ...}, ...}, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/cosmos/cosmos-sdk/x/auth/keeper/migrations.go:50 +0x54
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/cosmos/cosmos-sdk/types/module.configurator.runModuleMigrations({{_, _}, {_, _}, {_, _}, _}, {{0x345b060, 0x40001a8000}, {0x346bca8, ...}, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/cosmos/cosmos-sdk/types/module/configurator.go:110 +0x28c
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/cosmos/cosmos-sdk/types/module.Manager.RunMigrations({_, {_, _, _}, {_, _, _}, {_, _, _}, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/cosmos/cosmos-sdk/types/module/module.go:452 +0x3bc
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/crypto-org-chain/cronos/app.(*App).RegisterUpgradeHandlers.func1({{0x345b060, 0x40001a8000}, {0x346bca8, 0x4026521f80}, {{0xb, 0x0}, {0x40264e42d0, 0x13}, 0x4e69c0, {0x1c42fc6f, ...}, ...}, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/app/upgrades.go:17 +0xd4
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/cosmos/cosmos-sdk/x/upgrade/keeper.Keeper.ApplyUpgrade({{_, _}, _, {_, _}, {_, _}, _, {_, _}, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/cosmos/cosmos-sdk/x/upgrade/keeper/keeper.go:337 +0x128
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/cosmos/cosmos-sdk/x/upgrade.BeginBlocker({{_, _}, _, {_, _}, {_, _}, _, {_, _}, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/cosmos/cosmos-sdk/x/upgrade/abci.go:62 +0x4c4
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/cosmos/cosmos-sdk/x/upgrade.AppModule.BeginBlock(...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/cosmos/cosmos-sdk/x/upgrade/module.go:134
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/cosmos/cosmos-sdk/types/module.(*Manager).BeginBlock(_, {{0x345b060, 0x40001a8000}, {0x346bca8, 0x4026521f80}, {{0xb, 0x0}, {0x40264e42d0, 0x13}, 0x4e69c0, ...}, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/cosmos/cosmos-sdk/types/module/module.go:481 +0x2c8
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/crypto-org-chain/cronos/app.(*App).BeginBlocker(_, {{0x345b060, 0x40001a8000}, {0x346bca8, 0x4026521f80}, {{0xb, 0x0}, {0x40264e42d0, 0x13}, 0x4e69c0, ...}, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/app/app.go:776 +0x64
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).BeginBlock(_, {{0x4026522320, 0x20, 0x20}, {{0xb, 0x0}, {0x40264e42d0, 0x13}, 0x4e69c0, {0x1c42fc6f, ...}, ...}, ...})
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/cosmos/cosmos-sdk/baseapp/abci.go:200 +0x664
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/tendermint/tendermint/abci/client.(*localClient).BeginBlockSync(_, {{0x4026522320, 0x20, 0x20}, {{0xb, 0x0}, {0x40264e42d0, 0x13}, 0x4e69c0, {0x1c42fc6f, ...}, ...}, ...})
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/tendermint/tendermint/abci/client/local_client.go:280 +0x110
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/tendermint/tendermint/proxy.(*appConnConsensus).BeginBlockSync(_, {{0x4026522320, 0x20, 0x20}, {{0xb, 0x0}, {0x40264e42d0, 0x13}, 0x4e69c0, {0x1c42fc6f, ...}, ...}, ...})
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/tendermint/tendermint/proxy/app_conn.go:81 +0x50
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/tendermint/tendermint/state.execBlockOnProxyApp({0x345c4b8?, 0x4031a8ec00}, {0x3463978, 0x40252f7760}, 0x4000dfcd20, {0x346b3b0, 0x40c64c4678}, 0x4e69bf?)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/tendermint/tendermint/state/execution.go:307 +0x2c8
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/tendermint/tendermint/state.(*BlockExecutor).ApplyBlock(_, {{{0xb, 0x0}, {0x40252f1a50, 0x7}}, {0x4024d20168, 0x13}, 0x1, 0x4e69bf, {{0x4024e58200, ...}, ...}, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/tendermint/tendermint/state/execution.go:140 +0xf4
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/tendermint/tendermint/consensus.(*Handshaker).replayBlock(_, {{{0xb, 0x0}, {0x40252f1a50, 0x7}}, {0x4024d20168, 0x13}, 0x1, 0x4e69bf, {{0x4024e58200, ...}, ...}, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/tendermint/tendermint/consensus/replay.go:503 +0x19c
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/tendermint/tendermint/consensus.(*Handshaker).ReplayBlocks(_, {{{0xb, 0x0}, {0x40252f1a50, 0x7}}, {0x4024d20168, 0x13}, 0x1, 0x4e69bf, {{0x4024e58200, ...}, ...}, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/tendermint/tendermint/consensus/replay.go:416 +0x624
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/tendermint/tendermint/consensus.(*Handshaker).Handshake(0x4006c97630, {0x346c820, 0x40018a4410})
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/tendermint/tendermint/consensus/replay.go:268 +0x3a8
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/tendermint/tendermint/node.doHandshake({_, _}, {{{0xb, 0x0}, {0x40252f1a50, 0x7}}, {0x4024d20168, 0x13}, 0x1, 0x4e69bf, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/tendermint/tendermint/node/node.go:330 +0x120
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/tendermint/tendermint/node.NewNode(0x400037c500, {0x3457620, 0x400194bf40}, 0x4006cbcaa0, {0x3439018, 0x4022e88ba0}, 0x3438ad8?, 0x40017d2340?, 0x4006cbcc60, {0x345c4b8, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/tendermint/tendermint/node/node.go:778 +0x3fc
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/evmos/ethermint/server.startInProcess(_, {{0x0, 0x0, 0x0}, {0x34756c0, 0x4000d370b0}, 0x0, {0x0, 0x0}, {0x346f570, ...}, ...}, ...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/evmos/ethermint/server/start.go:310 +0xadc
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/evmos/ethermint/server.StartCmd.func2(0x40018a6780?, {0x4001791c80?, 0x0?, 0x2?})
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/evmos/ethermint/server/start.go:118 +0x190
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/spf13/cobra.(*Command).execute(0x40018a6780, {0x4001791c60, 0x2, 0x2})
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/spf13/cobra/command.go:872 +0x4e8
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/spf13/cobra.(*Command).ExecuteC(0x4001228c80)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/spf13/cobra/command.go:990 +0x360
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/spf13/cobra.(*Command).Execute(...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/spf13/cobra/command.go:918
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/spf13/cobra.(*Command).ExecuteContext(...)
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/spf13/cobra/command.go:911
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: github.com/cosmos/cosmos-sdk/server/cmd.Execute(0x1e8cdc8?, {0x239fe64, 0x6}, {0x400147e2e8, 0x14})
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/vendor/github.com/cosmos/cosmos-sdk/server/cmd/execute.go:36 +0x1a8
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]: main.main()
Sep 14 17:50:58 cronos-testnet-archive-node-1 cronosd[18691]:         /build/13z5rjyr710sm41ikapazvyv4xfmd8gs-source/cmd/cronosd/main.go:13 +0x48

Version

0.46

Steps to Reproduce

yihuang commented 1 year ago

It was caused by running the binary built for the wrong network. Addresses are stored as bech32 strings in the db, so with the wrong binary, BaseAccount.GetAddress fails to decode the address, ignores the decode error, and returns an empty address. That empty address is the nil value the migration then tries to write.
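The failure mode described above can be sketched in a few lines of Go. This is a simplified illustration, not the cosmos-sdk code: `decodeAddress`, `getAddress`, and `set` are hypothetical helpers, the prefixes `tnet` and `mnet` are made up, and the "decoding" only checks the prefix instead of performing real bech32 decoding. It assumes the two binaries differ in the address prefix they expect, which the comment above implies but does not state. The real store panics on a nil value, whereas this sketch returns an error so it can be run end to end.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// decodeAddress is a stand-in for bech32 decoding (hypothetical helper).
// It only checks the human-readable prefix before the last "1" separator;
// real bech32 decoding also verifies a checksum and converts the payload.
func decodeAddress(expectedPrefix, addr string) ([]byte, error) {
	i := strings.LastIndex(addr, "1")
	if i < 1 || i == len(addr)-1 {
		return nil, errors.New("malformed address")
	}
	if addr[:i] != expectedPrefix {
		return nil, fmt.Errorf("prefix %q does not match expected %q", addr[:i], expectedPrefix)
	}
	return []byte(addr[i+1:]), nil
}

// getAddress mimics the pattern described in the thread: the decode error
// is dropped, so a mismatch silently yields a nil address.
func getAddress(expectedPrefix, stored string) []byte {
	bz, _ := decodeAddress(expectedPrefix, stored)
	return bz
}

// set mimics a key-value store that rejects nil values.
// The real store panics with "value is nil"; an error is returned here instead.
func set(store map[string][]byte, key string, value []byte) error {
	if value == nil {
		return errors.New("value is nil")
	}
	store[key] = value
	return nil
}

func main() {
	store := map[string][]byte{}

	// The state was written by the "tnet" chain, but the binary expects "mnet".
	stored := "tnet1exampleaccount"

	// Matching binary: the address decodes and the write succeeds.
	fmt.Println("matching binary:", set(store, "account/1", getAddress("tnet", stored)))

	// Wrong binary: decoding fails, the error is swallowed, the write is rejected.
	fmt.Println("wrong binary:", set(store, "account/1", getAddress("mnet", stored)))
}
```

Under these assumptions the remedy is operational rather than a code change: run the binary that matches the network the data directory belongs to.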

Segfaultd commented 1 year ago

Hello @yihuang, would you mind mentioning exactly what your solution was? I'm encountering the exact same issue here with v0.46.6. Thanks!