cosmos / mainnet

It's happening!

panic when trying to start gaiad #148

Closed odeke-em closed 3 years ago

odeke-em commented 3 years ago

After having followed instructions on the hub https://hub.cosmos.network/main/gaia-tutorials/join-mainnet.html, trying to start gaiad just gives an obscure panic

$ gaiad start
I[2021-03-02|01:54:08.974] starting ABCI with Tendermint                module=main 
panic: JSON encoding of interfaces require non-empty type field.

goroutine 1 [running]:
github.com/tendermint/go-amino.(*Codec).MustUnmarshalJSON(0xc000dba0e0, 0xc0020a6000, 0x1bd2c44, 0x1bd4000, 0x4ed6c80, 0xc001026180)
    github.com/tendermint/go-amino@v0.15.1/amino.go:445 +0x98
github.com/cosmos/cosmos-sdk/x/auth.AppModule.InitGenesis(0x5460c18, 0xc000141c20, 0x5439b78, 0xc00047f970, 0x5460c18, 0xc000141c20, 0x545d9c0, 0xc000141c20, 0x5439b78, 0xc00047f9e0, ...)
    github.com/cosmos/cosmos-sdk@v0.34.4-0.20200511222341-80be50319ca5/x/auth/module.go:120 +0xa4
github.com/cosmos/cosmos-sdk/types/module.(*Manager).InitGenesis(0xc000dbaaf0, 0x5449e48, 0xc0000de008, 0x545b710, 0xc00059aa80, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    github.com/cosmos/cosmos-sdk@v0.34.4-0.20200511222341-80be50319ca5/types/module/module.go:267 +0x305
github.com/cosmos/gaia/app.(*GaiaApp).InitChainer(0xc000dc3c00, 0x5449e48, 0xc0000de008, 0x545b710, 0xc00059aa80, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    github.com/cosmos/gaia/app/app.go:378 +0x15b
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).InitChain(0xc000449cc0, 0x0, 0xed5830c36, 0x0, 0xc00019e130, 0xb, 0xc00059a9c0, 0xc000e9e000, 0x7d, 0x7d, ...)
    github.com/cosmos/cosmos-sdk@v0.34.4-0.20200511222341-80be50319ca5/baseapp/abci.go:40 +0x2f8
github.com/tendermint/tendermint/abci/client.(*localClient).InitChainSync(0xc00001cd80, 0x0, 0xed5830c36, 0x0, 0xc00019e130, 0xb, 0xc00059a9c0, 0xc000e9e000, 0x7d, 0x7d, ...)
    github.com/tendermint/tendermint@v0.33.4/abci/client/local_client.go:223 +0x115
github.com/tendermint/tendermint/proxy.(*appConnConsensus).InitChainSync(0xc000467880, 0x0, 0xed5830c36, 0x0, 0xc00019e130, 0xb, 0xc00059a9c0, 0xc000e9e000, 0x7d, 0x7d, ...)
    github.com/tendermint/tendermint@v0.33.4/proxy/app_conn.go:65 +0x75
github.com/tendermint/tendermint/consensus.(*Handshaker).ReplayBlocks(0xc000eb2f50, 0xa, 0x0, 0xc000dc8250, 0x6, 0xc000dc8270, 0xb, 0x0, 0x0, 0x0, ...)
    github.com/tendermint/tendermint@v0.33.4/consensus/replay.go:319 +0x696
github.com/tendermint/tendermint/consensus.(*Handshaker).Handshake(0xc000eb2f50, 0x545db40, 0xc0002ac180, 0x80, 0x80)
    github.com/tendermint/tendermint@v0.33.4/consensus/replay.go:269 +0x4bc
github.com/tendermint/tendermint/node.doHandshake(0x545cfe0, 0xc000010490, 0xa, 0x0, 0xc000dc8250, 0x6, 0xc000dc8270, 0xb, 0x0, 0x0, ...)
    github.com/tendermint/tendermint@v0.33.4/node/node.go:283 +0x19b
github.com/tendermint/tendermint/node.NewNode(0xc000e5ab40, 0x5444a00, 0xc000459860, 0xc00047ff30, 0x5427060, 0xc00002cc00, 0xc0010140f0, 0x52b1918, 0xc001014100, 0x544a890, ...)
    github.com/tendermint/tendermint@v0.33.4/node/node.go:606 +0x3b6
github.com/cosmos/cosmos-sdk/server.startInProcess(0xc00000e810, 0x52b3070, 0x1d, 0x0)
    github.com/cosmos/cosmos-sdk@v0.34.4-0.20200511222341-80be50319ca5/server/start.go:169 +0x4cc
github.com/cosmos/cosmos-sdk/server.StartCmd.func2(0xc000e84b00, 0x5cceea8, 0x0, 0x0, 0x0, 0x0)
    github.com/cosmos/cosmos-sdk@v0.34.4-0.20200511222341-80be50319ca5/server/start.go:75 +0xb6
github.com/spf13/cobra.(*Command).execute(0xc000e84b00, 0x5cceea8, 0x0, 0x0, 0xc000e84b00, 0x5cceea8)
    github.com/spf13/cobra@v1.0.0/command.go:842 +0x472
github.com/spf13/cobra.(*Command).ExecuteC(0xc000103080, 0xc000e839a0, 0x5140e8e, 0xc00071fe28)
    github.com/spf13/cobra@v1.0.0/command.go:950 +0x37e
github.com/spf13/cobra.(*Command).Execute(...)
    github.com/spf13/cobra@v1.0.0/command.go:887
github.com/tendermint/tendermint/libs/cli.Executor.Execute(0xc000103080, 0x52b34f8, 0x5121608, 0x10)
    github.com/tendermint/tendermint@v0.33.4/libs/cli/setup.go:89 +0x3c
main.main()
    github.com/cosmos/gaia/cmd/gaiad/main.go:70 +0x7d0

I need to be able to start gaiad to profile it and figure out a few things, hence this issue is a blocker. I had raised it with @okwme and @ethanfrey as well as reported on the Cosmos Discord. Thank you.

okwme commented 3 years ago

Your gaiad start command doesn't seem to have the seeds listed in it. Was that something you added to your config instead, or is it missing?

gaiad start --p2p.seeds bf8328b66dceb4987e5cd94430af66045e59899f@public-seed.cosmos.vitwit.com:26656,cfd785a4224c7940e9a10f6c1ab24c343e923bec@164.68.107.188:26656,d72b3011ed46d783e369fdf8ae2055b99a1e5074@173.249.50.25:26656,ba3bacc714817218562f743178228f23678b2873@public-seed-node.cosmoshub.certus.one:26656,3c7cad4154967a294b3ba1cc752e40e8779640ad@84.201.128.115:26656
odeke-em commented 3 years ago

@okwme I had tried that earlier too, with the same failure. Here it is again with your command:

$ gaiad start --p2p.seeds bf8328b66dceb4987e5cd94430af66045e59899f@public-seed.cosmos.vitwit.com:26656,cfd785a4224c7940e9a10f6c1ab24c343e923bec@164.68.107.188:26656,d72b3011ed46d783e369fdf8ae2055b99a1e5074@173.249.50.25:26656,ba3bacc714817218562f743178228f23678b2873@public-seed-node.cosmoshub.certus.one:26656,3c7cad4154967a294b3ba1cc752e40e8779640ad@84.201.128.115:26656
I[2021-03-02|02:26:50.471] starting ABCI with Tendermint                module=main 
panic: JSON encoding of interfaces require non-empty type field.

goroutine 1 [running]:
github.com/tendermint/go-amino.(*Codec).MustUnmarshalJSON(0xc0002181c0, 0xc00b932000, 0x1bd2c44, 0x1bd4000, 0x4ed6c80, 0xc000f59ac0)
    github.com/tendermint/go-amino@v0.15.1/amino.go:445 +0x98
github.com/cosmos/cosmos-sdk/x/auth.AppModule.InitGenesis(0x5460c18, 0xc001040a20, 0x5439b78, 0xc000393800, 0x5460c18, 0xc001040a20, 0x545d9c0, 0xc001040a20, 0x5439b78, 0xc000393870, ...)
    github.com/cosmos/cosmos-sdk@v0.34.4-0.20200511222341-80be50319ca5/x/auth/module.go:120 +0xa4
github.com/cosmos/cosmos-sdk/types/module.(*Manager).InitGenesis(0xc000218ee0, 0x5449e48, 0xc0001a0008, 0x545b710, 0xc0003b6580, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    github.com/cosmos/cosmos-sdk@v0.34.4-0.20200511222341-80be50319ca5/types/module/module.go:267 +0x305
github.com/cosmos/gaia/app.(*GaiaApp).InitChainer(0xc0002c0e00, 0x5449e48, 0xc0001a0008, 0x545b710, 0xc0003b6580, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    github.com/cosmos/gaia/app/app.go:378 +0x15b
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).InitChain(0xc000f50a00, 0x0, 0xed5830c36, 0x0, 0xc0010240d0, 0xb, 0xc0003b64c0, 0xc0003b8000, 0x7d, 0x7d, ...)
    github.com/cosmos/cosmos-sdk@v0.34.4-0.20200511222341-80be50319ca5/baseapp/abci.go:40 +0x2f8
github.com/tendermint/tendermint/abci/client.(*localClient).InitChainSync(0xc00017c360, 0x0, 0xed5830c36, 0x0, 0xc0010240d0, 0xb, 0xc0003b64c0, 0xc0003b8000, 0x7d, 0x7d, ...)
    github.com/tendermint/tendermint@v0.33.4/abci/client/local_client.go:223 +0x115
github.com/tendermint/tendermint/proxy.(*appConnConsensus).InitChainSync(0xc00103c480, 0x0, 0xed5830c36, 0x0, 0xc0010240d0, 0xb, 0xc0003b64c0, 0xc0003b8000, 0x7d, 0x7d, ...)
    github.com/tendermint/tendermint@v0.33.4/proxy/app_conn.go:65 +0x75
github.com/tendermint/tendermint/consensus.(*Handshaker).ReplayBlocks(0xc0003ccf50, 0xa, 0x0, 0xc000390038, 0x6, 0xc000390050, 0xb, 0x0, 0x0, 0x0, ...)
    github.com/tendermint/tendermint@v0.33.4/consensus/replay.go:319 +0x696
github.com/tendermint/tendermint/consensus.(*Handshaker).Handshake(0xc0003ccf50, 0x545db40, 0xc00022a580, 0x7f, 0x80)
    github.com/tendermint/tendermint@v0.33.4/consensus/replay.go:269 +0x4bc
github.com/tendermint/tendermint/node.doHandshake(0x545cfe0, 0xc000548018, 0xa, 0x0, 0xc000390038, 0x6, 0xc000390050, 0xb, 0x0, 0x0, ...)
    github.com/tendermint/tendermint@v0.33.4/node/node.go:283 +0x19b
github.com/tendermint/tendermint/node.NewNode(0xc0001e0dc0, 0x5444a00, 0xc000f6ff40, 0xc000260040, 0x5427060, 0xc00000e2e8, 0xc000260840, 0x52b1918, 0xc000260890, 0x544a890, ...)
    github.com/tendermint/tendermint@v0.33.4/node/node.go:606 +0x3b6
github.com/cosmos/cosmos-sdk/server.startInProcess(0xc00000e690, 0x52b3070, 0x1d, 0x0)
    github.com/cosmos/cosmos-sdk@v0.34.4-0.20200511222341-80be50319ca5/server/start.go:169 +0x4cc
github.com/cosmos/cosmos-sdk/server.StartCmd.func2(0xc000f72000, 0xc000f4ad40, 0x0, 0x2, 0x0, 0x0)
    github.com/cosmos/cosmos-sdk@v0.34.4-0.20200511222341-80be50319ca5/server/start.go:75 +0xb6
github.com/spf13/cobra.(*Command).execute(0xc000f72000, 0xc000f4ad20, 0x2, 0x2, 0xc000f72000, 0xc000f4ad20)
    github.com/spf13/cobra@v1.0.0/command.go:842 +0x472
github.com/spf13/cobra.(*Command).ExecuteC(0xc00018e840, 0xc000f6fcc0, 0x5140e8e, 0xc000537e28)
    github.com/spf13/cobra@v1.0.0/command.go:950 +0x37e
github.com/spf13/cobra.(*Command).Execute(...)
    github.com/spf13/cobra@v1.0.0/command.go:887
github.com/tendermint/tendermint/libs/cli.Executor.Execute(0xc00018e840, 0x52b34f8, 0x5121608, 0x10)
    github.com/tendermint/tendermint@v0.33.4/libs/cli/setup.go:89 +0x3c
main.main()
    github.com/cosmos/gaia/cmd/gaiad/main.go:70 +0x7d0
alexanderbez commented 3 years ago

Something is definitely off here. Can you:

  1. Confirm the version of gaiad you're running
  2. Confirm the genesis file hash you're using
odeke-em commented 3 years ago

@alexanderbez, please see below:

  1. Confirm the version of gaiad you're running
$ gaiad version --long
name: gaia
server_name: gaiad
client_name: gaiacli
version: goz-phase-1-32-g6dc7709
commit: 6dc7709b41df302350a8ec2d83df6cc35e9b1b57
build_tags: netgo,ledger
go: go version devel +4c1a7ab49c Tue Mar 2 03:46:25 2021 +0000 darwin/amd64
  2. Confirm the genesis file hash you're using

I got it from the official instructions from https://github.com/cosmos/mainnet/raw/master/genesis.cosmoshub-4.json.gz

$ md5 genesis.json 
MD5 (genesis.json) = aee2922ecac93c89906df9bcf8693d38
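
A hash check like the one above can be scripted so the comparison is not done by eye. The following is a minimal sketch, not an official tool: `md5_matches` is a hypothetical helper name, and the digest shown in the usage comment is simply the one reported in this comment, which the thread does not confirm as the official value.

```shell
# Hypothetical helper: compare a file's MD5 digest with an expected hex string.
# Returns 0 when they match, non-zero otherwise.
md5_matches() {
  file="$1"
  expected="$2"
  if command -v md5sum >/dev/null 2>&1; then
    # GNU coreutils prints "<digest>  <file>"; keep only the digest.
    actual="$(md5sum "$file" | cut -d ' ' -f 1)"
  else
    # macOS / BSD: "md5 -q" prints only the digest.
    actual="$(md5 -q "$file")"
  fi
  [ "$actual" = "$expected" ]
}

# Usage (digest taken from this comment, not independently verified):
#   md5_matches genesis.json aee2922ecac93c89906df9bcf8693d38 || echo "genesis hash mismatch" >&2
```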
alexanderbez commented 3 years ago

The reported version, goz-phase-1-32-g6dc7709, does not look correct. You should be running v4.0.x, preferably v4.0.4.
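
A quick guard against starting the node with an unexpected build could look like the sketch below. `is_v4_0_release` is a hypothetical helper name, and the assumption that `gaiad version` prints a bare tag such as `v4.0.4` (with or without the leading `v`) should be checked against your own binary.

```shell
# Hypothetical helper: succeed only for version strings in the 4.0.x series,
# with or without a leading "v" (assumed output shape of `gaiad version`).
is_v4_0_release() {
  printf '%s\n' "$1" | grep -Eq '^v?4\.0\.[0-9]+$'
}

# Usage (assumes gaiad is on PATH; stderr is merged in case the version is printed there):
#   is_v4_0_release "$(gaiad version 2>&1)" || echo "unexpected gaiad version" >&2
```

A development build string such as goz-phase-1-32-g6dc7709 fails this check, which is the situation described in this thread.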

odeke-em commented 3 years ago

That code is built from the latest commit on master, with the latest Go too. For clarity, are you suggesting that I shouldn't be building off the code on the master branch, but instead on a prior tagged release? If so, perhaps we should document this because the docs say we can build from source.


alexanderbez commented 3 years ago

The very link you posted states that -> https://hub.cosmos.network/main/gaia-tutorials/join-mainnet.html

git clone -b v4.0.4 https://github.com/cosmos/gaia
make install
okwme commented 3 years ago

I get different errors about not being able to find peers. Is there another list of peers that I should be using?

3:40PM INF Dialing peer address={"id":"3c7cad4154967a294b3ba1cc752e40e8779640ad","ip":"84.201.128.115","port":26656} module=p2p
3:40PM ERR Error dialing seed err="auth failure: secret conn failed: read tcp 10.17.101.0:56714->84.201.128.115:26656: i/o timeout" module=p2p seed={"id":"3c7cad4154967a294b3ba1cc752e40e8779640ad","ip":"84.201.128.115","port":26656}
3:40PM INF Dialing peer address={"id":"bf8328b66dceb4987e5cd94430af66045e59899f","ip":"159.203.104.207","port":26656} module=p2p
3:40PM ERR Error dialing seed err="dial tcp 159.203.104.207:26656: connect: connection refused" module=p2p seed={"id":"bf8328b66dceb4987e5cd94430af66045e59899f","ip":"159.203.104.207","port":26656}
alexanderbez commented 3 years ago

Those are your typical p2p errors. The README has a curated list for hub-4 -> https://hackmd.io/@KFEZk8oMTz6vBlwADz0M4A/BkKEUOsZu

okwme commented 3 years ago

I added all of the seed nodes and "up and ready" addresses to my start command but still get p2p errors after an hour of ABCI Replay Blocks. Here is a log of the errors. Any idea why I'm not able to stay connected?
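
When many addresses are pasted into one `--p2p.seeds` value, a malformed entry is easy to miss. The sketch below only checks syntax and joins the entries with commas; it does not test reachability, so it cannot explain the dial timeouts reported here. `is_valid_peer` and `join_peers` are hypothetical helper names, and the pattern assumes the `<40 hex characters>@<host>:<port>` shape seen in the addresses in this thread.

```shell
# Hypothetical helper: check that a peer entry has the form <40 hex chars>@<host>:<port>.
is_valid_peer() {
  printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{40}@[A-Za-z0-9.-]+:[0-9]+$'
}

# Hypothetical helper: join well-formed peer entries with commas and
# report malformed ones on stderr instead of passing them to gaiad.
join_peers() {
  joined=""
  for peer in "$@"; do
    if is_valid_peer "$peer"; then
      joined="${joined:+$joined,}$peer"
    else
      echo "skipping malformed peer: $peer" >&2
    fi
  done
  printf '%s\n' "$joined"
}

# Usage (assumes gaiad is on PATH):
#   gaiad start --p2p.seeds "$(join_peers "$SEED_A" "$SEED_B")"
```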

alexanderbez commented 3 years ago

Unfortunately, it just looks like those peers cannot be reached. We just need better dedicated seeds. Perhaps interchain or AiB should run a few dedicated seed nodes.

okwme commented 3 years ago

We've got a chicken-and-egg problem here lol: I could set up a seed node for interchain, but I need a seed node first!


alexanderbez commented 3 years ago

True! Surely there must be some persistent peers and seeds we can collect.

@marbar3778 @jackzampolin @zmanian do you have nodes we can connect to?

tac0turtle commented 3 years ago

I have a node. e1fcbe3ba7944468b5aa0ab4449f7ef168eafcb5@159.89.10.97:26657.

Please don't take the node down 😄

The errors users are observing can be mitigated by setting their external address. This requires other nodes to do so as well.
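
For readers unsure what "setting their external address" involves: Tendermint's config.toml has an `external_address` key in its `[p2p]` section, which is the address a node advertises to peers. The sketch below rewrites that key in place. `set_external_address` is a hypothetical helper name, and both the config path in the usage comment and the example address are assumptions to replace with your own values.

```shell
# Hypothetical helper: set the external_address key in a Tendermint-style
# config.toml. Writes through a temp file so a failed sed leaves the
# original untouched, and keeps the original file's permissions.
set_external_address() {
  config_file="$1"
  address="$2"
  tmp_file="$(mktemp)" || return 1
  sed "s|^external_address = .*|external_address = \"${address}\"|" "$config_file" > "$tmp_file" \
    && cat "$tmp_file" > "$config_file"
  status=$?
  rm -f "$tmp_file"
  return "$status"
}

# Usage (path and address are assumptions; use your node's public IP and p2p port):
#   set_external_address "$HOME/.gaia/config/config.toml" "203.0.113.7:26656"
```

The node needs a restart for the change to take effect.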

zmanian commented 3 years ago

After having followed instructions on the hub https://hub.cosmos.network/main/gaia-tutorials/join-mainnet.html, trying to start gaiad just gives an obscure panic


$ gaiad start
I[2021-03-02|01:54:08.974] starting ABCI with Tendermint                module=main 
panic: JSON encoding of interfaces require non-empty type field.

This can be resolved by running gaiad unsafe-reset-all. An unclean database state has somehow been created.
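
Because unsafe-reset-all wipes the local blockchain database, it can help to preview the sequence before running it. The following is a sketch under stated assumptions: `run` and `reset_and_start` are hypothetical helper names, and the dry-run default is a design choice of this example, not a gaiad feature.

```shell
# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the reset-then-start sequence discussed here.
reset_and_start() {
  seeds="$1"
  # WARNING: unsafe-reset-all deletes the local blockchain database.
  run gaiad unsafe-reset-all || return 1
  run gaiad start --p2p.seeds "$seeds"
}

# Usage:
#   reset_and_start "<node-id>@<host>:<port>"            # preview only
#   DRY_RUN=0 reset_and_start "<node-id>@<host>:<port>"  # really run it
```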

odeke-em commented 3 years ago

Thank you @alexanderbez @okwme @zmanian @marbar3778, I used v4.0.4 and ran through the steps, and now it is alive, much appreciated. I am using @marbar3778's node, and it does still show errors but at least hasn't crashed

$ gaiad start --p2p.seeds "e1fcbe3ba7944468b5aa0ab4449f7ef168eafcb5@159.89.10.97:26657"
1:15AM INF starting ABCI with Tendermint
1:15AM INF Starting multiAppConn service impl={"Logger":{}} module=proxy
1:15AM INF Starting localClient service connection=query impl="marshaling error: json: unsupported type: abcicli.Callback" module=abci-client
1:15AM INF Starting localClient service connection=snapshot impl="marshaling error: json: unsupported type: abcicli.Callback" module=abci-client
1:15AM INF Starting localClient service connection=mempool impl="marshaling error: json: unsupported type: abcicli.Callback" module=abci-client
1:15AM INF Starting localClient service connection=consensus impl="marshaling error: json: unsupported type: abcicli.Callback" module=abci-client
1:15AM INF Starting EventBus service impl={"Logger":{}} module=events
1:15AM INF Starting PubSub service impl={"Logger":{}} module=pubsub
1:15AM INF Starting IndexerService service impl={"Logger":{}} module=txindex
1:15AM INF ABCI Handshake App Info hash= height=0 module=consensus protocol-version=0 software-version=
1:15AM INF ABCI Replay Blocks appHeight=0 module=consensus stateHeight=0 storeHeight=0

this unblocks me a whole lot, and now I can get to my performance Sherlock Holmes business.

alexanderbez commented 3 years ago

🎉 🎉 🎉 🎉

okwme commented 3 years ago

Thanks for the extra seed @marbar3778. I've added it to my start command but am still getting p2p errors. I don't understand what you mean by setting external addresses. I've made a new dump of the current error stack here. Any advice is appreciated; it feels like this should be easier, and I'd like to understand how I can help make it so.