penumbra-zone / penumbra

Penumbra is a fully private proof-of-stake network and decentralized exchange for the Cosmos ecosystem.
https://penumbra.zone
Apache License 2.0

Failed to start node after PD/Tendermint restart #1577

Closed Archebald-now closed 1 year ago

Archebald-now commented 1 year ago

My node was active and fully synced. I always start the processes in screen sessions. For Tendermint the command is:

tendermint start --home ~/.penumbra/testnet_data/node0/tendermint

For pd it is:

cargo run --bin pd --release -- start --home ~/.penumbra/testnet_data/node0/pd --grpc-port 8081 --metrics-port 9009

I went to the pd screen and pressed Ctrl+C. Then I re-entered the commands for pd and Tendermint (in separate screens), and ended up with the following logs:

1. For pd:

cargo run --bin pd --release -- start --home ~/.penumbra/testnet_data/node0/pd --grpc-port 8081 --metrics-port 9009
    Finished release [optimized] target(s) in 4.26s
     Running `target/release/pd start --home /root/.penumbra/testnet_data/node0/pd --grpc-port 8081 --metrics-port 9009`
2022-11-02T22:34:04.829347Z  INFO starting pd host="127.0.0.1" abci_port=26658 grpc_port=8081
2022-11-02T22:34:04.830958Z  INFO opening rocksdb path="/root/.penumbra/testnet_data/node0/pd/rocksdb"
2022-11-02T22:34:04.946846Z  INFO consensus::Worker::new: initializing App instance
2022-11-02T22:34:04.962682Z  INFO mempool::Worker::new: initializing App instance
2022-11-02T22:34:04.963415Z  INFO starting ABCI server addr="127.0.0.1:26658"
2022-11-02T22:34:04.963722Z  INFO bound tcp listener local_addr=127.0.0.1:26658
2022-11-02T22:34:06.531475Z  INFO listening for requests
2022-11-02T22:34:06.533794Z  INFO listening for requests
2022-11-02T22:34:06.534021Z  INFO listening for requests
2022-11-02T22:34:06.534497Z  INFO listening for requests
2022-11-02T22:34:06.549842Z  INFO abci:Info: info=Info { version: "v0.34.21", block_version: 11, p2p_version: 8 }

2. For Tendermint:

tendermint start --home ~/.penumbra/testnet_data/node0/tendermint
I[2022-11-02|23:33:54.526] service start                                module=proxy msg="Starting multiAppConn service" impl=multiAppConn
I[2022-11-02|23:33:54.526] service start                                module=abci-client connection=query msg="Starting socketClient service" impl=socketClient
E[2022-11-02|23:33:54.527] abci.socketClient failed to connect to tcp://127.0.0.1:26658.  Retrying after 3s... module=abci-client connection=query err="dial tcp 127.0.0.1:26658: connect: connection refused"
E[2022-11-02|23:33:57.528] abci.socketClient failed to connect to tcp://127.0.0.1:26658.  Retrying after 3s... module=abci-client connection=query err="dial tcp 127.0.0.1:26658: connect: connection refused"
E[2022-11-02|23:34:00.529] abci.socketClient failed to connect to tcp://127.0.0.1:26658.  Retrying after 3s... module=abci-client connection=query err="dial tcp 127.0.0.1:26658: connect: connection refused"
E[2022-11-02|23:34:03.529] abci.socketClient failed to connect to tcp://127.0.0.1:26658.  Retrying after 3s... module=abci-client connection=query err="dial tcp 127.0.0.1:26658: connect: connection refused"
I[2022-11-02|23:34:06.533] service start                                module=abci-client connection=snapshot msg="Starting socketClient service" impl=socketClient
I[2022-11-02|23:34:06.533] service start                                module=abci-client connection=mempool msg="Starting socketClient service" impl=socketClient
I[2022-11-02|23:34:06.534] service start                                module=abci-client connection=consensus msg="Starting socketClient service" impl=socketClient
I[2022-11-02|23:34:06.534] service start                                module=events msg="Starting EventBus service" impl=EventBus
I[2022-11-02|23:34:06.534] service start                                module=pubsub msg="Starting PubSub service" impl=PubSub
I[2022-11-02|23:34:06.545] service start                                module=txindex msg="Starting IndexerService service" impl=IndexerService
I[2022-11-02|23:34:06.554] ABCI Handshake App Info                      module=consensus height=36310 hash=450F03BDCE55C7374F9B935AD196E4E994005A92626C7F2C69858069BAD7261E software-version=034-aoede protocol-version=1
I[2022-11-02|23:34:06.554] ABCI Replay Blocks                           module=consensus appHeight=36310 storeHeight=36310 stateHeight=36310
I[2022-11-02|23:34:06.554] Completed ABCI Handshake - Tendermint and App are synced module=consensus appHeight=36310 appHash=450F03BDCE55C7374F9B935AD196E4E994005A92626C7F2C69858069BAD7261E
I[2022-11-02|23:34:06.554] Version info                                 module=main tendermint_version=v0.34.21 block=11 p2p=8
I[2022-11-02|23:34:06.554] This node is a validator                     module=consensus addr=7034D647390E11CE3A6E6B1B0A04DD567B8F34C9 pubKey=PubKeyEd25519{F4BF33A5A43977DC56120B48AB49DCB64E930E114B8560E08DA5894B4EE9DCD0}
I[2022-11-02|23:34:06.567] P2P Node ID                                  module=p2p ID=5ce7fad6364f2c8912e6abccefa8eb5bae03b970 file=/root/.penumbra/testnet_data/node0/tendermint/config/node_key.json
I[2022-11-02|23:34:06.567] Adding persistent peers                      module=p2p addrs=[]
I[2022-11-02|23:34:06.567] Adding unconditional peer ids                module=p2p ids=[]
I[2022-11-02|23:34:06.567] Add our address to book                      module=p2p book=/root/.penumbra/testnet_data/node0/tendermint/config/addrbook.json addr=5ce7fad6364f2c8912e6abccefa8eb5bae03b970@0.0.0.0:26656
I[2022-11-02|23:34:06.568] service start                                module=main msg="Starting Node service" impl=Node
I[2022-11-02|23:34:06.568] Starting pprof server                        module=main laddr=:6060
I[2022-11-02|23:34:06.576] service start                                module=p2p msg="Starting P2P Switch service" impl="P2P Switch"
I[2022-11-02|23:34:06.576] service start                                module=blockchain msg="Starting BlockchainReactor service" impl=BlockchainReactor
I[2022-11-02|23:34:06.576] service start                                module=blockchain msg="Starting BlockPool service" impl=BlockPool
I[2022-11-02|23:34:06.576] serve                                        module=rpc-server msg="Starting RPC HTTP server on [::]:26657"
I[2022-11-02|23:34:06.578] service start                                module=consensus msg="Starting Consensus service" impl=ConsensusReactor
I[2022-11-02|23:34:06.578] Reactor                                      module=consensus waitSync=true
I[2022-11-02|23:34:06.578] service start                                module=evidence msg="Starting Evidence service" impl=Evidence
I[2022-11-02|23:34:06.578] service start                                module=statesync msg="Starting StateSync service" impl=StateSync
I[2022-11-02|23:34:06.578] service start                                module=pex msg="Starting PEX service" impl=PEX
I[2022-11-02|23:34:06.578] service start                                module=p2p book=/root/.penumbra/testnet_data/node0/tendermint/config/addrbook.json msg="Starting AddrBook service" impl=AddrBook
I[2022-11-02|23:34:06.627] Saving AddrBook to file                      module=p2p book=/root/.penumbra/testnet_data/node0/tendermint/config/addrbook.json size=1
I[2022-11-02|23:34:06.627] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[2022-11-02|23:34:06.629] Started node                                 module=main nodeInfo="{ProtocolVersion:{P2P:8 Block:11 App:0} DefaultNodeID:5ce7fad6364f2c8912e6abccefa8eb5bae03b970 ListenAddr:tcp://0.0.0.0:26656 Network:penumbra-testnet-aoede Version:v0.34.21 Channels:40202122233038606100 Moniker:archebald Other:{TxIndex:on RPCAddress:tcp://0.0.0.0:26657}}"
I[2022-11-02|23:34:06.629] Dialing peer                                 module=p2p address=32f9aa283d93df60286771e9ff8c2823c08e5db5@94.130.26.9:26357
I[2022-11-02|23:34:36.630] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[2022-11-02|23:34:36.630] Dialing peer                                 module=p2p address=32f9aa283d93df60286771e9ff8c2823c08e5db5@94.130.26.9:26357
I[2022-11-02|23:35:06.630] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[2022-11-02|23:35:06.630] Dialing peer                                 module=p2p address=32f9aa283d93df60286771e9ff8c2823c08e5db5@94.130.26.9:26357
I[2022-11-02|23:35:36.630] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[2022-11-02|23:35:36.630] Dialing peer                                 module=p2p address=32f9aa283d93df60286771e9ff8c2823c08e5db5@94.130.26.9:26357
I[2022-11-02|23:36:06.580] Saving AddrBook to file                      module=p2p book=/root/.penumbra/testnet_data/node0/tendermint/config/addrbook.json size=1
I[2022-11-02|23:36:06.630] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[2022-11-02|23:36:06.630] Dialing peer                                 module=p2p address=32f9aa283d93df60286771e9ff8c2823c08e5db5@94.130.26.9:26357
I[2022-11-02|23:36:36.630] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[2022-11-02|23:37:06.630] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[2022-11-02|23:37:06.631] Dialing peer                                 module=p2p address=32f9aa283d93df60286771e9ff8c2823c08e5db5@94.130.26.9:26357
I[2022-11-02|23:37:36.630] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[2022-11-02|23:37:55.028] Inbound Peer rejected                        module=p2p err="auth failure: handshake failed: EOF" numPeers=0
I[2022-11-02|23:38:06.581] Saving AddrBook to file                      module=p2p book=/root/.penumbra/testnet_data/node0/tendermint/config/addrbook.json size=1
I[2022-11-02|23:38:06.629] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[2022-11-02|23:38:22.779] Inbound Peer rejected                        module=p2p err="auth failure: handshake failed: EOF" numPeers=0
I[2022-11-02|23:38:36.630] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[2022-11-02|23:38:36.630] Dialing peer                                 module=p2p address=32f9aa283d93df60286771e9ff8c2823c08e5db5@94.130.26.9:26357
I[2022-11-02|23:39:06.630] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[2022-11-02|23:39:36.630] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[2022-11-02|23:40:06.580] Saving AddrBook to file                      module=p2p book=/root/.penumbra/testnet_data/node0/tendermint/config/addrbook.json size=1
I[2022-11-02|23:40:06.630] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10


conorsch commented 1 year ago

Howdy, @Archebald-now. I realize we've been slow to respond here, but there's been relevant discussion in the Discord of late, and I recall you've been active there, as well. The most salient log lines in what you've shared appear to me to be:

I[2022-11-02|23:35:36.630] Dialing peer                                 module=p2p address=32f9aa283d93df60286771e9ff8c2823c08e5db5@94.130.26.9:26357
I[2022-11-02|23:36:06.580] Saving AddrBook to file                      module=p2p book=/root/.penumbra/testnet_data/node0/tendermint/config/addrbook.json size=1

Looks like pd is running just fine, but tendermint is having trouble connecting to peers. Please file updates if you experience this with newer testnets, and make sure to name the testnet (e.g. 036-iocaste.2). Similar reports from others are very welcome, too.
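To check a longer log for the same symptom, the "Ensure peers" lines can be scanned for their peer counts. The following is a minimal sketch, not part of pd or Tendermint; the function names `peer_counts` and `never_connected` are made up for illustration, and the regular expression assumes the v0.34.21 log layout shown in this issue.

```python
import re

# Matches the pex "Ensure peers" lines in the Tendermint log layout pasted above.
# Assumption: other Tendermint versions may format these lines differently.
ENSURE_PEERS = re.compile(
    r"^I\[(?P<ts>[^\]]+)\]\s+Ensure peers\s+.*?"
    r"numOutPeers=(?P<out>\d+)\s+numInPeers=(?P<inb>\d+)"
)


def peer_counts(log_lines):
    """Yield (timestamp, outbound, inbound) for each 'Ensure peers' line."""
    for line in log_lines:
        match = ENSURE_PEERS.match(line)
        if match:
            yield match.group("ts"), int(match.group("out")), int(match.group("inb"))


def never_connected(log_lines):
    """True if the log has 'Ensure peers' lines and every one reports zero peers."""
    counts = list(peer_counts(log_lines))
    return bool(counts) and all(out == 0 and inb == 0 for _, out, inb in counts)


# Three lines copied from the log in this issue.
sample = [
    'I[2022-11-02|23:34:06.627] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10',
    'I[2022-11-02|23:34:06.629] Dialing peer                                 module=p2p address=32f9aa283d93df60286771e9ff8c2823c08e5db5@94.130.26.9:26357',
    'I[2022-11-02|23:34:36.630] Ensure peers                                 module=pex numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10',
]

if __name__ == "__main__":
    for ts, out, inb in peer_counts(sample):
        print(ts, "outbound:", out, "inbound:", inb)
    print("never connected:", never_connected(sample))
```

If `never_connected` stays true over several minutes of log output, the node is dialing but never completing a connection, which matches what the excerpt above shows.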

conorsch commented 1 year ago

A recent change should help greatly here: testnet join will now pull in many peers from the net_info endpoint, rather than just setting a single peer (https://github.com/penumbra-zone/penumbra/pull/1672). That change will be part of the next testnet.
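For anyone on an older testnet who wants to do something similar by hand, here is a rough sketch of turning a Tendermint `net_info` RPC response into a comma-separated `id@host:port` string of the kind used for `persistent_peers` in Tendermint's config.toml. This is an illustration only and not the code from the PR: the function name `peers_from_net_info` is made up, the second peer in the sample is invented (it uses a documentation-range IP), and it assumes each peer's advertised listen port is reachable from outside, which is not guaranteed.

```python
def peers_from_net_info(net_info):
    """Build a comma-separated id@host:port string from a net_info response.

    Assumes the Tendermint v0.34 response shape: result.peers[] entries with
    node_info.id, node_info.listen_addr and remote_ip.
    """
    entries = []
    for peer in net_info["result"]["peers"]:
        info = peer["node_info"]
        # listen_addr may look like "tcp://0.0.0.0:26656"; keep only the port
        # and pair it with the IP the reporting node actually saw.
        port = info["listen_addr"].rsplit(":", 1)[-1]
        entries.append(f'{info["id"]}@{peer["remote_ip"]}:{port}')
    return ",".join(entries)


# Trimmed, partly hypothetical response. The first peer reuses the address
# from the log in this issue; the second peer is entirely made up.
sample_net_info = {
    "result": {
        "n_peers": "2",
        "peers": [
            {
                "node_info": {
                    "id": "32f9aa283d93df60286771e9ff8c2823c08e5db5",
                    "listen_addr": "tcp://0.0.0.0:26357",
                },
                "remote_ip": "94.130.26.9",
            },
            {
                "node_info": {
                    "id": "0123456789abcdef0123456789abcdef01234567",
                    "listen_addr": "203.0.113.7:26656",
                },
                "remote_ip": "203.0.113.7",
            },
        ],
    }
}

if __name__ == "__main__":
    print(peers_from_net_info(sample_net_info))
```

The resulting string would go into the Tendermint config under the node's home directory, followed by a Tendermint restart. Peers that are currently offline will still fail to connect, so a longer list mainly improves the odds that at least one dial succeeds.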

Archebald-now commented 1 year ago

Thanks a lot for the answer. Yes, that's right: I have now solved the problem by adding more peers that are currently active, and everything started working.


conorsch commented 1 year ago

We've got more changes to the peering logic for testnets tracked in #1847, so closing this issue.