Cosmos Testnets

[testnet] ERR Stopping peer for error #325

Closed 99Kies closed 6 months ago

99Kies commented 1 year ago

I'm trying to run a validator node on the test network, but I'm having problems syncing the data.

I encountered the following error when I started the service with version v7.0.2.

9:11AM ERR Stopping peer for error err=EOF module=p2p peer={"Data":{},"Logger":{}}
9:11AM INF Stopping Peer service impl={"Data":{},"Logger":{}} module=p2p peer={"id":"3e506472683ceb7ed75c1578d092c79785c27857","ip":"165.232.159.99","port":26656}
9:11AM INF Reconnecting to peer addr={"id":"3e506472683ceb7ed75c1578d092c79785c27857","ip":"165.232.159.99","port":26656} module=p2p
9:11AM INF Dialing peer address={"id":"3e506472683ceb7ed75c1578d092c79785c27857","ip":"165.232.159.99","port":26656} module=p2p
9:11AM INF Stopping queryMaj23Routine for peer module=consensus peer={"Data":{},"Logger":{}}
9:11AM INF Stopping gossipVotesRoutine for peer module=consensus peer={"Data":{},"Logger":{}}
9:11AM INF Stopping gossipDataRoutine for peer module=consensus peer={"Data":{},"Logger":{}}
9:11AM INF Connection is closed @ recvRoutine (likely by the other side) conn={"Logger":{}} module=p2p peer={"id":"639d50339d7045436c756a042906b9a69970913f","ip":"165.232.138.225","port":26656}
9:11AM INF Stopping MConnection service impl={"Logger":{}} module=p2p peer={"id":"639d50339d7045436c756a042906b9a69970913f","ip":"165.232.138.225","port":26656}
9:11AM ERR Stopping peer for error err=EOF module=p2p peer={"Data":{},"Logger":{}}

This is my config:

[statesync]
# State sync rapidly bootstraps a new node by discovering, fetching, and restoring a state machine
# snapshot from peers instead of fetching and replaying historical blocks. Requires some peers in
# the network to take and serve state machine snapshots. State sync is not attempted if the node
# has any local state (LastBlockHeight > 0). The node will have a truncated block history,
# starting from the height of the snapshot.
enable = true

# RPC servers (comma-separated) for light client verification of the synced state machine and
# retrieval of state data for node bootstrapping. Also needs a trusted height and corresponding
# header hash obtained from a trusted source, and a period during which validators can be trusted.
#
# For Cosmos SDK-based chains, trust_period should usually be about 2/3 of the unbonding time (~2
# weeks) during which they can be financially punished (slashed) for misbehavior.
rpc_servers = "rpc.sentry-01.theta-testnet.polypore.xyz:26657,rpc.sentry-02.theta-testnet.polypore.xyz:26657"
trust_height = 13515478
trust_hash = "E694714C5AB4A4899D6C787FA6061E65EBACA80A70894AF6A1F2D8D46258567E"
trust_period = "168h0m0s"

And this is my start command:

gaiad start --x-crisis-skip-assert-invariants

By the way, can you check my understanding? As I understand it, I need to run the node with v6.0.0 first, wait for the sync to reach block 9283650, then stop the node, switch to v7.0.0-rc0, and restart it.

99Kies commented 1 year ago

I started the node in state sync mode.

This is my config:

persistent_peers = "639d50339d7045436c756a042906b9a69970913f@seed-01.theta-testnet.polypore.xyz:26656,3e506472683ceb7ed75c1578d092c79785c27857@seed-02.theta-testnet.polypore.xyz:26656"

But the sync still seems to fail because connections to the seed nodes time out, even though the seeds I'm using are the ones in the documentation. I can't reach any other peers right now. Have the seed nodes been updated?

This is my startup log:

12:16AM INF Added peer module=p2p peer={"Data":{},"Logger":{}}
12:16AM INF Discovered new snapshot format=1 hash="�\v1\x10�\x1a�^O�\t�\x1b�h�F�D�h`=|v�q\x1ḙ3#" height=13523000 module=statesync
12:16AM INF Discovered new snapshot format=1 hash="��On��&8B�$\r(�\x06��kZ����\\t�B7�R�:" height=13522000 module=statesync
12:16AM INF Discovered new snapshot format=1 hash="\vd\t��c=��F�#��7����a\x042����0\x04\x13�td" height=13521000 module=statesync
12:16AM INF Discovered new snapshot format=1 hash="��Zlw\b�^����1\t\t~�NvHv �5��o\x1f\x1e�Œ" height=13520000 module=statesync
12:16AM INF Discovered new snapshot format=1 hash="\x01\x10��R�\x12�.�K��'|\\�|��Ԉ�r�lYb\x01��?" height=13519000 module=statesync
12:16AM INF Discovered new snapshot format=1 hash="�8PT\x16\u007f�����\x03�i\x1f��\x03�\x15v$���oTخ� `" height=13518000 module=statesync
12:16AM INF Discovered new snapshot format=1 hash="��A�����̄��H�\x10;\x03ʄ��>ت}���\x18�J�" height=13517000 module=statesync
12:16AM INF Discovered new snapshot format=1 hash="[\x02r� e-E�\bi0e˻�mE\x04\x02��D��\f(�\x12�F\x1c" height=13516000 module=statesync
12:16AM INF Discovered new snapshot format=1 hash="%Q\tf����p��\x1cGKAt\vA�@d�c�Z�\n��پ�" height=13515000 module=statesync
12:16AM INF Discovered new snapshot format=1 hash="Dz.:J�\nm>CёF\rp��W\u070e9�?n����\x01�/�" height=13514000 module=statesync
12:16AM ERR dialing failed (attempts: 1): dial tcp 18.215.117.208:26656: i/o timeout addr={"id":"3af09a146c5ad335e364fa9c219fda1565e26f40","ip":"18.215.117.208","port":26656} module=pex
12:16AM ERR dialing failed (attempts: 1): dial tcp 34.205.28.215:26656: i/o timeout addr={"id":"692e8ca585c72a0a6ab705b3e0ff2fec99e7588d","ip":"34.205.28.215","port":26656} module=pex
12:16AM ERR dialing failed (attempts: 1): dial tcp 159.89.125.186:26656: i/o timeout addr={"id":"38537666a5d6bef57713281be25cc5383cc77ac3","ip":"159.89.125.186","port":26656} module=pex
12:16AM ERR dialing failed (attempts: 1): dial tcp 35.194.180.0:40830: i/o timeout addr={"id":"a37c2dffff6a7d8ce6d8e1c548cded72503ca9cd","ip":"35.194.180.0","port":40830} module=pex
12:16AM ERR dialing failed (attempts: 1): dial tcp 18.219.225.144:26656: i/o timeout addr={"id":"3b5068a968edf917a42f2e600a9da0ea9abb64cb","ip":"18.219.225.144","port":26656} module=pex
12:16AM ERR failed to restore snapshot err="cannot import into non-IAVL store \"icahost\": internal logic error"
12:16AM INF Applied snapshot chunk to ABCI app chunk=14 format=1 height=13523000 module=statesync total=15
12:16AM ERR State sync failed err="state sync aborted" module=statesync
12:16AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=4 numToDial=6
12:16AM INF Will dial address addr={"id":"f7cea52c4fdb175071a138407e91125ab5c1893a","ip":"3.121.126.32","port":34770} module=pex
12:16AM INF Will dial address addr={"id":"e33e3ec79440795fab748888a5ce857c1129c32d","ip":"95.121.233.44","port":26656} module=pex
12:16AM INF Will dial address addr={"id":"626e25e6bca1590e46b5350fb2ea29a4de7939fa","ip":"194.163.180.180","port":26656} module=pex
12:16AM INF Will dial address addr={"id":"0ee88d4bf1fc35ccc389d9368edcdf29e4a93421","ip":"44.197.238.148","port":26656} module=pex
12:16AM INF Will dial address addr={"id":"931fa504874c98b41437ad735506929e77a5e38b","ip":"46.166.146.165","port":26656} module=pex
12:16AM INF Will dial address addr={"id":"ca470e0b62b8466cfdf6273ddde4c1bc5dbc463d","ip":"98.15.8.61","port":41484} module=pex
12:16AM INF Dialing peer address={"id":"ca470e0b62b8466cfdf6273ddde4c1bc5dbc463d","ip":"98.15.8.61","port":41484} module=p2p
12:16AM INF Dialing peer address={"id":"626e25e6bca1590e46b5350fb2ea29a4de7939fa","ip":"194.163.180.180","port":26656} module=p2p
12:16AM INF Dialing peer address={"id":"0ee88d4bf1fc35ccc389d9368edcdf29e4a93421","ip":"44.197.238.148","port":26656} module=p2p
12:16AM INF Dialing peer address={"id":"931fa504874c98b41437ad735506929e77a5e38b","ip":"46.166.146.165","port":26656} module=p2p
12:16AM INF Dialing peer address={"id":"f7cea52c4fdb175071a138407e91125ab5c1893a","ip":"3.121.126.32","port":34770} module=p2p
12:16AM INF Dialing peer address={"id":"e33e3ec79440795fab748888a5ce857c1129c32d","ip":"95.121.233.44","port":26656} module=p2p
12:16AM ERR dialing failed (attempts: 1): dial tcp 95.121.233.44:26656: i/o timeout addr={"id":"e33e3ec79440795fab748888a5ce857c1129c32d","ip":"95.121.233.44","port":26656} module=pex
12:16AM ERR dialing failed (attempts: 1): dial tcp 3.121.126.32:34770: i/o timeout addr={"id":"f7cea52c4fdb175071a138407e91125ab5c1893a","ip":"3.121.126.32","port":34770} module=pex
12:16AM ERR dialing failed (attempts: 1): dial tcp 44.197.238.148:26656: i/o timeout addr={"id":"0ee88d4bf1fc35ccc389d9368edcdf29e4a93421","ip":"44.197.238.148","port":26656} module=pex
12:16AM ERR dialing failed (attempts: 1): dial tcp 194.163.180.180:26656: i/o timeout addr={"id":"626e25e6bca1590e46b5350fb2ea29a4de7939fa","ip":"194.163.180.180","port":26656} module=pex
12:16AM ERR dialing failed (attempts: 1): dial tcp 98.15.8.61:41484: i/o timeout addr={"id":"ca470e0b62b8466cfdf6273ddde4c1bc5dbc463d","ip":"98.15.8.61","port":41484} module=pex
12:16AM ERR dialing failed (attempts: 1): dial tcp 46.166.146.165:26656: i/o timeout addr={"id":"931fa504874c98b41437ad735506929e77a5e38b","ip":"46.166.146.165","port":26656} module=pex
12:17AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=4 numToDial=6
12:17AM INF Will dial address addr={"id":"f5fe383c6338c14f94319a96813ea77df1ab9060","ip":"35.239.129.49","port":26656} module=pex
12:17AM INF Will dial address addr={"id":"aaf0963725335f3d639258553ede9689f8b48b45","ip":"164.92.160.85","port":26656} module=pex
12:17AM INF Will dial address addr={"id":"cd71d7c8aefef00b7688f5b398a4bd8b4795920d","ip":"54.185.32.140","port":26656} module=pex
12:17AM INF Will dial address addr={"id":"2781cb100b12f53d590b2ec52cbfdba41db43baa","ip":"100.96.12.15","port":26656} module=pex
12:17AM INF Will dial address addr={"id":"ed1e0cf09786e1daec18b5db42eaaee81c4043a1","ip":"145.239.205.232","port":26656} module=pex
12:17AM INF Will dial address addr={"id":"9c3e9ecedf6817c902b58e7f976aca3797df03fb","ip":"51.79.20.221","port":26656} module=pex
12:17AM INF Dialing peer address={"id":"f5fe383c6338c14f94319a96813ea77df1ab9060","ip":"35.239.129.49","port":26656} module=p2p
12:17AM INF Dialing peer address={"id":"2781cb100b12f53d590b2ec52cbfdba41db43baa","ip":"100.96.12.15","port":26656} module=p2p
12:17AM INF Dialing peer address={"id":"aaf0963725335f3d639258553ede9689f8b48b45","ip":"164.92.160.85","port":26656} module=p2p
12:17AM INF Dialing peer address={"id":"ed1e0cf09786e1daec18b5db42eaaee81c4043a1","ip":"145.239.205.232","port":26656} module=p2p
12:17AM INF Dialing peer address={"id":"9c3e9ecedf6817c902b58e7f976aca3797df03fb","ip":"51.79.20.221","port":26656} module=p2p
12:17AM INF Dialing peer address={"id":"cd71d7c8aefef00b7688f5b398a4bd8b4795920d","ip":"54.185.32.140","port":26656} module=p2p
12:17AM INF Starting Peer service impl="Peer{MConn{35.239.129.49:26656} f5fe383c6338c14f94319a96813ea77df1ab9060 out}" module=p2p peer={"id":"f5fe383c6338c14f94319a96813ea77df1ab9060","ip":"35.239.129.49","port":26656}
12:17AM INF Starting MConnection service impl=MConn{35.239.129.49:26656} module=p2p peer={"id":"f5fe383c6338c14f94319a96813ea77df1ab9060","ip":"35.239.129.49","port":26656}
12:17AM INF Added peer module=p2p peer={"Data":{},"Logger":{}}
12:17AM ERR dialing failed (attempts: 1): incompatible: peer is on a different network. Got cosmoshub-4, expected theta-testnet-001 addr={"id":"9c3e9ecedf6817c902b58e7f976aca3797df03fb","ip":"51.79.20.221","port":26656} module=pex
12:17AM ERR dialing failed (attempts: 1): dial tcp 100.96.12.15:26656: i/o timeout addr={"id":"2781cb100b12f53d590b2ec52cbfdba41db43baa","ip":"100.96.12.15","port":26656} module=pex
12:17AM ERR dialing failed (attempts: 1): dial tcp 54.185.32.140:26656: i/o timeout addr={"id":"cd71d7c8aefef00b7688f5b398a4bd8b4795920d","ip":"54.185.32.140","port":26656} module=pex
12:17AM ERR dialing failed (attempts: 1): dial tcp 145.239.205.232:26656: i/o timeout addr={"id":"ed1e0cf09786e1daec18b5db42eaaee81c4043a1","ip":"145.239.205.232","port":26656} module=pex
12:17AM ERR dialing failed (attempts: 1): auth failure: secret conn failed: read tcp 10.0.17.80:40876->164.92.160.85:26656: i/o timeout addr={"id":"aaf0963725335f3d639258553ede9689f8b48b45","ip":"164.92.160.85","port":26656} module=pex

dasanchez commented 1 year ago

Hi! I just tested this script and it took about three minutes to sync up.

I only changed the following lines:

export GAIA_BRANCH=v7.1.0
export TRUST_HEIGHT=13563400
export TRUST_HASH="5FC5E6B7C3192ABF867E21EEBD7A15D736204804C9425CDAE30C9A1E2AC9AA21"

> By the way, can you check my understanding? As I understand it, I need to run the node with v6.0.0 first, wait for the sync to reach block 9283650, then stop the node, switch to v7.0.0-rc0, and restart it.

If you want to use state sync, you can start with v7.1.0 right away and it will get a recent snapshot to quickly sync up to the current height. If you want to sync to every block since genesis, then yes, you start the node with v6.0.4 and go through the upgrade at height 9283650. You will need to turn off state sync if you want to set up a node this way.
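
The state sync route described above comes down to a handful of values in the [statesync] section of config.toml. As a minimal sketch only: the function name set_statesync is hypothetical (it is not part of gaiad or the testnets repo), and it assumes the [statesync] section layout quoted earlier in this thread. The trust height and hash still have to come from a source you trust.

```shell
# Hypothetical helper, not part of gaiad or the testnets repo.
# Usage: set_statesync <config.toml> <rpc_servers> <trust_height> <trust_hash>
set_statesync() {
  local cfg="$1" rpc="$2" height="$3" hash="$4"
  # Only touch keys between the [statesync] header and the next section header.
  # -i.bak leaves a backup copy and is accepted by both GNU and BSD sed.
  sed -i.bak \
    -e "/^\[statesync\]/,/^\[/ s|^enable *=.*|enable = true|" \
    -e "/^\[statesync\]/,/^\[/ s|^rpc_servers *=.*|rpc_servers = \"$rpc\"|" \
    -e "/^\[statesync\]/,/^\[/ s|^trust_height *=.*|trust_height = $height|" \
    -e "/^\[statesync\]/,/^\[/ s|^trust_hash *=.*|trust_hash = \"$hash\"|" \
    "$cfg"
}
```

With the values from this comment, a call might look like set_statesync "$HOME/.gaia/config/config.toml" "<rpc servers>" 13563400 "5FC5E6B7C3192ABF867E21EEBD7A15D736204804C9425CDAE30C9A1E2AC9AA21". For a sync from genesis, enable would be set to false instead.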

99Kies commented 1 year ago

@dasanchez I found that this script can only be run as root. It now runs successfully, but I am still curious why my previous commands failed. I had in fact tried many versions of gaiad before, including v7.1.0.

99Kies commented 1 year ago

I followed the tutorial here (https://hub.cosmos.network/main/hub-tutorials/join-testnet.html) and just set its parameters (statesync config, persistent_peers, etc.) to the ones in this script.

dasanchez commented 1 year ago

@99Kies the tutorial you referenced has been updated with more accurate information. Were you able to join the network?

steve-ng commented 1 year ago

I've tried running the script https://github.com/cosmos/testnets/blob/master/public/join-public-testnet.sh but I keep getting this error. Any insight?

ERR failed to remove witnesses err="no witnesses connected. please reset light client" module=light witnessesToRemove=[0]

ERR Can't verify err="failed to obtain the header at height #14579006: post failed: Post \"https://rpc.state-sync-02.theta-testnet.polypore.xyz:443\": context deadline exceeded" module=light

INF failed to fetch and verify app hash err="failed to obtain the header at height #14579006: post failed: Post \"https://rpc.state-sync-02.theta-testnet.polypore.xyz:443\": context deadline exceeded" module=statesync

I've tried a curl command and it seems fine:

curl -v "https://rpc.state-sync-01.theta-testnet.polypore.xyz/block?height=14579005"

dasanchez commented 1 year ago

@steve-ng we upgraded the testnet binaries more than once in the last couple of weeks, so it's possible that your machine pulled a state sync header from a node running an old version when you ran the script. Are you able to run the script again now that we have settled on v9.0.0-rc7?

openVietAnh commented 1 year ago

I just ran the join-public-testnet.sh script today with CHAIN_BINARY_URL='https://github.com/cosmos/gaia/releases/download/v9.0.0-rc7/gaiad-v9.0.0-rc7-linux-amd64', and I'm still getting this error:

Feb 28 16:45:54 teko-Vostro-5471 gaiad[66445] 4:45PM INF error from light block request from primary, removing... error="post failed: Post \"https://rpc.state-sync-01.theta-testnet.polypore.xyz:443\": context deadline exceeded" height=14788309 module=light primary={}
Feb 28 16:45:54 teko-Vostro-5471 gaiad[66445] 4:45PM ERR error on light block request from witness, removing... error="post failed: Post \"https://rpc.state-sync-02.theta-testnet.polypore.xyz:443\": context deadline exceeded" module=light primary={}
Feb 28 16:45:54 teko-Vostro-5471 gaiad[66445] 4:45PM ERR failed to remove witnesses err="no witnesses connected. please reset light client" module=light witnessesToRemove=[0]
Feb 28 16:45:54 teko-Vostro-5471 gaiad[66445] 4:45PM ERR Can't verify err="failed to obtain the header at height #14788309: post failed: Post \"https://rpc.state-sync-02.theta-testnet.polypore.xyz:443\": context deadline exceeded" module=light
Feb 28 16:45:54 teko-Vostro-5471 gaiad[66445] 4:45PM INF failed to fetch and verify app hash err="failed to obtain the header at height #14788309: post failed: Post \"https://rpc.state-sync-02.theta-testnet.polypore.xyz:443\": context deadline exceeded" module=statesync

dasanchez commented 1 year ago

Hi @vietanhtran2710! The following messages will sometimes show up before a good snapshot is found and the node starts syncing:

ERR error on light block request from witness, removing... error="post failed: Post \"https://rpc.state-sync-02.theta-testnet.polypore.xyz:443\": context deadline exceeded" module=light primary={}
ERR failed to remove witnesses err="no witnesses connected. please reset light client" module=light witnessesToRemove=[0]

You may need to reset the state if you had previously run the script that had v9.0.0-rc6 in it. Can you stop the service with systemctl stop gaiad, delete your ~/.gaia home folder, and run the script again?
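
The reset described above can be sketched in shell. As an assumption that goes beyond the suggestion here, this variant moves the home folder aside under a timestamped name instead of deleting it, so keys and config can still be recovered. The function name backup_home is hypothetical.

```shell
# Hypothetical helper: move a node home folder aside instead of deleting it.
# Prints the backup path on success, returns 1 if the folder does not exist.
backup_home() {
  local home_dir="${1%/}"
  if [ ! -d "$home_dir" ]; then
    echo "no such directory: $home_dir" >&2
    return 1
  fi
  local backup="$home_dir.bak.$(date +%Y%m%d%H%M%S)"
  mv "$home_dir" "$backup" && echo "$backup"
}
```

A possible sequence would be systemctl stop gaiad, then backup_home "$HOME/.gaia", then running the script again. Once the node is syncing, the backup folder can be removed by hand.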

openVietAnh commented 1 year ago

I reset the state and ran the script again. After a really long time showing Ensure peers and Saving addrbook, the node is now syncing:

Mar 02 20:47:07 testnet-cosmos cosmovisor[1866]: 8:47PM INF received proposal module=consensus proposal={"Type":32,"block_id":{"hash":"340191716FC0BD85BB74046FD78E1F6914C80F50D3F3F6B98A53FD9CBDB67016","parts":{"hash":"A7D9C5380FF2978D901B2BB36798829E056021082166E70DDCFA74F9428779AA","total":1}},"height":14831454,"pol_round":-1,"round":0,"signature":"+YGk8E+Sk2Dsb330YlijOwtwjKIKiO2rIms1CrsplfyhY+lx525W1vcAaInKLjjdynro+x8R7NLm8wCy4p7TAA==","timestamp":"2023-03-03T01:47:06.735656983Z"}
Mar 02 20:47:07 testnet-cosmos cosmovisor[1866]: 8:47PM INF received complete proposal block hash=340191716FC0BD85BB74046FD78E1F6914C80F50D3F3F6B98A53FD9CBDB67016 height=14831454 module=consensus
Mar 02 20:47:07 testnet-cosmos cosmovisor[1866]: 8:47PM INF finalizing commit of block hash={} height=14831454 module=consensus num_txs=1 root=9FE2E5FF3FBF89B219C533A1F9D8E7496A11E0DC8A57713DF04356D62909FD8D
Mar 02 20:47:07 testnet-cosmos cosmovisor[1866]: 8:47PM INF minted coins from module account amount=58442162uatom from=mint module=x/bank
Mar 02 20:47:07 testnet-cosmos cosmovisor[1866]: 8:47PM INF packet sent dst_channel=channel-105 dst_port=transfer module=x/ibc/channel sequence=1553 src_channel=channel-667 src_port=transfer
Mar 02 20:47:07 testnet-cosmos cosmovisor[1866]: 8:47PM INF IBC fungible token transfer amount=1092914 module=x/ibc-transfer receiver=persistence1524rhdvua2h90d8u2kxcgwp9r9vdy2gh9f4uzr sender=cosmos1524rhdvua2h90d8u2kxcgwp9r9vdy2ght9n0v8 token=uatom
Mar 02 20:47:07 testnet-cosmos cosmovisor[1866]: 8:47PM INF executed block height=14831454 module=state num_invalid_txs=0 num_valid_txs=1
Mar 02 20:47:07 testnet-cosmos cosmovisor[1866]: 8:47PM INF commit synced commit=436F6D6D697449447B5B32343020313931203932203736203820333320313039203939203136332039203335203735203233203139382031373120313220313520333020313332203131203134312031363220323230203233312031323020323620313834203137203520313639203436203234325D3A4532344635457D
Mar 02 20:47:07 testnet-cosmos cosmovisor[1866]: 8:47PM INF committed state app_hash=F0BF5C4C08216D63A309234B17C6AB0C0F1E840B8DA2DCE7781AB81105A92EF2 height=14831454 module=state num_txs=1
Mar 02 20:47:07 testnet-cosmos cosmovisor[1866]: 8:47PM INF indexed block exents height=14831454 module=txindex
Mar 02 20:47:12 testnet-cosmos cosmovisor[1866]: 8:47PM INF Timed out dur=4971.063479 height=14831455 module=consensus round=0 step=1
Mar 02 20:47:12 testnet-cosmos cosmovisor[1866]: 8:47PM INF client state updated client-id=07-tendermint-1037 height=1-10410816 module=x/ibc/client
Mar 02 20:47:12 testnet-cosmos cosmovisor[1866]: 8:47PM INF packet acknowledged dst_channel=channel-105 dst_port=transfer module=x/ibc/channel sequence=1551 src_channel=channel-667 src_port=transfer
Mar 02 20:47:12 testnet-cosmos cosmovisor[1866]: 8:47PM INF received proposal module=consensus proposal={"Type":32,"block_id":{"hash":"0D7470C6E89BA16E307126A6A0F3DA09DCA7233CA698C52A695EAF8D7B1DBC44","parts":{"hash":"15489F82C3140E4BB82B5DB32640327DB162F33CF0C0E13E6A023CA33B4B45E6","total":1}},"height":14831455,"pol_round":-1,"round":0,"signature":"EF+RHw6m/7Inyiz3ic51VHKAqih/tF4glHl0IJAOhLwzgFWBMHAn3L736eDEzt1/WnislzyzI10Lc5o/wO1kAg==","timestamp":"2023-03-03T01:47:12.243542157Z"}
Mar 02 20:47:12 testnet-cosmos cosmovisor[1866]: 8:47PM INF received complete proposal block hash=0D7470C6E89BA16E307126A6A0F3DA09DCA7233CA698C52A695EAF8D7B1DBC44 height=14831455 module=consensus
Mar 02 20:47:13 testnet-cosmos cosmovisor[1866]: 8:47PM INF finalizing commit of block hash={} height=14831455 module=consensus num_txs=0 root=F0BF5C4C08216D63A309234B17C6AB0C0F1E840B8DA2DCE7781AB81105A92EF2
Mar 02 20:47:13 testnet-cosmos cosmovisor[1866]: 8:47PM INF minted coins from module account amount=58442164uatom from=mint module=x/bank
Mar 02 20:47:13 testnet-cosmos cosmovisor[1866]: 8:47PM INF executed block height=14831455 module=state num_invalid_txs=0 num_valid_txs=0

Thanks a lot for your help @dasanchez, it was really helpful and I really appreciate it. May I ask why Ensure peers took so long? For almost half a day the node only showed INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=4 numToDial=6, with numOutPeers and numToDial changing every few hours.

dasanchez commented 1 year ago

@vietanhtran2710 that's great to hear! What are your machine specs? The node may have been busy parsing the genesis file, but half a day sounds like a lot.

openVietAnh commented 1 year ago

My testnet server has 2 vCPUs and 8 GB of RAM, and my mainnet server has 4 vCPUs, 16 GB of RAM, and a 2 TB HDD. The mainnet node log has been showing

10:38AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=10 numToDial=0
10:39AM INF Saving AddrBook to file book=/var/local/mainnet-cosmos/config/addrbook.json module=p2p size=4712

for almost 3 days now and nothing else. Both servers are using state sync. Does a node still have to use the genesis file in this mode?

dasanchez commented 1 year ago

@vietanhtran2710 Thanks for sharing! Given the current size of the testnet, we recommend at least 16 GB of RAM. For joining mainnet, you could try using a snapshot. Polkachu offers mainnet snapshots and a state sync endpoint. We always have the genesis file for the relevant network in place when starting a node. For mainnet, you can find the genesis file URL here.

yuanzd123 commented 1 year ago

Hello @dasanchez, why can't I sync within 10 minutes using state sync? My node has been stuck at the log output below for a day. numToDial seems to go from 3 back up to 10. Is this normal behavior?

I use a MacBook Pro (M1 Max, 32 GB RAM, 1 TB SSD) to run testnet gaiad (v9.1.0). Since I am on a Mac, I cannot use systemctl. Instead, I use gaiad start --x-crisis-skip-assert-invariants to start my node.

My config.toml:

[statesync]
enable = true
rpc_servers = "https://rpc.state-sync-01.theta-testnet.polypore.xyz:443,https://rpc.state-sync-02.theta-testnet.polypore.xyz:443"
trust_height = 15989044
trust_hash = "D7B35A3D5D266E4F40555A5F82E47448135256A7E607F52D54BB1FA43F57B39D"
trust_period = "168h0m0s"

seeds = "639d50339d7045436c756a042906b9a69970913f@seed-01.theta-testnet.polypore.xyz:26656,3e506472683ceb7ed75c1578d092c79785c27857@seed-02.theta-testnet.polypore.xyz:26656"
persistent_peers = ""

6:42AM INF Saving AddrBook to file book=/Users/zhouyuan/.gaia/config/addrbook.json module=p2p size=1003
6:42AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=7 numToDial=3
6:43AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=7 numToDial=3
6:43AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=7 numToDial=3
6:44AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=7 numToDial=3
6:44AM INF Saving AddrBook to file book=/Users/zhouyuan/.gaia/config/addrbook.json module=p2p size=1003
6:44AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=7 numToDial=3
6:45AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=7 numToDial=3
6:45AM INF Connection is closed @ recvRoutine (likely by the other side) conn={"Logger":{}} module=p2p peer={"id":"39de2434a983feb64b6b3c4886d1959334f90591","ip":"82.100.58.103","port":26656}
6:45AM INF service stop impl={"Logger":{}} module=p2p msg={} peer={"id":"39de2434a983feb64b6b3c4886d1959334f90591","ip":"82.100.58.103","port":26656}
6:45AM ERR Stopping peer for error err=EOF module=p2p peer={"Data":{},"Logger":{}}
6:45AM INF service stop impl={"Data":{},"Logger":{}} module=p2p msg={} peer={"id":"39de2434a983feb64b6b3c4886d1959334f90591","ip":"82.100.58.103","port":26656}
6:45AM ERR Connection failed @ sendRoutine conn={"Logger":{}} err="pong timeout" module=p2p peer={"id":"58c2741e699d44c866802a2bf45fe5f621acf36b","ip":"35.239.129.49","port":26656}
6:45AM INF service stop impl={"Logger":{}} module=p2p msg={} peer={"id":"58c2741e699d44c866802a2bf45fe5f621acf36b","ip":"35.239.129.49","port":26656}
6:45AM ERR Stopping peer for error err="pong timeout" module=p2p peer={"Data":{},"Logger":{}}
6:45AM INF service stop impl={"Data":{},"Logger":{}} module=p2p msg={} peer={"id":"58c2741e699d44c866802a2bf45fe5f621acf36b","ip":"35.239.129.49","port":26656}
6:45AM INF Connection is closed @ recvRoutine (likely by the other side) conn={"Logger":{}} module=p2p peer={"id":"4f1626568572dd6be6b3b5478acac01376fc4729","ip":"142.132.241.67","port":26110}
6:45AM INF service stop impl={"Logger":{}} module=p2p msg={} peer={"id":"4f1626568572dd6be6b3b5478acac01376fc4729","ip":"142.132.241.67","port":26110}
6:45AM ERR Stopping peer for error err=EOF module=p2p peer={"Data":{},"Logger":{}}
6:45AM INF service stop impl={"Data":{},"Logger":{}} module=p2p msg={} peer={"id":"4f1626568572dd6be6b3b5478acac01376fc4729","ip":"142.132.241.67","port":26110}
6:45AM ERR Connection failed @ sendRoutine conn={"Logger":{}} err="pong timeout" module=p2p peer={"id":"33405c22f56348c557be86c46ce5aafedaa30f40","ip":"95.217.172.214","port":26110}
6:45AM INF service stop impl={"Logger":{}} module=p2p msg={} peer={"id":"33405c22f56348c557be86c46ce5aafedaa30f40","ip":"95.217.172.214","port":26110}
6:45AM ERR Stopping peer for error err="pong timeout" module=p2p peer={"Data":{},"Logger":{}}
6:45AM INF service stop impl={"Data":{},"Logger":{}} module=p2p msg={} peer={"id":"33405c22f56348c557be86c46ce5aafedaa30f40","ip":"95.217.172.214","port":26110}
6:45AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=3 numToDial=7
6:45AM ERR Connection failed @ sendRoutine conn={"Logger":{}} err="pong timeout" module=p2p peer={"id":"3e506472683ceb7ed75c1578d092c79785c27857","ip":"104.245.147.203","port":26656}
6:45AM INF service stop impl={"Logger":{}} module=p2p msg={} peer={"id":"3e506472683ceb7ed75c1578d092c79785c27857","ip":"104.245.147.203","port":26656}
6:45AM ERR Stopping peer for error err="pong timeout" module=p2p peer={"Data":{},"Logger":{}}
6:45AM INF service stop impl={"Data":{},"Logger":{}} module=p2p msg={} peer={"id":"3e506472683ceb7ed75c1578d092c79785c27857","ip":"104.245.147.203","port":26656}
6:45AM INF Connection is closed @ recvRoutine (likely by the other side) conn={"Logger":{}} module=p2p peer={"id":"f28e8ebc62a9db90b7b021090f234f361d4822a9","ip":"89.149.218.150","port":26656}
6:45AM INF service stop impl={"Logger":{}} module=p2p msg={} peer={"id":"f28e8ebc62a9db90b7b021090f234f361d4822a9","ip":"89.149.218.150","port":26656}
6:45AM ERR Stopping peer for error err=EOF module=p2p peer={"Data":{},"Logger":{}}
6:45AM INF service stop impl={"Data":{},"Logger":{}} module=p2p msg={} peer={"id":"f28e8ebc62a9db90b7b021090f234f361d4822a9","ip":"89.149.218.150","port":26656}
6:45AM ERR Connection failed @ sendRoutine conn={"Logger":{}} err="pong timeout" module=p2p peer={"id":"fc87b38b4e63332fe55cd9a185180a2717e073fc","ip":"167.235.105.28","port":26110}
6:45AM INF service stop impl={"Logger":{}} module=p2p msg={} peer={"id":"fc87b38b4e63332fe55cd9a185180a2717e073fc","ip":"167.235.105.28","port":26110}
6:45AM ERR Stopping peer for error err="pong timeout" module=p2p peer={"Data":{},"Logger":{}}
6:45AM INF service stop impl={"Data":{},"Logger":{}} module=p2p msg={} peer={"id":"fc87b38b4e63332fe55cd9a185180a2717e073fc","ip":"167.235.105.28","port":26110}
6:46AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=0 numToDial=10
6:46AM INF Saving AddrBook to file book=/Users/zhouyuan/.gaia/config/addrbook.json module=p2p size=1003
6:46AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=0 numToDial=10
6:47AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=0 numToDial=10
6:47AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=0 numToDial=10
6:48AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=0 numToDial=10
6:48AM INF Saving AddrBook to file book=/Users/zhouyuan/.gaia/config/addrbook.json module=p2p size=1003
6:48AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=0 numToDial=10
6:49AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=0 numToDial=10
6:49AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=0 numToDial=10
6:50AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=0 numToDial=10
6:50AM INF service start impl="Peer{MConn{142.132.241.67:26110} 4f1626568572dd6be6b3b5478acac01376fc4729 out}" module=p2p msg={} peer={"id":"4f1626568572dd6be6b3b5478acac01376fc4729","ip":"142.132.241.67","port":26110}
6:50AM INF service start impl=MConn{142.132.241.67:26110} module=p2p msg={} peer={"id":"4f1626568572dd6be6b3b5478acac01376fc4729","ip":"142.132.241.67","port":26110}
6:50AM INF Saving AddrBook to file book=/Users/zhouyuan/.gaia/config/addrbook.json module=p2p size=1003
6:50AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=1 numToDial=9
6:51AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=1 numToDial=9
6:51AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=1 numToDial=9
6:52AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=1 numToDial=9
6:52AM INF service start impl="Peer{MConn{104.245.147.203:26656} 3e506472683ceb7ed75c1578d092c79785c27857 out}" module=p2p msg={} peer={"id":"3e506472683ceb7ed75c1578d092c79785c27857","ip":"104.245.147.203","port":26656}
6:52AM INF service start impl=MConn{104.245.147.203:26656} module=p2p msg={} peer={"id":"3e506472683ceb7ed75c1578d092c79785c27857","ip":"104.245.147.203","port":26656}
6:52AM INF Saving AddrBook to file book=/Users/zhouyuan/.gaia/config/addrbook.json module=p2p size=1003
6:52AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=2 numToDial=8
6:53AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=2 numToDial=8
6:53AM INF service start impl="Peer{MConn{89.149.218.150:26656} f28e8ebc62a9db90b7b021090f234f361d4822a9 out}" module=p2p msg={} peer={"id":"f28e8ebc62a9db90b7b021090f234f361d4822a9","ip":"89.149.218.150","port":26656}
6:53AM INF service start impl=MConn{89.149.218.150:26656} module=p2p msg={} peer={"id":"f28e8ebc62a9db90b7b021090f234f361d4822a9","ip":"89.149.218.150","port":26656}
6:53AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=3 numToDial=7
6:54AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=3 numToDial=7
6:54AM INF Saving AddrBook to file book=/Users/zhouyuan/.gaia/config/addrbook.json module=p2p size=1003
6:54AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=3 numToDial=7
6:55AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=3 numToDial=7
6:55AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=3 numToDial=7
6:56AM INF Ensure peers module=pex numDialing=0 numInPeers=0 numOutPeers=3 numToDial=7
6:56AM INF Saving AddrBook to file book=/Users/zhouyuan/.gaia/config/addrbook.json module=p2p size=1003
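
In every Ensure peers entry quoted in this thread, numOutPeers and numToDial add up to 10. That is consistent with Tendermint's default max_num_outbound_peers of 10, though this is an assumption since the [p2p] settings are not shown. Read that way, numToDial going from 3 back to 10 means the outbound peers were dropped, which matches the EOF and pong timeout entries in between. A minimal sketch for condensing such a log into its peer-count transitions, with the hypothetical helper name peer_counts:

```shell
# Hypothetical helper: collapse repeated "Ensure peers" entries into counts.
# Reads log text from the given files, or from stdin when none are given.
peer_counts() {
  # -o prints each match on its own line, so it also works on logs that were
  # pasted as one long line; -h suppresses file name prefixes.
  grep -oh 'numOutPeers=[0-9]* numToDial=[0-9]*' "$@" | uniq -c
}
```

Each output line is the number of consecutive repeats followed by the numOutPeers and numToDial pair, so a peer drop shows up as a new line with a lower numOutPeers. One way to feed it on a Mac without systemctl would be to redirect gaiad's output to a file and run peer_counts on that file.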

dasanchez commented 1 year ago

Hi @yuanzd123 ! Could you try syncing with Gaia v9.0.3, please? We haven't upgraded the theta-testnet-001 chain to v9.1.0 yet.

yuanzd123 commented 1 year ago

@dasanchez Thanks! It worked! If I want to set up a sentry node, do I just use basically the same settings as my local node on an AWS EC2 instance, change my local node and cloud node configs as shown below, and start both nodes? Is that it?

image

mechul-eth commented 1 year ago

Could you please try these?

sudo systemctl stop gaiad

PEERS=e281bdf052dad68ccf40777cb7d25649a5b9fa26@207.180.219.160:36656,ef515ee185ed2ae6cfb012da83431420273c53f9@136.243.88.91:2320,a2491114d865ecf0a29f46cec3c3c9c056979e83@194.163.159.174:26656,4743cf09e278aa311f8cc282804d788c55288fa5@3.144.162.101:26656,8e0a92711c0233fa74caa36488e3ff22ea6dcb11@74.118.136.163:26656,2c08aa7bc9e94304225ada5ddc30374f00942a90@138.201.204.5:43656,ad5c0ab231f9b0ed91ffbaee70fd082fd5e78ad4@65.109.85.225:2010,29bc3833f3584eb795fc28653021cfa25d9bb9c6@82.100.58.103:30156,3f31c6038c69737d90f938e98438d20a5a3c0e03@5.75.245.174:26656,a86f0c6f503b728cbd48218462dbee10d1ebea85@3.76.85.22:26656,078bb821afec3a3374c1a26c466532b30f66190a@65.109.91.165:26656,f74e384e48bb78d566297eb502f8059798bfe2e5@135.181.16.163:26001,08ec17e86dac67b9da70deb20177655495a55407@147.182.145.105:26656,794fcb57bb76c50515f31dc8e0e8d6536dea859d@178.239.197.182:26656,d1752a3dcfc9d3169c47853a82fe0d1ec79c0024@147.182.145.100:26656,2119a889318d668e798b74346db6af760405198e@89.116.31.184:26656,d13d77428697308eacb1a6a33b42f72650bc511e@80.64.208.139:26656,abb2ddadc12f9135209d1dd03b3707f649ecbb7a@147.182.145.88:26656,328e0627172add338f6aed08600098a9308dc52d@147.182.145.103:26656,f3208a4fc74c9f7326faabf2551d93d6e1fe69c2@222.106.187.13:40800,08cf0f37ca069d9f4027b0b6cb406c40c9fabb16@51.91.118.140:26656,359d63178736911e3e4c716f2491cafaa687351a@34.168.147.251:26656,c32a8a97ded5c536f981cf922592f9a0ccf89e84@51.159.223.25:26656,a64c012c9312dec9b360400321f7377d1ad42987@95.216.137.166:26656,3d2516052fd8b134428971d1218a149bba6e44be@34.83.139.196:26656,233598946a15427b9541376e7cfc30dab07c4327@34.168.95.243:26656,257665212d93cfe4a354eb78d500137995ec5e3f@95.217.144.107:14956,032ac421764cdf5139e64510669cc519fe1e1193@37.120.245.83:26656,9b44ecbab529ae70cb053743f229c7e0b7cd4917@63.250.53.161:26656,113e7de8adadf968fe2e1c67ea839fe378e176bf@142.132.203.60:26656,f480e395153941122a0906e4d2158f722b6d11f4@65.109.108.47:14956,99ad87e4419cbea7c59b27e77442a457eda1dc21@65.21.202.61:25007,b7d0bd260fca7a5a19c7631b15f6068891faa60e@143.198.45.216:25001,410f97757363c3e9bd5c39e32e05edf62a718cf3@65.109.86.49:26656,cd1cd8d95132857ae14825428e55eaffea36a597@195.14.6.2:26656,f2520026fb9086f1b2f09e132d209cbe88064ec1@146.190.161.210:25003,a2cfd24ca641a6d407b03d98595f4755b349df61@141.94.138.48:26676,49d75c6094c006b6f2758e45457c1f3d6002ce7a@167.172.155.44:25002,dab60f8d7f66e9a4c733c0bcfe145cfca6e331c2@209.182.237.121:28656,ab6375b396b09e3648b4d12c4b2b1aefdd8f4d4a@185.252.220.89:26656,328a00ee256b3219e018a33b6cc124bc8b44249a@89.58.32.218:26656,4ea6e56300a2f37b90e58de5ee27d1c9065cf871@146.190.252.36:26656,e121e8bed6710d05e45f5a2ddf88107b360281c5@142.132.156.189:26656,5a4475fe23124a5cabd13d27ce14eccedb2ba1b5@141.95.103.138:26656

sed -i.bak -e "s/^persistent_peers *=.*/persistent_peers = \"$PEERS\"/" $HOME/.gaia/config/config.toml

gaiad tendermint unsafe-reset-all --home $HOME/.gaia --keep-addr-book

sudo systemctl restart gaiad && journalctl -fu gaiad -o cat
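
A peer list this long is easy to damage when copying, for example when a node ID gets wrapped across two lines. As a minimal sketch, the list could be checked before it is written into config.toml. The helper name check_peers is hypothetical, and the pattern assumes node IDs are 40 lowercase hex characters followed by @host:port, as the entries in this thread appear to be.

```shell
# Hypothetical helper: report persistent_peers entries that are not of the
# form <40 hex chars>@<host>:<port>. Returns 1 if any entry is malformed.
check_peers() {
  [ -z "$1" ] && return 0   # an empty list is valid
  local bad
  # Split on commas and keep only the entries that do NOT match the pattern.
  bad="$(printf '%s\n' "$1" | tr ',' '\n' | grep -Ev '^[0-9a-f]{40}@[^:@]+:[0-9]+$' || true)"
  if [ -n "$bad" ]; then
    printf '%s\n' "$bad" | sed 's/^/malformed peer entry: /' >&2
    return 1
  fi
  return 0
}
```

Running check_peers "$PEERS" before the sed command would print any malformed entries to stderr and return a non-zero status, so the config edit can be skipped in that case.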