telosnetwork / leap

C++ implementation of the Antelope protocol

Node stops if subst manifest task encounters a network fault #31

Open guilledk opened 3 months ago

guilledk commented 3 months ago

When the background task that updates the substitution manifest from its remote source fails, the node currently shuts down completely (although gracefully). Here are the logs from one such event:

info  2024-05-19T01:51:18.474 net-2     net_plugin.cpp:4710           connection_monitor   ] p2p client connections: 0/25, peer connections: 2/2, block producer peers: 0
info  2024-05-19T01:51:18.480 nodeos    producer_plugin.cpp:750       on_incoming_block    ] Received block 374febec5bb58f18... #342223363 @ 2024-05-19T01:51:18.500 signed by persiantelos [trxs: 0, lib: 342223037, confirmed: 0, net: 0, cpu: 100, elapsed: 148, time: 569, latency: -19 ms]
info  2024-05-19T01:51:19.000 nodeos    producer_plugin.cpp:750       on_incoming_block    ] Received block 8f2ccdc42ab61922... #342223364 @ 2024-05-19T01:51:19.000 signed by persiantelos [trxs: 1, lib: 342223037, confirmed: 0, net: 168, cpu: 200, elapsed: 193, time: 723, latency: 0 ms]
info  2024-05-19T01:51:19.481 nodeos    producer_plugin.cpp:750       on_incoming_block    ] Received block a1c14b41642a7c3e... #342223365 @ 2024-05-19T01:51:19.500 signed by persiantelos [trxs: 0, lib: 342223037, confirmed: 0, net: 0, cpu: 100, elapsed: 172, time: 578, latency: -18 ms]
info  2024-05-19T01:51:19.828 nodeos    subst_plugin.cpp:134          operator()           ] trigger manifest update
info  2024-05-19T01:51:19.828 nodeos    substitution_context.c:321    fetch_manifest       ] fetching manifest at http://evmwasms.s3.amazonaws.com/subst.json
warn  2024-05-19T01:51:23.714 net-2     net_plugin.cpp:3520           handle_message       ] ["kbp-fullship:9879 - aaf509b" - 1 141.193.240.11:9879] Clock offset is 853028us, calculation: (rec 1716083483708627000 - org 1716083481997175000 + xmt 1716083483708627000 - dst 1716083483714023000)/2
info  2024-05-19T01:51:36.267 net-1     net_plugin.cpp:2978           operator()           ] ["telos.p2p.eosusa.io:9876 - 223e35e" - 2 35.131.184.46:9876] Peer closed connection
error 2024-05-19T01:51:36.267 net-1     net_plugin.cpp:3006           operator()           ] ["telos.p2p.eosusa.io:9876 - 223e35e" - 2 35.131.184.46:9876] Closing connection
info  2024-05-19T01:51:36.267 net-1     net_plugin.cpp:1461           _close               ] ["telos.p2p.eosusa.io:9876 - 223e35e" - 2 35.131.184.46:9876] closing
info  2024-05-19T01:51:36.368 net-2     net_plugin.cpp:4710           connection_monitor   ] p2p client connections: 0/25, peer connections: 2/2, block producer peers: 0
info  2024-05-19T01:51:36.412 net-1     net_plugin.cpp:1583           operator()           ] ["telos.p2p.eosusa.io:9876 - 223e35e" - 2 35.131.184.46:9876] Sending handshake generation 1, lib 342223037, head 342223365, id a1c14b41642a7c3e
info  2024-05-19T01:51:36.460 net-2     net_plugin.cpp:2405           sync_recv_notice     ] ["telos.p2p.eosusa.io:9876 - 223e35e" - 2 35.131.184.46:9876] notice_message, pending 342223398, blk_num 342223398, id cef3f047e53fca9b...
info  2024-05-19T01:51:36.460 net-2     net_plugin.cpp:2369           verify_catchup       ] ["telos.p2p.eosusa.io:9876 - 223e35e" - 2 35.131.184.46:9876] catch_up while in in sync, fork head num = 342223398 target LIB = 342082433 next_expected = 342082434, id cef3f047e53fca9b...
info  2024-05-19T01:51:36.460 net-2     net_plugin.cpp:2035           set_state            ] old state in sync becoming head catchup
info  2024-05-19T01:51:36.460 net-2     net_plugin.cpp:3374           handle_message       ] ["telos.p2p.eosusa.io:9876 - 223e35e" - 2 35.131.184.46:9876] Local network version: 7
info  2024-05-19T01:51:36.465 net-2     net_plugin.cpp:2309           recv_handshake       ] ["telos.p2p.eosusa.io:9876 - 223e35e" - 2 35.131.184.46:9876] handshake lib 342223073, head 342223398, head id cef3f047e53fca9b.. sync 3, head 342223365, lib 342223037
info  2024-05-19T01:51:36.465 net-2     net_plugin.cpp:2384           verify_catchup       ] ["telos.p2p.eosusa.io:9876 - 223e35e" - 2 35.131.184.46:9876] none notice while in head catchup, fork head num = 342223398, id cef3f047e53fca9b...
info  2024-05-19T01:51:36.509 net-0     net_plugin.cpp:2035           set_state            ] old state head catchup becoming in sync
info  2024-05-19T01:51:36.509 net-0     net_plugin.cpp:1583           operator()           ] ["telos.p2p.eosusa.io:9876 - 223e35e" - 2 35.131.184.46:9876] Sending handshake generation 2, lib 342223037, head 342223365, id a1c14b41642a7c3e
info  2024-05-19T01:51:36.509 net-3     net_plugin.cpp:1583           operator()           ] ["kbp-fullship:9879 - aaf509b" - 1 141.193.240.11:9879] Sending handshake generation 9, lib 342223037, head 342223365, id a1c14b41642a7c3e
info  2024-05-19T01:51:36.522 net-3     net_plugin.cpp:2405           sync_recv_notice     ] ["kbp-fullship:9879 - aaf509b" - 1 141.193.240.11:9879] notice_message, pending 342223399, blk_num 342223399, id c77e76d7ea70987b...
info  2024-05-19T01:51:36.523 net-3     net_plugin.cpp:2369           verify_catchup       ] ["kbp-fullship:9879 - aaf509b" - 1 141.193.240.11:9879] catch_up while in in sync, fork head num = 342223399 target LIB = 342223073 next_expected = 342082434, id c77e76d7ea70987b...
info  2024-05-19T01:51:36.523 net-3     net_plugin.cpp:2035           set_state            ] old state in sync becoming head catchup
info  2024-05-19T01:51:36.558 net-1     net_plugin.cpp:2405           sync_recv_notice     ] ["telos.p2p.eosusa.io:9876 - 223e35e" - 2 35.131.184.46:9876] notice_message, pending 342223399, blk_num 342223399, id c77e76d7ea70987b...
info  2024-05-19T01:51:36.558 net-1     net_plugin.cpp:2384           verify_catchup       ] ["telos.p2p.eosusa.io:9876 - 223e35e" - 2 35.131.184.46:9876] none notice while in head catchup, fork head num = 342223399, id c77e76d7ea70987b...
info  2024-05-19T01:52:06.368 net-1     net_plugin.cpp:4710           connection_monitor   ] p2p client connections: 0/25, peer connections: 2/2, block producer peers: 0
info  2024-05-19T01:52:36.369 net-1     net_plugin.cpp:4710           connection_monitor   ] p2p client connections: 0/25, peer connections: 2/2, block producer peers: 0
info  2024-05-19T01:53:06.369 net-2     net_plugin.cpp:4710           connection_monitor   ] p2p client connections: 0/25, peer connections: 2/2, block producer peers: 0
info  2024-05-19T01:53:36.370 net-2     net_plugin.cpp:4710           connection_monitor   ] p2p client connections: 0/25, peer connections: 2/2, block producer peers: 0
Caught application loop exception: "Assert Exception"
info  2024-05-19T01:53:45.467 nodeos    resource_monitor_plugi:117    plugin_shutdown      ] entered shutdown...
info  2024-05-19T01:53:45.467 nodeos    resource_monitor_plugi:119    plugin_shutdown      ] exiting shutdown
info  2024-05-19T01:53:45.469 nodeos    producer_plugin.cpp:1411      plugin_shutdown      ] exit shutdown
info  2024-05-19T01:53:45.469 nodeos    net_plugin.cpp:4382           plugin_shutdown      ] shutdown..
info  2024-05-19T01:53:45.469 nodeos    net_plugin.cpp:4570           close_all            ] close all 2 connections
info  2024-05-19T01:53:45.470 nodeos    net_plugin.cpp:4386           plugin_shutdown      ] exit shutdown
info  2024-05-19T01:53:45.576 nodeos    http_plugin.cpp:515           plugin_shutdown      ] exit shutdown
info  2024-05-19T01:53:45.904 nodeos    main.cpp:155                  operator()           ] nodeos version v5.0.2 v5.0.2-f0d76b76cd732ee1e00e1161c9ebbae51bbb8a7c-dirty
info  2024-05-19T01:53:46.149 nodeos    main.cpp:62                   log_non_default_opti ] Non-default options: disable-replay-opts, data-dir = /data/telosevm/data/nodeosv5_data, config-dir = /data/telosevm/nodeosv5Sub, http-server-address = 0.0.0.0:8889, p2p-listen-endpoint = 0.0.0.0:9878, agent-name = "sub5.0.1-mainnet", wasm-runtime = eos-vm-jit, eos-vm-oc-compile-threads = 4, eos-vm-oc-enable = true, read-only-read-window-time-us = 2000000, enable-account-queries = true, chain-state-db-size-mb = 65536, contracts-console = true, access-control-allow-origin = *, access-control-allow-headers = *, verbose-http-errors = true, http-validate-host = false, abi-serializer-max-time-ms = 20000, http-max-response-time-ms = 10000, max-transaction-time = 499, p2p-max-nodes-per-host = 100, plugin = eosio::http_plugin, plugin = eosio::chain_plugin, plugin = eosio::chain_api_plugin, plugin = eosio::subst_plugin, subst-manifest = http://evmwasms.s3.amazonaws.com/subst.json, plugin = eosio::net_plugin, plugin = eosio::producer_plugin, plugin = eosio::state_history_plugin, state-history-endpoint = 0.0.0.0:19001, trace-history = true, chain-state-history = true, trace-history-debug-mode = true, state-history-dir = state-history, p2p-peer-address = telos.p2p.eosusa.io:9876, p2p-peer-address = 141.193.240.11:9879
error 2024-05-19T01:53:46.150 nodeos    main.cpp:209                  main                 ] 10 assert_exception: Assert Exception
!ec: Failed to connect: Connection timed out
    {"message":"Connection timed out"}
    nodeos  http_client.cpp:166 create_raw_connection

We want the task to ignore network errors and retry at the next interval, because most of the time these are random, intermittent, temporary failures.
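A minimal sketch of the intended behavior, not the actual subst_plugin code: one update attempt is wrapped so that a network fault is logged and swallowed, leaving the periodic trigger free to fire again at the next interval. The names `network_error`, `update_result` and `try_manifest_update` are hypothetical; in the logs above the failure surfaces as an `assert_exception` from `http_client.cpp`, so the real catch clause would need to match whatever that path throws.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical exception type standing in for whatever the HTTP client
// throws on a connect or read failure. The real type may differ.
struct network_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Outcome of a single update attempt (hypothetical).
enum class update_result { updated, skipped_network_fault };

// Runs one manifest update attempt. A network fault is logged and swallowed
// so the caller's periodic trigger can simply try again at the next interval.
// Any other exception still propagates to the caller unchanged.
update_result try_manifest_update(const std::function<void()>& fetch_and_apply) {
    assert(fetch_and_apply && "fetch_and_apply must be callable");
    try {
        fetch_and_apply();
        return update_result::updated;
    } catch (const network_error& e) {
        std::cerr << "warn: manifest update failed, retrying at next interval: "
                  << e.what() << '\n';
        return update_result::skipped_network_fault;
    }
}
```

Catching only the network-specific exception is deliberate in this sketch: a malformed manifest or a failed internal invariant would still propagate, so only transient connectivity problems are treated as retryable.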