casper-network / casper-node

Reference client for CASPER protocol
https://casper.network
Apache License 2.0
393 stars 223 forks source link

Key issue while joining delta testnet #495

Closed tmesser closed 3 years ago

tmesser commented 4 years ago

While attempting to join the delta testnet, I provided public hex key 01329cc15cb92fca5d5ce6d7fcc881010161547f452e2caad4dff565708b00b8c7, which was successfully funded. Upon launching the node and attempting to bond, I received some warnings of this form:

Nov  5 14:51:39 casper-node casper-node[7048]: {"timestamp":"Nov 05 14:51:39.169","level":"INFO","fields":{"message":"9be8..2d7b: connection closed","peer_id":"2725..4368","peer_address":"135.181.73.218:34858"},"target":"casper_node::components::small_network"}
Nov  5 14:51:39 casper-node casper-node[7048]: {"timestamp":"Nov 05 14:51:39.169","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection closed","peer_id":"2725..4368","peer_address":"135.181.73.218:34553"},"target":"casper_node::components::small_network"}
Nov  5 14:53:49 casper-node casper-node[7048]: {"timestamp":"Nov 05 14:53:49.348","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection failed","peer_id":"725c..21c8","peer_address":"135.181.73.218:34553","err":"failed to send message"},"target":"casper_node::components::small_network"}
Nov  5 14:57:36 casper-node casper-node[7048]: {"timestamp":"Nov 05 14:57:36.031","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection failed","peer_id":"99c4..8c92","peer_address":"135.181.73.218:34553","err":"failed to send message"},"target":"casper_node::components::small_network"}
Nov  5 15:01:36 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:01:36.250","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection failed","peer_id":"4b5b..adef","peer_address":"135.181.73.218:34553","err":"failed to send message"},"target":"casper_node::components::small_network"}
Nov  5 15:06:30 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:06:30.501","level":"WARN","fields":{"message":"9be8..2d7b: receiving message failed, closing connection","err":"Connection reset by peer (os error 104)","peer_id":"4712..58c3"},"target":"casper_node::components::small_network"}
Nov  5 15:06:30 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:06:30.501","level":"INFO","fields":{"message":"9be8..2d7b: connection closed","peer_id":"4712..58c3","peer_address":"135.181.73.218:43550"},"target":"casper_node::components::small_network"}
Nov  5 15:06:30 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:06:30.501","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection closed","peer_id":"4712..58c3","peer_address":"135.181.73.218:34553"},"target":"casper_node::components::small_network"}
Nov  5 15:11:32 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:11:32.363","level":"WARN","fields":{"message":"9be8..2d7b: receiving message failed, closing connection","err":"Connection reset by peer (os error 104)","peer_id":"6e1a..8a85"},"target":"casper_node::components::small_network"}
Nov  5 15:11:32 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:11:32.363","level":"INFO","fields":{"message":"9be8..2d7b: connection closed","peer_id":"6e1a..8a85","peer_address":"164.68.114.22:40472"},"target":"casper_node::components::small_network"}
Nov  5 15:11:32 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:11:32.364","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection closed","peer_id":"6e1a..8a85","peer_address":"164.68.114.22:34553"},"target":"casper_node::components::small_network"}
Nov  5 15:13:42 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:13:42.144","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection failed","peer_id":"b56a..6ef1","peer_address":"164.68.114.22:34553","err":"failed to send message"},"target":"casper_node::components::small_network"}
Nov  5 15:15:42 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:15:42.292","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection failed","peer_id":"f9ce..270e","peer_address":"164.68.114.22:34553","err":"failed to send message"},"target":"casper_node::components::small_network"}
Nov  5 15:16:15 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:16:15.752","level":"WARN","fields":{"message":"9be8..2d7b: receiving message failed, closing connection","err":"Connection reset by peer (os error 104)","peer_id":"3937..faf0"},"target":"casper_node::components::small_network"}
Nov  5 15:16:15 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:16:15.753","level":"INFO","fields":{"message":"9be8..2d7b: connection closed","peer_id":"3937..faf0","peer_address":"135.181.2.250:41402"},"target":"casper_node::components::small_network"}
Nov  5 15:16:15 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:16:15.753","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection closed","peer_id":"3937..faf0","peer_address":"135.181.2.250:34553"},"target":"casper_node::components::small_network"}
Nov  5 15:17:48 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:17:48.079","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection failed","peer_id":"e8f1..99eb","peer_address":"164.68.114.22:34553","err":"failed to send message"},"target":"casper_node::components::small_network"}
Nov  5 15:19:48 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:19:48.185","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection failed","peer_id":"3024..ecc5","peer_address":"164.68.114.22:34553","err":"failed to send message"},"target":"casper_node::components::small_network"}
Nov  5 15:21:48 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:21:48.511","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection failed","peer_id":"7ddf..85d6","peer_address":"164.68.114.22:34553","err":"failed to send message"},"target":"casper_node::components::small_network"}
Nov  5 15:23:49 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:23:49.251","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection failed","peer_id":"fe6a..d28b","peer_address":"164.68.114.22:34553","err":"failed to send message"},"target":"casper_node::components::small_network"}
Nov  5 15:35:32 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:35:32.681","level":"WARN","fields":{"message":"9be8..2d7b: receiving message failed, closing connection","err":"Connection reset by peer (os error 104)","peer_id":"1a63..5169"},"target":"casper_node::components::small_network"}
Nov  5 15:35:32 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:35:32.681","level":"INFO","fields":{"message":"9be8..2d7b: connection closed","peer_id":"1a63..5169","peer_address":"135.181.2.250:37742"},"target":"casper_node::components::small_network"}
Nov  5 15:35:32 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:35:32.681","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection closed","peer_id":"1a63..5169","peer_address":"135.181.2.250:34553"},"target":"casper_node::components::small_network"}
Nov  5 15:42:28 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:42:28.365","level":"WARN","fields":{"message":"9be8..2d7b: receiving message failed, closing connection","err":"Connection reset by peer (os error 104)","peer_id":"8cd6..a450"},"target":"casper_node::components::small_network"}
Nov  5 15:42:28 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:42:28.365","level":"INFO","fields":{"message":"9be8..2d7b: connection closed","peer_id":"8cd6..a450","peer_address":"164.68.114.22:41120"},"target":"casper_node::components::small_network"}
Nov  5 15:42:28 casper-node casper-node[7048]: {"timestamp":"Nov 05 15:42:28.365","level":"WARN","fields":{"message":"9be8..2d7b: outgoing connection closed","peer_id":"8cd6..a450","peer_address":"164.68.114.22:34553"},"target":"casper_node::components::small_network"}

Which soon turned into a ton of errors of this form:

Nov  5 17:00:13 casper-node casper-node[7048]: {"timestamp":"Nov 05 17:00:13.753","level":"ERROR","fields":{"message":"invalid incoming message to consensus instance","sender":"63dc..34cb","error":"asymmetric key error: failed to construct Ed25519 signature from []"},"target":"casper_node::components::consensus::era_supervisor"}
Nov  5 17:00:13 casper-node casper-node[7048]: {"timestamp":"Nov 05 17:00:13.754","level":"ERROR","fields":{"message":"invalid incoming message to consensus instance","sender":"f1f7..7af1","error":"asymmetric key error: failed to construct Ed25519 signature from []"},"target":"casper_node::components::consensus::era_supervisor"}
Nov  5 17:00:13 casper-node casper-node[7048]: {"timestamp":"Nov 05 17:00:13.754","level":"ERROR","fields":{"message":"invalid incoming message to consensus instance","sender":"6ab1..2807","error":"asymmetric key error: failed to construct Ed25519 signature from []"},"target":"casper_node::components::consensus::era_supervisor"}
Nov  5 17:00:13 casper-node casper-node[7048]: {"timestamp":"Nov 05 17:00:13.754","level":"ERROR","fields":{"message":"invalid incoming message to consensus instance","sender":"d353..5ddf","error":"asymmetric key error: failed to construct Ed25519 signature from []"},"target":"casper_node::components::consensus::era_supervisor"}
Nov  5 17:00:13 casper-node casper-node[7048]: {"timestamp":"Nov 05 17:00:13.754","level":"ERROR","fields":{"message":"invalid incoming message to consensus instance","sender":"0269..7cfd","error":"asymmetric key error: failed to construct Ed25519 signature from []"},"target":"casper_node::components::consensus::era_supervisor"}
Nov  5 17:00:13 casper-node casper-node[7048]: {"timestamp":"Nov 05 17:00:13.754","level":"ERROR","fields":{"message":"invalid incoming message to consensus instance","sender":"da45..ead8","error":"asymmetric key error: failed to construct Ed25519 signature from []"},"target":"casper_node::components::consensus::era_supervisor"}
Nov  5 17:00:13 casper-node casper-node[7048]: {"timestamp":"Nov 05 17:00:13.754","level":"ERROR","fields":{"message":"invalid incoming message to consensus instance","sender":"ff54..4652","error":"asymmetric key error: failed to construct Ed25519 signature from []"},"target":"casper_node::components::consensus::era_supervisor"}
Nov  5 17:00:13 casper-node casper-node[7048]: {"timestamp":"Nov 05 17:00:13.755","level":"ERROR","fields":{"message":"invalid incoming message to consensus instance","sender":"1be7..3688","error":"asymmetric key error: failed to construct Ed25519 signature from []"},"target":"casper_node::components::consensus::era_supervisor"}
Nov  5 17:00:13 casper-node casper-node[7048]: {"timestamp":"Nov 05 17:00:13.764","level":"ERROR","fields":{"message":"invalid incoming message to consensus instance","sender":"cae0..79ee","error":"asymmetric key error: failed to construct Ed25519 signature from []"},"target":"casper_node::components::consensus::era_supervisor"}
Nov  5 17:00:13 casper-node casper-node[7048]: {"timestamp":"Nov 05 17:00:13.775","level":"ERROR","fields":{"message":"invalid incoming message to consensus instance","sender":"fc9b..46a9","error":"asymmetric key error: failed to construct Ed25519 signature from []"},"target":"casper_node::components::consensus::era_supervisor"}

Which ended in a panic of this form:

Nov  5 18:06:21 casper-node casper-node[24645]:    0: casper_node::panic_hook
Nov  5 18:06:21 casper-node casper-node[24645]:    1: core::ops::function::Fn::call
Nov  5 18:06:21 casper-node casper-node[24645]:    2: std::panicking::rust_panic_with_hook
Nov  5 18:06:21 casper-node casper-node[24645]:              at rustc/0cd7ff7ddfb75a38dca81ad3e76b1e984129e939/src/libstd/panicking.rs:530:17
Nov  5 18:06:21 casper-node casper-node[24645]:    3: rust_begin_unwind
Nov  5 18:06:21 casper-node casper-node[24645]:              at rustc/0cd7ff7ddfb75a38dca81ad3e76b1e984129e939/src/libstd/panicking.rs:437:5
Nov  5 18:06:21 casper-node casper-node[24645]:    4: std::panicking::begin_panic_fmt
Nov  5 18:06:21 casper-node casper-node[24645]:              at rustc/0cd7ff7ddfb75a38dca81ad3e76b1e984129e939/src/libstd/panicking.rs:391:5
Nov  5 18:06:21 casper-node casper-node[24645]:    5: casper_node::effect::EffectBuilder<REv>::fatal::{{closure}}
Nov  5 18:06:21 casper-node casper-node[24645]:    6: <futures_util::future::future::Map<Fut,F> as core::future::future::Future>::poll
Nov  5 18:06:21 casper-node casper-node[24645]:    7: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov  5 18:06:21 casper-node casper-node[24645]:    8: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov  5 18:06:21 casper-node casper-node[24645]:    9: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
Nov  5 18:06:21 casper-node casper-node[24645]:   10: tokio::runtime::task::raw::poll
Nov  5 18:06:21 casper-node casper-node[24645]:   11: std::thread::local::LocalKey<T>::with
Nov  5 18:06:21 casper-node casper-node[24645]:   12: tokio::runtime::thread_pool::worker::Context::run_task
Nov  5 18:06:21 casper-node casper-node[24645]:   13: tokio::runtime::thread_pool::worker::Context::run
Nov  5 18:06:21 casper-node casper-node[24645]:   14: tokio::macros::scoped_tls::ScopedKey<T>::set
Nov  5 18:06:21 casper-node casper-node[24645]:   15: tokio::runtime::thread_pool::worker::run
Nov  5 18:06:21 casper-node casper-node[24645]:   16: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
Nov  5 18:06:21 casper-node casper-node[24645]:   17: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
Nov  5 18:06:21 casper-node casper-node[24645]:   18: tokio::runtime::task::harness::Harness<T,S>::poll
Nov  5 18:06:21 casper-node casper-node[24645]:   19: tokio::runtime::blocking::pool::Inner::run
Nov  5 18:06:21 casper-node casper-node[24645]:   20: tokio::runtime::context::enter
Nov  5 18:06:21 casper-node casper-node[24645]:   21: tokio::runtime::handle::Handle::enter
Nov  5 18:06:21 casper-node casper-node[24645]:   22: std::sys_common::backtrace::__rust_begin_short_backtrace
Nov  5 18:06:21 casper-node casper-node[24645]:   23: core::ops::function::FnOnce::call_once{{vtable.shim}}
Nov  5 18:06:21 casper-node casper-node[24645]:   24: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once
Nov  5 18:06:21 casper-node casper-node[24645]:              at rustc/0cd7ff7ddfb75a38dca81ad3e76b1e984129e939/src/liballoc/boxed.rs:1081:9
Nov  5 18:06:21 casper-node casper-node[24645]:       <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once
Nov  5 18:06:21 casper-node casper-node[24645]:              at rustc/0cd7ff7ddfb75a38dca81ad3e76b1e984129e939/src/liballoc/boxed.rs:1081:9
Nov  5 18:06:21 casper-node casper-node[24645]:       std::sys::unix::thread::Thread::new::thread_start
Nov  5 18:06:21 casper-node casper-node[24645]:              at rustc/0cd7ff7ddfb75a38dca81ad3e76b1e984129e939/src/libstd/sys/unix/thread.rs:87:17
Nov  5 18:06:21 casper-node casper-node[24645]:   25: start_thread
Nov  5 18:06:21 casper-node casper-node[24645]:   26: clone
Nov  5 18:06:21 casper-node casper-node[24645]: panicked at 'fatal error [node/src/components/consensus/era_supervisor.rs:724]: Consensus shutting down due to inability to participate in the network', node/src/effect.rs:366:9
Fraser999 commented 4 years ago

Some further info:

The initial batch of log lines hint at connections being rejected by the peer.

Grepping the Casper nodes' logs for TLS handshake failed yielded the following on node N0 only:

Nov  5 20:04:41 ec2-uw1-1a-p-testnet-rustN0 casper-node[13971]: {"timestamp":"Nov 05 20:04:41.993","level":"WARN","fields":{"message":"ff54..4652: TLS handshake failed","peer_address":"34.72.74.228:45346","err":"handshake error: the handshake failed: error:1408F09C:SSL routines:ssl3_get_record:http request:ssl/record/ssl3_record.c:322:"},"target":"casper_node::components::small_network"}
Nov  5 20:07:11 ec2-uw1-1a-p-testnet-rustN0 casper-node[13971]: {"timestamp":"Nov 05 20:07:11.116","level":"WARN","fields":{"message":"ff54..4652: TLS handshake failed","peer_address":"136.37.195.206:54027","err":"handshake error: the handshake failed: error:1408F09C:SSL routines:ssl3_get_record:http request:ssl/record/ssl3_record.c:322:"},"target":"casper_node::components::small_network"}

Running curl http://34.72.74.228:7777/status | jq shows that this node has now been able to connect successfully to N0 (I can see "ff54..4652": "54.177.84.9:34553" in the list of peers), so the failure was transient.


On each of N0, N1 and N2, we also have the following lines:

Nov  5 17:25:29 ec2-uw1-1a-p-testnet-rustN0 casper-node[13971]: {"timestamp":"Nov 05 17:25:29.224","level":"WARN","fields":{"message":"ff54..4652: receiving message failed, closing connection","err":"Connection reset by peer (os error 104)","peer_id":"9be8..2d7b"},"target":"casper_node::components::small_network"}
Nov  5 17:25:29 ec2-uw1-1a-p-testnet-rustN0 casper-node[13971]: {"timestamp":"Nov 05 17:25:29.224","level":"INFO","fields":{"message":"ff54..4652: connection closed","peer_id":"9be8..2d7b","peer_address":"34.72.74.228:43776"},"target":"casper_node::components::small_network"}
Nov  5 17:25:29 ec2-uw1-1a-p-testnet-rustN0 casper-node[13971]: {"timestamp":"Nov 05 17:25:29.224","level":"WARN","fields":{"message":"ff54..4652: outgoing connection closed","peer_id":"9be8..2d7b","peer_address":"34.72.74.228:34553"},"target":"casper_node::components::small_network"}

All three report the same messages with identical timestamps. This could tie in with 9be8..2d7b panicking, however the panic reported in the OP is timestamped 18:06:21 whereas our nodes report this at 17:25:29.

MParlikar commented 3 years ago

Transient issue, closing