Open nagisa opened 1 year ago
As I see it, the problem is that it's hard for neard to know that there actually is a problem, because what if I actually meant to use that account ID? If the network was initialized with 2 validators, node0
and node1
, and I start a node with a validator key that shows "account_id": "nodee0"
, it's clear that I just made a mistake. But how can neard generate a warning saying something like WARN: you made a typo in your validator key file
, when I might actually have meant to use that account? Maybe what we could do is amend the Starting validator node
message to be more like:
2023-03-21T17:39:38.134516Z INFO client: Starting validator node: {validator's account_id}. This account is not a current block or chunk producer.
In my case the account didn’t exist at all anywhere in the network, and I’d imagine there are plenty of other failure modes – correct account_id but an invalid public key for example; or public/private key pair being invalid. My intuition is that cases like these should be straightforward to check even against just local rocksdb copy (it stores the list of validators, doesn't it?). In this case false positives are probably fine as well – the only outcome is a log message inviting the operator to double check.
That said all this might be overcomplicating the solution. We have Validator |
in the stats
message. Would adding NOT Validator |
if the node has validator_key.json
set up but isn’t validating be sufficient? Possible reasons for why the message shows up like that could be documented separately at that point, we could write a troubleshooting guide at that point or whatever.
near state $ACCOUNT_ID
;
When setting up a mainnet fork, I accidentally put a wrong
account_id
of the validator intovalidator_key.json
. While the periodicstats
messages did indicate that this node is not operating as a validator, there was no other indication, error or warning explaining what might have gone wrong.We do have a message like this on startup:
neard also clearly knows that it is being a validator or not (as evidenced by it being able to put the
Validator |
snippet in the periodictarget: stats
event). We should be able to figure out that something has gone wrong in this case and output at least a warning message (ideally explaining why.)