prysmaticlabs / prysm

Go implementation of Ethereum proof of stake
https://www.offchainlabs.com
GNU General Public License v3.0
3.46k stars 998 forks source link

Validator process exits when "Waiting for beacon chain start log from the ETH 1.0 deposit contract" #8369

Closed augustohalvarez closed 3 years ago

augustohalvarez commented 3 years ago

🐞 Bug Report

Description

With Geth node and beacon chain fully synced and running, starting the validator process exits immediately after the log "Waiting for beacon chain start log from the ETH 1.0 deposit contract"

Has this worked before in a previous version?

Unclear. Same commands worked on Pyrmont testnet, though this is the first time running on mainnet ## 🔬 Minimal Reproduction

successfully imported validator_keys into the prysm wallet. The accounts list shows just one, and the pub key matches what I'm seeing in beaconcha.in (btw, validator is waiting to be activated) https://beaconcha.in/validator/9075fe0036a41dc6694e4ab0526b9e4fb8620e811135a07d2440f529c62d163c0758f0ae56e5711b1347d9f7367213b0#deposits

Geth was started as a systemd service:

[Unit]
Description     = geth eth1 service
Wants           = network-online.target
After           = network-online.target

[Service]
User            = augusto
ExecStart       = /usr/bin/geth --http --metrics --pprof
Restart         = on-failure
RestartSec      = 3

[Install]
WantedBy    = multi-user.target

The Geth javascript consle shows eth.syncing as false and the service seems to be running okay. Here's a snippet of the logs when restarting the service: eth1.log

Beacon chain started as systemd service:

[Unit]
Description     = eth2 beacon chain service
Wants           = network-online.target
After           = network-online.target

[Service]
Type            = simple
User            = augusto
ExecStart       = /home/augusto/prysm/prysm.sh beacon-chain --mainnet --p2p-max-peers=75 --monitoring-host="0.0.0.0" --http-web3provider=http://127.0.0.1:8545 --accept-terms-of-use
Restart         = on-failure

[Install]
WantedBy    = multi-user.target

Appears to be fully synced and working as expected from looking at the logs. Here's a snippet of restarting the service: beacon-chain.log

Validator is also a systemd service:

[Unit]
Description     = eth2 validator service
Wants           = network-online.target beacon-chain.service
After           = network-online.target

[Service]
User            = augusto
ExecStart       = /home/augusto/prysm/prysm.sh validator --mainnet --graffiti "Me and my franklins here, we take off!" --accept-terms-of-use --wallet-password-file /home/augusto/.eth2validators/validators-password.txt --verbosity=trace
Restart         = no

[Install]
WantedBy= multi-user.target

All three services started the same way systemctl start {service name} With both the Geth node and beacon chain synced and running, starting the validator gives the following output:

🔥 Error




Jan 30 09:47:21 augusto-NUC8i5BEK systemd[1]: Started eth2 validator service.
Jan 30 09:47:22 augusto-NUC8i5BEK prysm.sh[13100]: Latest Prysm version is v1.1.0.
Jan 30 09:47:22 augusto-NUC8i5BEK prysm.sh[13100]: Validator is up to date.
Jan 30 09:47:22 augusto-NUC8i5BEK prysm.sh[13100]: Verifying binary integrity.
Jan 30 09:47:22 augusto-NUC8i5BEK prysm.sh[13100]: validator-v1.1.0-linux-amd64: OK
Jan 30 09:47:22 augusto-NUC8i5BEK prysm.sh[13100]: gpg: Signature made Mon 18 Jan 2021 01:04:00 PM PST using RSA key ID F1A5036E
Jan 30 09:47:22 augusto-NUC8i5BEK prysm.sh[13100]: gpg: Good signature from "Preston Van Loon "
Jan 30 09:47:22 augusto-NUC8i5BEK prysm.sh[13100]: gpg: WARNING: This key is not certified with a trusted signature!
Jan 30 09:47:22 augusto-NUC8i5BEK prysm.sh[13100]: gpg:          There is no indication that the signature belongs to the owner.
Jan 30 09:47:22 augusto-NUC8i5BEK prysm.sh[13100]: Primary key fingerprint: 0AE0 051D 647B A3C1 A917  AF40 72E3 3E4D F1A5 036E
Jan 30 09:47:22 augusto-NUC8i5BEK prysm.sh[13100]: Verified /home/augusto/prysm/dist/validator-v1.1.0-linux-amd64 has been signed by Prysmatic Labs.
Jan 30 09:47:22 augusto-NUC8i5BEK prysm.sh[13100]: Starting Prysm validator --mainnet --graffiti Me and my franklins here, we take off! --accept-terms-of-use --wallet-password-file /home/augusto/.eth2validators/validators-password.txt --verbosity=trace
Jan 30 09:47:23 augusto-NUC8i5BEK prysm.sh[13100]: time="2021-01-30 09:47:23" level=warning msg="Running on ETH2 Mainnet" prefix=flags
Jan 30 09:47:23 augusto-NUC8i5BEK prysm.sh[13100]: time="2021-01-30 09:47:23" level=info msg="(wallet path) /home/augusto/.eth2validators/prysm-wallet-v2" prefix=prompt
Jan 30 09:47:23 augusto-NUC8i5BEK prysm.sh[13100]: time="2021-01-30 09:47:23" level=info msg="Opened validator wallet" keymanager-kind=direct prefix=node wallet="/home/augusto/.eth2validators/prysm-wallet-v2/direct"
Jan 30 09:47:23 augusto-NUC8i5BEK prysm.sh[13100]: time="2021-01-30 09:47:23" level=info msg="Checking DB" databasePath="/home/augusto/.eth2validators/prysm-wallet-v2/direct" prefix=node
Jan 30 09:47:23 augusto-NUC8i5BEK prysm.sh[13100]: time="2021-01-30 09:47:23" level=info msg="Starting validator node" prefix=node version="Prysm/v1.1.0/9b367b36fc12ecf565ad649209aa2b5bba8c7797. Built at: 2021-01-18 19:47:14+00:00"
Jan 30 09:47:23 augusto-NUC8i5BEK prysm.sh[13100]: time="2021-01-30 09:47:23" level=debug msg="Starting 2 services: [*prometheus.Service *client.ValidatorService]" prefix=registry
Jan 30 09:47:23 augusto-NUC8i5BEK prysm.sh[13100]: time="2021-01-30 09:47:23" level=debug msg="Starting service type *prometheus.Service" prefix=registry
Jan 30 09:47:23 augusto-NUC8i5BEK prysm.sh[13100]: time="2021-01-30 09:47:23" level=debug msg="Starting service type *client.ValidatorService" prefix=registry
Jan 30 09:47:23 augusto-NUC8i5BEK prysm.sh[13100]: time="2021-01-30 09:47:23" level=warning msg="You are using an insecure gRPC connection. If you are running your beacon node and validator on the same machines, you can ignore this message. If you want to know how to enable secure connections, see: https://docs.prylabs.network/docs/prysm-usage/secure-grpc" prefix=validator
Jan 30 09:47:23 augusto-NUC8i5BEK prysm.sh[13100]: time="2021-01-30 09:47:23" level=debug msg="Starting prometheus service" address="127.0.0.1:8081" prefix=prometheus
Jan 30 09:47:23 augusto-NUC8i5BEK prysm.sh[13100]: time="2021-01-30 09:47:23" level=info msg="Validating for public key" prefix=validator publicKey=0x9075fe0036a4
Jan 30 09:47:23 augusto-NUC8i5BEK prysm.sh[13100]: time="2021-01-30 09:47:23" level=debug msg="gRPC stream started." backend=[] method="/ethereum.eth.v1alpha1.BeaconNodeValidator/WaitForChainStart"
Jan 30 09:47:23 augusto-NUC8i5BEK prysm.sh[13100]: time="2021-01-30 09:47:23" level=info msg="Waiting for beacon chain start log from the ETH 1.0 deposit contract" prefix=validator
Jan 30 09:47:23 augusto-NUC8i5BEK systemd[1]: validator.service: Main process exited, code=exited, status=1/FAILURE
Jan 30 09:47:23 augusto-NUC8i5BEK systemd[1]: validator.service: Unit entered failed state.
Jan 30 09:47:23 augusto-NUC8i5BEK systemd[1]: validator.service: Failed with result 'exit-code'.

🌍 Your Environment

Operating System:

  
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.7 LTS
Release:    16.04
Codename:   xenial
  

What version of Prysm are you running? (Which release)

  
beacon-chain version Prysm/v1.1.0/9b367b36fc12ecf565ad649209aa2b5bba8c7797. Built at: 2021-01-18 19:47:14+00:00
validator version Prysm/v1.1.0/9b367b36fc12ecf565ad649209aa2b5bba8c7797. Built at: 2021-01-18 19:47:14+00:00
  

Anything else relevant (validator index / public key)? Validator public key: 0x9075fe0036a41dc6694e4ab0526b9e4fb8620e811135a07d2440f529c62d163c0758f0ae56e5711b1347d9f7367213b0 Also shown in link to beaconcha.in with some more info.

Some things to note: I did encounter a bit of weirdness when transitioning off the Pyrmont testnet. I had everything running smoothly for a bit so I decided to perform a voluntary exit before starting mainnet. That was a success so I went ahead and created new validator keys with the deposit cli and imported into my prysm wallet. When running validator accounts list --mainnet, it showed two accounts, the one I used for testnet and the newly imported account. I thought maybe I had left out the pyrmont flag when importing the key (still unsure), so I went ahead and deleted that account, leaving only the new mainnet one. Not sure if this did anything but thought I should mention.

I have tried a full resync of the beacon chain by deleting the data directory and restarting the service but I'm still seeing the same error.

I have been running each service in separate tmux sessions but I've tried running them in separate terminal windows and still seeing the same error.

Please let me know if you need any more info. I realize this may be hard to reproduce so any help or further direction on how to debug would be immensely appreciated. Thanks for all the hard work and maintaining the discord channel, it's been incredibly helpful so far!

nisdas commented 3 years ago

@augustohalvarez You can try deleting your validator db either manually or with --clear-dbin your validator client. I am assuming the fact that you are running the same validator directory for both the testnet/mainnet might have caused issues here. Let me know if after that your validator is back up and running.

augustohalvarez commented 3 years ago

Thanks so much @nisdas, looks like it's working as expected since I'm seeing this in the logs now: Feb 01 13:42:54 augusto-NUC8i5BEK prysm.sh[10632]: time="2021-02-01 13:42:54" level=info msg="Deposit processed, entering activation queue after finalization" index=88277 positionInActivationQueue=11227 prefix=validator pubKey=0x9075fe0036a4 status=DEPOSITED

Really appreciate the suggestion! Going to close the issue now.