Open GeoJesse opened 4 years ago
This might be related to #1488
I'm not sure if it's related or not but it appears that it's only an issue if you haven't already fully sync'd. My beacon node just finished syncing so I tried rebooting the server again. This time it picked up with no problem. So it doesn't look like an issue for me anymore but it could be a bug for people who are trying to sync the first time.
After Jesse showed me this, I tested it out on my own Ubuntu Desktop machine (Lighthouse v0.2.6-ebb25b55
) and was able to reproduce the issue.
Here's where I stop the beacon node while it is syncing:
Aug 21 21:24:50 jon-desktop lighthouse[6885]: Aug 21 21:24:50.000 INFO Syncing est_time: 1 hr 55 mins, speed: 17.98 slots/sec, distance: 125132 slots (2 weeks 3 days), peers: 2, service: slot_notifier
Aug 21 21:25:02 jon-desktop lighthouse[6885]: Aug 21 21:25:02.000 INFO Syncing est_time: 1 hr 55 mins, speed: 18.00 slots/sec, distance: 124940 slots (2 weeks 3 days), peers: 2, service: slot_notifier
Aug 21 21:25:14 jon-desktop lighthouse[6885]: Aug 21 21:25:14.000 INFO Syncing est_time: 1 hr 37 mins, speed: 21.33 slots/sec, distance: 124685 slots (2 weeks 3 days), peers: 4, service: slot_notifier
Aug 21 21:25:22 jon-desktop systemd[1]: Stopping Lighthouse Beacon Node...
Aug 21 21:25:22 jon-desktop lighthouse[6885]: Aug 21 21:25:22.463 INFO Shutting down..
Aug 21 21:25:22 jon-desktop lighthouse[6885]: Aug 21 21:25:22.464 INFO HTTP service shutdown service: http
Aug 21 21:25:22 jon-desktop lighthouse[6885]: Aug 21 21:25:22.466 INFO Saved DHT state service: network
Aug 21 21:25:22 jon-desktop lighthouse[6885]: Aug 21 21:25:22.466 INFO Network service shutdown service: network
Aug 21 21:25:24 jon-desktop systemd[1]: lighthousebeacon.service: Succeeded.
Aug 21 21:25:24 jon-desktop systemd[1]: Stopped Lighthouse Beacon Node.
Here's the log after starting again:
Aug 21 21:25:49 jon-desktop systemd[1]: Started Lighthouse Beacon Node.
Aug 21 21:25:49 jon-desktop lighthouse[6965]: Aug 21 21:25:49.894 WARN Ethereum 2.0 is pre-release. This software is experimental.
Aug 21 21:25:49 jon-desktop lighthouse[6965]: Aug 21 21:25:49.894 INFO Lighthouse started version: Lighthouse/v0.2.6-ebb25b55
Aug 21 21:25:49 jon-desktop lighthouse[6965]: Aug 21 21:25:49.894 INFO Configured for testnet name: medalla
Aug 21 21:25:49 jon-desktop lighthouse[6965]: Aug 21 21:25:49.894 INFO Data directory initialised datadir: /eth/data/lighthouse/beacon-node
Aug 21 21:25:50 jon-desktop lighthouse[6965]: Aug 21 21:25:50.323 INFO Starting beacon chain method: resume, service: beacon
Aug 21 21:26:12 jon-desktop lighthouse[6965]: Aug 21 21:26:12.353 INFO Block production enabled method: json rpc via http, endpoint: http://127.0.0.1:8545
Aug 21 21:26:12 jon-desktop lighthouse[6965]: Aug 21 21:26:12.418 INFO Beacon chain initialized head_slot: 1631, head_block: 0x3465…01f0, head_state: 0xbae2…b4f4, service: beacon
Aug 21 21:26:12 jon-desktop lighthouse[6965]: Aug 21 21:26:12.419 INFO Timer service started service: node_timer
Aug 21 21:26:12 jon-desktop lighthouse[6965]: Aug 21 21:26:12.420 INFO Libp2p Service peer_id: 16Uiu2HAm4wqR5HtP7AB86gDqRoxjeBuF9dEMs6LALBMtQAgefXPG, service: libp2p
Aug 21 21:26:12 jon-desktop lighthouse[6965]: Aug 21 21:26:12.420 INFO ENR Initialised tcp: Some(9000), udp: None, ip: None, id: 0xf574..a8ea, seq: 1, enr: enr:-KO4QJsGUZ8lsu4zly8XfW17mS_i-Q0SmDOLXgoMu9jy2_irZeOUboYO54DsVphNwzU_KryQXU-Z97o5ZE7V4W_UsmoBh2F0dG5ldHOIAAAAAAAAAACEZXRoMpDnp11aAAAAAf__________gmlkgnY0iXNlY3AyNTZrMaECjXNGmq3lWJ6tpKhEvKSf8cgGOHP4tIfpxfphhzpUNKODdGNwgiMo, service: libp2p
Aug 21 21:26:12 jon-desktop lighthouse[6965]: Aug 21 21:26:12.421 INFO Listening established address: /ip4/0.0.0.0/tcp/9000/p2p/16Uiu2HAm4wqR5HtP7AB86gDqRoxjeBuF9dEMs6LALBMtQAgefXPG, service: libp2p
Aug 21 21:26:12 jon-desktop lighthouse[6965]: Aug 21 21:26:12.422 INFO Subscribed to topics topics: [BeaconBlock, BeaconAggregateAndProof, VoluntaryExit, ProposerSlashing, AttesterSlashing], service: libp2p
Aug 21 21:26:12 jon-desktop lighthouse[6965]: Aug 21 21:26:12.464 INFO HTTP API started port: 5052, address: 127.0.0.1, service: http
Aug 21 21:26:26 jon-desktop lighthouse[6965]: Aug 21 21:26:26.001 WARN Low peer count peer_count: 0, service: slot_notifier
Aug 21 21:26:26 jon-desktop lighthouse[6965]: Aug 21 21:26:26.001 INFO Searching for peers current_slot: 126131, head_slot: 1631, finalized_epoch: 48, finalized_root: 0x952a…39bb, peers: 0, service: slot_notifier
Aug 21 21:26:37 jon-desktop lighthouse[6965]: Aug 21 21:26:37.729 INFO Sync state updated new_state: Syncing Finalized Chain, old_state: Stalled, service: sync
Aug 21 21:26:38 jon-desktop lighthouse[6965]: Aug 21 21:26:38.001 INFO Syncing est_time: --, distance: 124501 slots (2 weeks 3 days), peers: 4, service: slot_notifier
Aug 21 21:26:47 jon-desktop lighthouse[6965]: Aug 21 21:26:47.836 WARN Batch failed to download. Dropping chain scoring peers, batch_epoch: 48, score_adjustment: Low Tolerance Error, chain: 1582718186899813585, service: sync
Aug 21 21:26:50 jon-desktop lighthouse[6965]: Aug 21 21:26:50.001 INFO Syncing est_time: --, distance: 124502 slots (2 weeks 3 days), peers: 5, service: slot_notifier
Aug 21 21:27:02 jon-desktop lighthouse[6965]: Aug 21 21:27:02.002 INFO Syncing est_time: --, distance: 124503 slots (2 weeks 3 days), peers: 5, service: slot_notifier
Aug 21 21:27:14 jon-desktop lighthouse[6965]: Aug 21 21:27:14.001 INFO Syncing est_time: --, distance: 124504 slots (2 weeks 3 days), peers: 5, service: slot_notifier
Aug 21 21:27:26 jon-desktop lighthouse[6965]: Aug 21 21:27:26.001 INFO Syncing est_time: --, distance: 124505 slots (2 weeks 3 days), peers: 5, service: slot_notifier
Aug 21 21:27:28 jon-desktop lighthouse[6965]: Aug 21 21:27:28.376 WARN Batch failed to download. Dropping chain scoring peers, batch_epoch: 48, score_adjustment: Low Tolerance Error, chain: 16513122307734969754, service: sync
Notice that after restarting the beacon, the process will remain as "Syncing" with est_time: --
The only fix we've been able to use is to clear the beacon
directory and restart. Restarting after it has reached the "Synced" state works fine.
Having the same issue.
Aug 24 10:20:50.000 INFO Syncing est_time: --, distance: 37488 slots (5 days 4 hrs), peers: 41, service: slot_notifier
Aug 24 10:21:02.000 INFO Syncing est_time: --, distance: 37489 slots (5 days 4 hrs), peers: 43, service: slot_notifier
Aug 24 10:21:14.007 INFO Syncing est_time: --, distance: 37490 slots (5 days 4 hrs), peers: 43, service: slot_notifier
Aug 24 10:21:26.000 INFO Syncing est_time: --, distance: 37491 slots (5 days 4 hrs), peers: 42, service: slot_notifier
Aug 24 10:21:38.001 INFO Syncing est_time: --, distance: 37492 slots (5 days 4 hrs), peers: 42, service: slot_notifier
Aug 24 10:21:50.001 INFO Syncing est_time: --, distance: 37493 slots (5 days 4 hrs), peers: 43, service: slot_notifier
Aug 24 10:22:02.001 INFO Syncing est_time: --, distance: 37494 slots (5 days 4 hrs), peers: 41, service: slot_notifier
Aug 24 10:22:14.000 INFO Syncing est_time: --, distance: 37495 slots (5 days 4 hrs), peers: 40, service: slot_notifier
Aug 24 10:22:26.000 INFO Syncing est_time: --, distance: 37496 slots (5 days 4 hrs), peers: 41, service: slot_notifier
Aug 24 10:22:38.001 INFO Syncing est_time: --, distance: 37497 slots (5 days 4 hrs), peers: 40, service: slot_notifier
Aug 24 10:22:50.001 INFO Syncing est_time: --, distance: 37498 slots (5 days 4 hrs), peers: 37, service: slot_notifier
Aug 24 10:23:02.001 INFO Syncing est_time: --, distance: 37499 slots (5 days 4 hrs), peers: 37, service: slot_notifier
Me too. My DB was 105 GB in size, so I installed v0.2.7
and ran --purge-db
and now I am stuck in this cycle.
Update: I ran --purge-db
again and for now it is working.
I was following directions for docker on the official docs and ran into this issue.
Based on reading the above, I was able to fix it by:
geth
to fully syncdocker-compose down
mv lighthouse-data/beacon lighthouse-data/beacon_old
(just to be safe)docker-compose up -d
Now syncing successfully. Have not tried/dared to stop it.
There is also a new sync PR coming which should address most issues related to the medalla large period of non-finality.
It seems this is still happening, lighthouse getting stuck after a restart when using genesis sync
@icculp do you have any logs you could send to help diagnose this?
Description
I followed the tutorial here for staking ethereum with the exception of using a Raspberry Pi
Hardware: Raspberry Pi Model B Samsung External SSD T5 (lighthouse and geth db's stored here) OS: Ubuntu v20.04 (LTS) x64 ARM Software: Geth 1.9.19-stable-3e064192 ARM 64 Lighthouse v0.2.6-2bc9115; BLS Library: blst
Version
Lighthouse v0.2.6-2bc9115; BLS Library: blst rustc 1.45.2 (d3fb005a3 2020-07-31)
Please provide your Lighthouse and Rust version. Are you building from
master
, which commit? Pulled from 'master'Present Behaviour
Whenever there is a reboot the syncing stops and gets stuck on that last slot. The services all get restarted but the beacon service never picks up where it left off and 'est-time' in the logs is '--'.
Expected Behaviour
After rebooting I would expect the beacon service to take over where it left off.
Steps to resolve
Please describe the steps required to resolve this issue, if known. The only way to get it up and running again is to stop the beacon sewrvice, purge the beacon database, then restart. However this causes it to start all over from the beginning.