NethermindEth / sedge

A one-click setup tool for PoS network/chain validators and nodes.
https://docs.sedge.nethermind.io
Apache License 2.0

Error fetching finalized block #202

Open critesjosh opened 1 year ago

critesjosh commented 1 year ago

Describe the bug

Error fetching the finalized block from the remote when using checkpoint.gnosischain.com as the checkpoint sync endpoint.

execution-client  | 2022-12-27 16:44:10.9936|Waiting for peers... 43s 
execution-client  | 2022-12-27 16:44:11.9929|Waiting for peers... 44s 
consensus-client  | Dec 27 16:44:12.244 CRIT Failed to start beacon node             reason: Error fetching finalized block from remote: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("checkpoint.gnosischain.com")), port: None, path: "/eth/v2/beacon/blocks/finalized", query: None, fragment: None }, source: TimedOut })
consensus-client  | Dec 27 16:44:12.244 INFO Internal shutdown received              reason: Failed to start beacon node
consensus-client  | Dec 27 16:44:12.244 INFO Shutting down..                         reason: Failure("Failed to start beacon node")
consensus-client  | Failed to start beacon node
consensus-client exited with code 1
execution-client  | 2022-12-27 16:44:12.9938|Waiting for peers... 45s 
consensus-client  | Dec 27 16:42:12.195 INFO Logging to file                         path: "/var/lib/lighthouse/beacon/logs/beacon.log"
consensus-client  | Dec 27 16:42:12.195 INFO Lighthouse started                      version: Lighthouse/v3.3.0-bf533c8
consensus-client  | Dec 27 16:42:12.195 INFO Configured for network                  name: gnosis
consensus-client  | Dec 27 16:42:12.195 INFO Data directory initialised              datadir: /var/lib/lighthouse
consensus-client  | Dec 27 16:42:12.196 WARN Ignoring --eth1-endpoints flag          info: the value for --execution-endpoint will be used instead. --eth1-endpoints has been deprecated for post-merge configurations
consensus-client  | Dec 27 16:42:12.196 INFO Deposit contract                        address: 0x0b98057ea310f4d31f2a452b414647007d1645d9, deploy_block: 19469077
consensus-client  | Dec 27 16:42:12.225 INFO Starting checkpoint sync                remote_url: https://checkpoint.gnosischain.com/, service: beacon
consensus-client  | Dec 27 16:43:12.236 WARN Remote BN does not support EIP-4881 fast deposit sync, error: Error fetching deposit snapshot from remote: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("checkpoint.gnosischain.com")), port: None, path: "/eth/v1/beacon/deposit_snapshot", query: None, fragment: None }, source: TimedOut }), service: beacon
consensus-client  | Dec 27 16:44:12.244 CRIT Failed to start beacon node             reason: Error fetching finalized block from remote: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("checkpoint.gnosischain.com")), port: None, path: "/eth/v2/beacon/blocks/finalized", query: None, fragment: None }, source: TimedOut })
consensus-client  | Dec 27 16:44:12.244 INFO Internal shutdown received              reason: Failed to start beacon node
consensus-client  | Dec 27 16:44:12.244 INFO Shutting down..                         reason: Failure("Failed to start beacon node")
consensus-client  | Failed to start beacon node
consensus-client  | Dec 27 16:44:13.036 INFO Logging to file                         path: "/var/lib/lighthouse/beacon/logs/beacon.log"
consensus-client  | Dec 27 16:44:13.037 INFO Lighthouse started                      version: Lighthouse/v3.3.0-bf533c8
consensus-client  | Dec 27 16:44:13.037 INFO Configured for network                  name: gnosis
consensus-client  | Dec 27 16:44:13.037 INFO Data directory initialised              datadir: /var/lib/lighthouse
consensus-client  | Dec 27 16:44:13.037 WARN Ignoring --eth1-endpoints flag          info: the value for --execution-endpoint will be used instead. --eth1-endpoints has been deprecated for post-merge configurations
consensus-client  | Dec 27 16:44:13.038 INFO Deposit contract                        address: 0x0b98057ea310f4d31f2a452b414647007d1645d9, deploy_block: 19469077
consensus-client  | Dec 27 16:44:13.085 INFO Starting checkpoint sync                remote_url: https://checkpoint.gnosischain.com/, service: beacon
execution-client  | 2022-12-27 16:44:13.9938|Peers | with known best block: 0 | all: 0 | 

To Reproduce

Steps to reproduce the behavior:

  1. Set up sedge with the following .env:
    # --- Global configuration ---
    EL_NETWORK=xdai
    CL_NETWORK=gnosis
    # --- Execution Layer - Execution Node - configuration ---
    EC_IMAGE_VERSION=nethermind/nethermind:1.14.7
    NETHERMIND_LOG_LEVEL=INFO
    EC_ENABLED_MODULES=[Web3,Eth,Subscribe,Net,]
    EC_NODENAME=Nethermind
    NETHERMIND_METRICS_PUSH_GATEWAY_URL=http://localhost:9090/metrics
    NETHERMIND_PRUNING_CACHEMB=2048
    EC_DATA_DIR="/media/Extreme SSD/gnosis-data/execution-data"
    EC_SNAP_SYNC_ENABLED=false
    EC_JWT_SECRET_PATH=/home/user/gnosis-validator/docker-compose-scripts/jwtsecret
    # --- Consensus Layer - Beacon Node - configuration ---
    CC_PEER_COUNT=50
    CC_LOG_LEVEL=info
    EC_API_URL=http://execution:8545
    EC_AUTH_URL=http://execution:8551
    CC_INSTANCE_NAME=Lighthouse
    CC_IMAGE_VERSION=sigp/lighthouse:v3.3.0
    CC_DATA_DIR="/media/Extreme SSD/gnosis-data/consensus-data"
    CC_JWT_SECRET_PATH=/home/user/gnosis-validator/docker-compose-scripts/jwtsecret
    CL_FEE_RECIPIENT=0x7D678b9218aC289e0C9F18c82F546c988BfE3022
    CHECKPOINT_SYNC_URL=https://checkpoint.gnosischain.com
    CL_BOOTNODES="enr:-Iq4QMCTfIMXnow27baRUb35Q8iiFHSIDBJh6hQM5Axohhf4b6Kr_cOCu0htQ5WvVqKvFgY28893DHAg8gnBAXsAVqmGAX53x8JggmlkgnY0gmlwhLKAlv6Jc2VjcDI1NmsxoQK6S-Cii_KmfFdUJL2TANL3ksaKUnNXvTCv1tLwXs0QgIN1ZHCCIyk,enr:-Ly4QFoZTWR8ulxGVsWydTNGdwEESueIdj-wB6UmmjUcm-AOPxnQi7wprzwcdo7-1jBW_JxELlUKJdJES8TDsbl1EdNlh2F0dG5ldHOI__78_v2bsV-EZXRoMpA2-lATkAAAcf__________gmlkgnY0gmlwhBLYJjGJc2VjcDI1NmsxoQI0gujXac9rMAb48NtMqtSTyHIeNYlpjkbYpWJw46PmYYhzeW5jbmV0cw-DdGNwgiMog3VkcIIjKA,enr:-KG4QE5OIg5ThTjkzrlVF32WT_-XT14WeJtIz2zoTqLLjQhYAmJlnk4ItSoH41_2x0RX0wTFIe5GgjRzU2u7Q1fN4vADhGV0aDKQqP7o7pAAAHAyAAAAAAAAAIJpZIJ2NIJpcISlFsStiXNlY3AyNTZrMaEC-Rrd_bBZwhKpXzFCrStKp1q_HmGOewxY3KwM8ofAj_ODdGNwgiMog3VkcIIjKA,enr:-L64QC9Hhov4DhQ7mRukTOz4_jHm4DHlGL726NWH4ojH1wFgEwSin_6H95Gs6nW2fktTWbPachHJ6rUFu0iJNgA0SB2CARqHYXR0bmV0c4j__________4RldGgykDb6UBOQAABx__________-CaWSCdjSCaXCEA-2vzolzZWNwMjU2azGhA17lsUg60R776rauYMdrAz383UUgESoaHEzMkvm4K6k6iHN5bmNuZXRzD4N0Y3CCIyiDdWRwgiMo"
    # --- Consensus Layer - Validator Node - configuration ---
    CC_API_URL=http://consensus:4000
    GRAFFITI=nethermind-lighthouse
    VL_LOG_LEVEL=info
    VL_INSTANCE_NAME=LighthouseValidator
    VL_IMAGE_VERSION=sigp/lighthouse:v3.3.0
    KEYSTORE_DIR=./keystore
    VL_DATA_DIR="/media/Extereme SSD/gnosis-data/validator-data"
  2. Start the nodes: sudo docker compose -f docker-compose.yml up -d execution consensus
  3. See the errors.
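
The .env above points several *_DATA_DIR variables at host paths, so one quick sanity check before starting the containers is to confirm that each of those paths exists. The following is a minimal sketch, assuming plain KEY=value lines with optional double quotes as in the .env above; `missing_data_dirs` is a hypothetical helper and not part of sedge.

```shell
# Hypothetical helper (not part of sedge): prints every *_DATA_DIR entry in
# an .env file whose directory does not exist on the host.
# Assumes simple KEY=value or KEY="value" lines, one per line.
missing_data_dirs() {
  grep -E '^[[:space:]]*[A-Z_]+_DATA_DIR=' "$1" | while IFS= read -r line; do
    # Key is everything before the first '=', with indentation removed.
    key="$(printf '%s' "${line%%=*}" | tr -d '[:space:]')"
    # Value is everything after the first '=', with surrounding quotes removed.
    value="${line#*=}"
    value="${value#\"}"
    value="${value%\"}"
    [ -d "$value" ] || printf '%s=%s\n' "$key" "$value"
  done
}

# Usage (the file name is an assumption):
#   missing_data_dirs .env
```

An empty output would mean every listed data directory was found; any printed line names a variable whose path could not be located.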

Expected behavior

The consensus node should start syncing.

AntiD2ta commented 1 year ago

@critesjosh thanks for opening an issue. Did you try the setup again at a later time? My guess is that checkpoint.gnosischain.com was receiving a lot of traffic or was down for a while. This node is an official checkpoint sync node from Gnosis; we do not maintain it. If you still face issues with this endpoint, you can stop using a checkpoint sync endpoint:

critesjosh commented 1 year ago

Thanks, I'll try that and get back to you.

Some additional context: I followed the server configuration steps here: https://docs.gnosischain.com/node/guide/configure-server

critesjosh commented 1 year ago

Hmm, now I just see a bunch of errors from the consensus client about not being able to communicate with the EC:

consensus-client  | Dec 27 17:20:32.501 WARN Low peer count                          peer_count: 0, service: slot_notifier
consensus-client  | Dec 27 17:20:32.501 INFO Searching for peers                     current_slot: 6633658, head_slot: 0, finalized_epoch: 0, finalized_root: 0x2356…6a75, peers: 0, service: slot_notifier
consensus-client  | Dec 27 17:20:32.501 WARN Syncing deposit contract block cache    est_blocks_remaining: initializing deposits, service: slot_notifier
consensus-client  | Dec 27 17:20:33.502 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("execution")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
consensus-client  | Dec 27 17:20:33.502 ERRO Unable to get transition config         error: Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("execution")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) }, service: exec
consensus-client  | Dec 27 17:20:33.502 ERRO Not ready for merge                     hint: try updating Lighthouse and/or the execution layer, info: Could not confirm the transition configuration with the execution endpoint: "EngineError(Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: \"http\", cannot_be_a_base: false, username: \"\", password: None, host: Some(Domain(\"execution\")), port: Some(8551), path: \"/\", query: None, fragment: None }, source: TimedOut }) })", service: slot_notifier
consensus-client  | Dec 27 17:20:34.504 ERRO Error during execution engine upcheck   error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("execution")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
consensus-client  | Dec 27 17:20:34.817 ERRO Error during execution engine upcheck   error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("execution")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
consensus-client  | Dec 27 17:20:37.501 WARN Low peer count                          peer_count: 0, service: slot_notifier
consensus-client  | Dec 27 17:20:37.501 INFO Searching for peers                     current_slot: 6633659, head_slot: 0, finalized_epoch: 0, finalized_root: 0x2356…6a75, peers: 0, service: slot_notifier
consensus-client  | Dec 27 17:20:37.501 WARN Syncing deposit contract block cache    est_blocks_remaining: initializing deposits, service: slot_notifier
consensus-client  | Dec 27 17:20:38.502 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("execution")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
consensus-client  | Dec 27 17:20:38.502 ERRO Unable to get transition config         error: Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("execution")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) }, service: exec
consensus-client  | Dec 27 17:20:38.502 ERRO Not ready for merge                     hint: try updating Lighthouse and/or the execution layer, info: Could not confirm the transition configuration with the execution endpoint: "EngineError(Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: \"http\", cannot_be_a_base: false, username: \"\", password: None, host: Some(Domain(\"execution\")), port: Some(8551), path: \"/\", query: None, fragment: None }, source: TimedOut }) })", service: slot_notifier
execution-client  | 2022-12-27 17:20:39.0395|Waiting for peers... 33s 
consensus-client  | Dec 27 17:20:39.504 ERRO Error during execution engine upcheck   error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("execution")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
consensus-client  | Dec 27 17:20:39.818 ERRO Error during execution engine upcheck   error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("execution")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
execution-client  | 2022-12-27 17:20:40.0371|Waiting for peers... 34s 
execution-client  | 2022-12-27 17:20:41.0370|Waiting for peers... 35s 
execution-client  | 2022-12-27 17:20:41.4691|No incoming messages from Consensus Client. Please make sure that it's working properly 
execution-client  | 2022-12-27 17:20:42.0368|Waiting for peers... 36s 
consensus-client  | Dec 27 17:20:42.501 WARN Low peer count                          peer_count: 0, service: slot_notifier
consensus-client  | Dec 27 17:20:42.501 INFO Searching for peers                     current_slot: 6633660, head_slot: 0, finalized_epoch: 0, finalized_root: 0x2356…6a75, peers: 0, service: slot_notifier
consensus-client  | Dec 27 17:20:42.501 WARN Syncing deposit contract block cache    est_blocks_remaining: initializing deposits, service: slot_notifier
execution-client  | 2022-12-27 17:20:43.0373|Waiting for peers... 37s 
consensus-client  | Dec 27 17:20:43.502 ERRO Execution engine call failed            error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("execution")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
consensus-client  | Dec 27 17:20:43.502 ERRO Unable to get transition config         error: Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("execution")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }) }, service: exec
consensus-client  | Dec 27 17:20:43.502 ERRO Not ready for merge                     hint: try updating Lighthouse and/or the execution layer, info: Could not confirm the transition configuration with the execution endpoint: "EngineError(Api { error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: \"http\", cannot_be_a_base: false, username: \"\", password: None, host: Some(Domain(\"execution\")), port: Some(8551), path: \"/\", query: None, fragment: None }, source: TimedOut }) })", service: slot_notifier
execution-client  | 2022-12-27 17:20:44.0367|Waiting for peers... 38s 
consensus-client  | Dec 27 17:20:44.503 ERRO Error during execution engine upcheck   error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("execution")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec
consensus-client  | Dec 27 17:20:44.819 ERRO Error during execution engine upcheck   error: Reqwest(reqwest::Error { kind: Request, url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("execution")), port: Some(8551), path: "/", query: None, fragment: None }, source: TimedOut }), service: exec

AntiD2ta commented 1 year ago

It looks like the authentication scheme is not working. Could you please post the docker-compose.yml you are using? Did you create the jwtsecret as instructed here: https://docs.gnosischain.com/node/guide/configure-server?

critesjosh commented 1 year ago

Here is the docker compose file:

version: "3.9"
services:
  execution:
    stop_grace_period: 30s
    container_name: execution-client
    restart: unless-stopped
    image: ${EC_IMAGE_VERSION}
    networks:
    - sedge
    volumes:
    - ${EC_DATA_DIR}:/nethermind/data
    - ${EC_JWT_SECRET_PATH}:/tmp/jwt/jwtsecret
    ports:
    - 30303:30303/tcp
    - 30303:30303/udp
    - 8008:8008
    expose:
    - 8545
    - 8551
    command:
    - --config=${EL_NETWORK}
    - --datadir=/nethermind/data
    - --log=${NETHERMIND_LOG_LEVEL}
    - --Sync.SnapSync=${EC_SNAP_SYNC_ENABLED}
    - --JsonRpc.Enabled=true
    - --JsonRpc.Host=0.0.0.0
    - --JsonRpc.Port=8545
    - --JsonRpc.EnabledModules=${EC_ENABLED_MODULES}
    - --JsonRpc.JwtSecretFile=/tmp/jwt/jwtsecret
    - --JsonRpc.EngineHost=0.0.0.0
    - --JsonRpc.EnginePort=8551
    - --Network.DiscoveryPort=30303
    - --HealthChecks.Enabled=true
    - --Pruning.CacheMb=${NETHERMIND_PRUNING_CACHEMB}
    - --Metrics.Enabled=true
    - --Metrics.ExposePort=8008
    logging:
      driver: json-file
      options:
        max-size: 10m
        max-file: "10"
  consensus:
    stop_grace_period: 30s
    container_name: consensus-client
    restart: unless-stopped
    image: ${CC_IMAGE_VERSION}
    networks:
    - sedge
    volumes:
    - ${CC_DATA_DIR}:/var/lib/lighthouse
    - ${CC_JWT_SECRET_PATH}:/tmp/jwt/jwtsecret
    ports:
    - 9000:9000/tcp
    - 9000:9000/udp
    - 5054:5054/tcp
    expose:
    - 4000
    command:
    - lighthouse
    - bn
    - --disable-upnp
    - --datadir=/var/lib/lighthouse
    - --port=9000
    - --http
    - --http-address=0.0.0.0
    - --http-port=4000
    - --network=${CL_NETWORK}
    - --target-peers=${CC_PEER_COUNT}
    - --boot-nodes=${CL_BOOTNODES}
    - --execution-endpoints=${EC_AUTH_URL}
    - --execution-jwt=/tmp/jwt/jwtsecret
    - --eth1-endpoints=${EC_API_URL}
    - --debug-level=${CC_LOG_LEVEL}
    - --suggested-fee-recipient=${CL_FEE_RECIPIENT}
    - --validator-monitor-auto
    - --subscribe-all-subnets
    - --import-all-attestations
    - --metrics
    - --metrics-port=5054
    - --metrics-address=0.0.0.0
    logging:
      driver: json-file
      options:
        max-size: 10m
        max-file: "10"
  validator-import:
    container_name: validator-import-client
    build:
      context: github.com/NethermindEth/lighthouse-init-validator
      args:
        LH_VERSION: ${VL_IMAGE_VERSION}
        NETWORK: ${CL_NETWORK}
    networks:
    - sedge
    volumes:
    - ${KEYSTORE_DIR}:/keystore
    - ${VL_DATA_DIR}:/data
    logging:
      driver: json-file
      options:
        max-size: 10m
        max-file: "10"
  validator:
    container_name: validator-client
    image: ${VL_IMAGE_VERSION}
    depends_on:
      validator-import:
        condition: service_completed_successfully
    networks:
    - sedge
    ports:
    - 5056:5056
    volumes:
    - ${VL_DATA_DIR}:/data
    command:
    - lighthouse
    - vc
    - --network=${CL_NETWORK}
    - --beacon-nodes=${CC_API_URL}
    - --graffiti=${GRAFFITI}
    - --debug-level=${VL_LOG_LEVEL}
    - --validators-dir=/data/validators
    - --suggested-fee-recipient=${CL_FEE_RECIPIENT}
    - --metrics
    - --metrics-port=5056
    - --metrics-address=0.0.0.0
    logging:
      driver: json-file
      options:
        max-size: 10m
        max-file: "10"
networks:
  sedge:
    name: sedge_network

I did not create the JWT secret as described in the instructions on gnosischain.com. I see the hex secret in the file called jwtsecret in the docker-compose-scripts dir; it was created by sedge automatically.
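
For reference, the Engine API expects the JWT secret to be 32 bytes, hex-encoded (64 hex characters, optionally prefixed with 0x), and both clients must read the same file. A minimal sketch of a format check follows; `is_valid_jwt_secret` is a hypothetical helper, not a sedge command, and the path in the usage comment is taken from the .env posted earlier in this thread.

```shell
# Hypothetical helper (not part of sedge): succeeds only if the file holds
# exactly 64 hex characters, with an optional 0x prefix.
is_valid_jwt_secret() {
  [ -r "$1" ] || return 1
  # Drop whitespace so a trailing newline does not fail the check.
  jwt_secret="$(tr -d '[:space:]' < "$1")"
  jwt_secret="${jwt_secret#0x}"
  printf '%s\n' "$jwt_secret" | grep -Eq '^[0-9a-fA-F]{64}$'
}

# Usage:
#   is_valid_jwt_secret /home/user/gnosis-validator/docker-compose-scripts/jwtsecret \
#     && echo "secret format looks fine"
```

This only checks the file format. It does not show that the two containers can actually reach each other.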

AntiD2ta commented 1 year ago

The configuration is OK. My guess is that the error is related to your network interface. There are a couple of checks we can do. The easiest one is to use the --map-all flag with sedge cli to expose all the ports to localhost. Then you can run the following command to check whether the execution client is accessible:

curl -X POST http://localhost:8545/ --data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}' -H "Content-Type: application/json"

If the above command works, then the execution client is accessible to the consensus client through your network interface.
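
The errors in the earlier log are timeouts, so it may help to repeat the same call with an explicit time limit and against other URLs, such as the engine port 8551 once it is mapped. The sketch below wraps the curl command above; `rpc_payload` and `probe_rpc` are hypothetical names, and the 5-second default is an arbitrary assumption.

```shell
# Hypothetical helpers; the names are illustrative and not part of sedge.

# Builds the same JSON-RPC body as the curl command above for a given method.
rpc_payload() {
  printf '{"jsonrpc":"2.0","method":"%s","params":[],"id":1}' "$1"
}

# Sends eth_syncing to a URL. --max-time bounds the whole request, so an
# unreachable port fails after a few seconds instead of hanging.
probe_rpc() {
  curl -sS --max-time "${2:-5}" -X POST "$1" \
    -H "Content-Type: application/json" \
    --data "$(rpc_payload eth_syncing)"
}

# Usage:
#   probe_rpc http://localhost:8545/
#   probe_rpc http://localhost:8551/ 10
```

Any HTTP response on the engine port, even an authentication error, would suggest the port is reachable, whereas a timeout would match the errors shown in the consensus client log.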

critesjosh commented 1 year ago

I see a bunch of these errors when I run sedge cli -c lighthouse -e nethermind -n gnosis --map-all

Edit: it looks like this may have just taken some time to start up.

2022-12-27 19:51:53 -- [ERRO] [GetRequest] request failed. Error: Get "http://192.168.80.2:4000/eth/v1/node/syncing": dial tcp 192.168.80.2:4000: connect: connection refused
2022-12-27 19:51:53 -- [INFO] [GetRequest] Retrying request
2022-12-27 19:51:55 -- [ERRO] [GetRequest] request failed. Error: Get "http://192.168.80.2:4000/eth/v1/node/syncing": dial tcp 192.168.80.2:4000: connect: connection refused
2022-12-27 19:51:55 -- [INFO] [GetRequest] Retrying request
2022-12-27 19:51:57 -- [ERRO] [GetRequest] request failed. Error: Get "http://192.168.80.2:4000/eth/v1/node/syncing": dial tcp 192.168.80.2:4000: connect: connection refused
2022-12-27 19:51:57 -- [INFO] [GetRequest] Retrying request

critesjosh commented 1 year ago

If I stop the log output and restart it with sedge logs, I see Nethermind logs, but it doesn't connect to any peers.

The curl command to Nethermind works, with this result:

{"jsonrpc":"2.0","result":{"startingBlock":"0x0","currentBlock":"0x0","highestBlock":"0x0"},"id":1}
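
In this response all three block fields are 0x0, which is consistent with the node not having found any peers yet. As a small sketch for reading such a response, the hypothetical helper below extracts one hex field and prints it in decimal; it assumes the compact, single-line JSON shape shown above and is not a general JSON parser.

```shell
# Hypothetical helper: prints a hex-encoded field of an eth_syncing result as
# a decimal number. Assumes compact JSON like the response above.
hex_field() {
  hex_value="$(printf '%s\n' "$2" |
    sed -n "s/.*\"$1\":\"\(0x[0-9a-fA-F][0-9a-fA-F]*\)\".*/\1/p")"
  # Fail if the field is missing or not hex-encoded.
  [ -n "$hex_value" ] || return 1
  # Shell arithmetic accepts 0x-prefixed constants.
  printf '%s\n' "$(($hex_value))"
}

# Usage:
#   hex_field highestBlock "$(curl -s -X POST http://localhost:8545/ \
#     --data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}' \
#     -H "Content-Type: application/json")"
```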

critesjosh commented 1 year ago

"Did you created the jwtsecret as per instructed here https://docs.gnosischain.com/node/guide/configure-server?"

Should I be creating the jwtsecret as outlined in the gnosischain.com docs, or should sedge handle this automatically?