hyperledger / besu

An enterprise-grade Java-based, Apache 2.0 licensed Ethereum client https://wiki.hyperledger.org/display/besu
https://www.hyperledger.org/projects/besu

eth_syncing endpoint returns false when execution client is not synced #7338

Open eenagy opened 3 months ago

eenagy commented 3 months ago

Description

As a node operator, I want the eth_syncing endpoint to be consistent with the specification so that I can write integration tests against the client.

Acceptance Criteria

Return a non-false value when Besu has not yet synced (not started, stalled, etc.). Return false only when in sync with the network (based on the definition in the execution API spec). Edit: I confused the execution and beacon API specs; false does not indicate synced status. However, other clients do return non-false when not synced.

Steps to Reproduce (Bug)

  1. Start besu
    besu  --rpc-http-enabled --data-path=$HOME/.run-a-node/sepolia --network=SEPOLIA --engine-jwt-secret=$HOME/.run-a-node/sepolia/jwt.hex --rpc-http-api=ETH --rpc-http-port=8545
  2. Start any one of these clients: lodestar, nimbus-eth2, or teku.

    Starting lodestar

    lodestar beacon --rest --checkpointSyncUrl https://beaconstate-sepolia.chainsafe.io --rest.port 5052 --execution.urls http://localhost:8551 --jwtSecret $HOME/.run-a-node/sepolia/jwt.hex --dataDir $HOME/.run-a-node/sepolia --network sepolia

    Starting nimbus-eth2

    curl -o $HOME/.run-a-node/sepolia/state.finalized.ssz -H 'Accept: application/octet-stream' https://beaconstate-sepolia.chainsafe.io/eth/v2/debug/beacon/states/finalized

    nimbus_beacon_node --network=sepolia --data-dir=$HOME/.run-a-node/sepolia --web3-url=http://localhost:8551 --jwt-secret=$HOME/.run-a-node/sepolia/jwt.hex --external-beacon-api-url=https://beaconstate-sepolia.chainsafe.io --finalized-checkpoint-state=$HOME/.run-a-node/sepolia/state.finalized.ssz --rest=true --rest-port=5052

    Starting teku

    teku --checkpoint-sync-url https://beaconstate-sepolia.chainsafe.io --network sepolia --ee-endpoint http://localhost:8551 --ee-jwt-secret-file $HOME/.run-a-node/sepolia/jwt.hex --data-path $HOME/.run-a-node/sepolia --rest-api-enabled true --rest-api-port 5052

  3. Wait a couple of minutes (4-5 minutes)
  4. Check endpoint result

    curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":0}' -H 'Content-Type: application/json' http://localhost:8545

    {"jsonrpc":"2.0","id":0,"result":false}
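
For integration tests, the same check can be scripted instead of run by hand. The following is a minimal Python sketch using only the standard library. It assumes the HTTP JSON-RPC endpoint from step 1 (`http://localhost:8545`); the helper names `build_payload` and `eth_syncing` are illustrative and not part of any client.

```python
import json
import urllib.request


def build_payload(method, params=None, request_id=0):
    """Build a JSON-RPC 2.0 request body as UTF-8 bytes."""
    body = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": request_id,
    }
    return json.dumps(body).encode("utf-8")


def eth_syncing(url="http://localhost:8545", timeout=10):
    """POST eth_syncing and return the raw "result" value.

    The result is either False or a dict describing sync progress.
    The default URL is an assumption taken from the reproduction steps.
    """
    request = urllib.request.Request(
        url,
        data=build_payload("eth_syncing"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["result"]
```

Calling `eth_syncing()` against a running node should return the same value as the `result` field in the curl output.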


**Alternate behavior, where it returns non-false**

1. Start besu 

besu --rpc-http-enabled --data-path=$HOME/.run-a-node/sepolia --network=SEPOLIA --engine-jwt-secret=$HOME/.run-a-node/sepolia/jwt.hex --rpc-http-api=ETH --rpc-http-port=8545

2. Start either of these clients: lighthouse or prysm
Starting lighthouse

lighthouse beacon_node --disable-deposit-contract-sync --http --checkpoint-sync-url=https://beaconstate-sepolia.chainsafe.io --datadir=$HOME/.run-a-node/sepolia --execution-endpoint=http://localhost:8551 --execution-jwt=$HOME/.run-a-node/sepolia/jwt.hex --http-allow-origin='*' --http-port=5052 --network=sepolia

Starting prysm

beacon-chain --accept-terms-of-use --sepolia --datadir=$HOME/.run-a-node/sepolia --checkpoint-sync-url=https://beaconstate-sepolia.chainsafe.io --disable-grpc-gateway=false --execution-endpoint=http://localhost:8551 --genesis-beacon-api-url=https://beaconstate.info --grpc-gateway-port=5052 --http-modules=eth --jwt-secret=$HOME/.run-a-node/sepolia/jwt.hex

3. Wait a couple of minutes (4-5 minutes)
4. Check endpoint result 

curl -s -X POST --data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":0}' -H 'Content-Type: application/json' http://localhost:8545

{"jsonrpc":"2.0","id":0,"result":{"startingBlock":"0x0","currentBlock":"0x190","highestBlock":"0x6087a7"}}
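
The sync object in this response uses hex-encoded quantities, so a test usually has to decode them before comparing. A small sketch of that decoding follows; the function name `sync_progress` and the `remaining` field are illustrative assumptions, not something the API returns.

```python
def sync_progress(result):
    """Decode an eth_syncing sync object into integers.

    Returns None when the node reports false (no sync object available).
    """
    if result is False:
        return None
    current = int(result["currentBlock"], 16)
    highest = int(result["highestBlock"], 16)
    return {
        "starting": int(result["startingBlock"], 16),
        "current": current,
        "highest": highest,
        # Derived value for convenience; clamped so it is never negative.
        "remaining": max(highest - current, 0),
    }


# Values taken from the response shown in this issue.
sample = {"startingBlock": "0x0", "currentBlock": "0x190", "highestBlock": "0x6087a7"}
progress = sync_progress(sample)
print(progress["current"], progress["highest"], progress["remaining"])
```

For the sample above, `currentBlock` decodes to 400 and `highestBlock` to 6326183.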


**Expected behavior:** Return a non-false value. The spec needs to be clarified on what it should return when sync is not started, but it does indicate when to return a false value.

**Actual behavior:** The result depends on which CL client is paired with Besu: false with lodestar/nimbus-eth2/teku, and a sync object with lighthouse/prysm.

**Frequency:** Always.

### Logs (if a bug)
It's reproducible; for brevity, I have not included the 10 logs. If they are needed, let me know and I can provide them.

### Versions (Add all that apply)
* Software version: `besu/v24.6.0/linux-x86_64/oracle-java-21`
* Java version: `java 21.0.2 2024-01-16 LTS
Java(TM) SE Runtime Environment (build 21.0.2+13-LTS-58)
Java HotSpot(TM) 64-Bit Server VM (build 21.0.2+13-LTS-58, mixed mode, sharing)`
* OS Name & Version:

PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"


* Kernel Version: `Linux debian 6.1.0-22-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.94-1 (2024-06-21) x86_64 GNU/Linux`
* Consensus Client & Version if using Proof of Stake: latest consensus clients

### Additional information 

geth/nethermind/erigon always return a non-false value while not synced, using the same settings as above with each of the CL clients (lighthouse/lodestar/nimbus-eth2/prysm/teku). `reth` has a similar issue: it returns false when not synced on a network where peers are slow to show up.

Matilda-Clerke commented 2 months ago

Hi @eenagy, I'm not able to reproduce this with the latest code on Besu/Teku main branches. Can you provide the logs from when you saw this issue? Just the Besu and Teku logs for now.

Matilda-Clerke commented 2 months ago

This morning, I tried again with Besu 24.6.0 and found that syncing appeared to stall, after which the eth_syncing endpoint would return false.

2024-08-21 10:15:23.182+10:00 | EthScheduler-Services-42 (importBlock) | INFO  | ImportBlocksStep | Block import progress: 3643797 of 6535007 (55%), Peer count: 1
2024-08-21 10:16:54.904+10:00 | EthScheduler-Services-1 (checkNewPivotBlock-Account) | INFO  | WaitForPeersTask | Waiting for 1 total peers to connect. 0 peers currently connected.
2024-08-21 10:17:26.836+10:00 | EthScheduler-Timer-0 | INFO  | SyncTargetManager | Unable to find sync target. Currently checking 0 peers for usefulness.
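
These log lines show the stall coinciding with zero connected peers, which suggests one way for tooling to tell a stalled node from a quiet one: check the peer count alongside eth_syncing. The sketch below is a heuristic, not spec behavior. It assumes the peer count comes from the standard `net_peerCount` method, which would require the NET API to be enabled in addition to the ETH API used in the reproduction steps. The function name and the returned labels are illustrative.

```python
def interpret_sync_state(syncing_result, peer_count_hex):
    """Combine eth_syncing and net_peerCount results into a coarse label.

    syncing_result: False or a sync object (dict) from eth_syncing.
    peer_count_hex: hex-encoded quantity as returned by net_peerCount.
    """
    peers = int(peer_count_hex, 16)
    if isinstance(syncing_result, dict):
        return "syncing"
    if peers == 0:
        # false with no peers: the node cannot be making progress,
        # so false here should not be read as "in sync".
        return "no-peers"
    # false with peers: not currently syncing; may or may not be in sync.
    return "not-syncing"


print(interpret_sync_state(False, "0x0"))
```

With this approach, the stalled case described in this comment would be reported as `no-peers` rather than being mistaken for a synced node.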

corn-potage commented 1 month ago

Hi there. I'm currently implementing sync status in NiceNode, and I'm also seeing the difference in eth_syncing response behavior compared to other clients, where they usually return an object in the beginning stages of syncing, and false only when the node is considered fully synchronized with the network.

Geth:

{
    "currentBlock": "0x0",
    "healedBytecodeBytes": "0x0",
    "healedBytecodes": "0x0",
    "healedTrienodeBytes": "0x0",
    "healedTrienodes": "0x0",
    "healingBytecode": "0x0",
    "healingTrienodes": "0x0",
    "highestBlock": "0x0",
    "startingBlock": "0x0",
    "syncedAccountBytes": "0x0",
    "syncedAccounts": "0x0",
    "syncedBytecodeBytes": "0x0",
    "syncedBytecodes": "0x0",
    "syncedStorage": "0x0",
    "syncedStorageBytes": "0x0",
    "txIndexFinishedBlocks": "0x0",
    "txIndexRemainingBlocks": "0x1"
}

Nethermind:

{
     "startingBlock": "0x0",
     "currentBlock": "0x0",
     "highestBlock": "0x0"
}
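
Because Geth adds many client-specific fields while Nethermind returns only three, a tool that supports several clients may want to reduce each response to the fields they share. A minimal sketch, assuming only the three fields that appear in every sync object quoted in this thread; `COMMON_FIELDS` and `common_sync_fields` are illustrative names.

```python
COMMON_FIELDS = ("startingBlock", "currentBlock", "highestBlock")


def common_sync_fields(result):
    """Return the shared sync fields as integers, or None for false.

    Client-specific extras (for example Geth's healing counters) are ignored.
    """
    if result is False:
        return None
    return {name: int(result[name], 16) for name in COMMON_FIELDS}


# Abbreviated versions of the two responses shown above.
geth_like = {
    "currentBlock": "0x0",
    "highestBlock": "0x0",
    "startingBlock": "0x0",
    "syncedAccounts": "0x0",
    "txIndexRemainingBlocks": "0x1",
}
nethermind_like = {"startingBlock": "0x0", "currentBlock": "0x0", "highestBlock": "0x0"}

print(common_sync_fields(geth_like) == common_sync_fields(nethermind_like))
```

Both early-sync responses reduce to the same all-zero object, which is distinct from the `false` that Besu returned in the same situation.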