maticnetwork / bor

Official repository for the Polygon Blockchain
https://polygon.technology/
GNU Lesser General Public License v3.0
bor can not sync #1283

Closed EronTo closed 2 months ago

EronTo commented 3 months ago

I'm facing a persistent issue: my Bor client fails to sync. The following log messages keep repeating:

Jun 30 06:08:01 pol bor[123821]: WARN [06-30|06:08:01.890] unable to handle whitelist milestone     err="missing blocks"
Jun 30 06:08:13 pol bor[123821]: INFO [06-30|06:08:13.890] Got new milestone from heimdall          start=58,777,986 end=58,778,001 hash=0xc37d0a4fbea55fa63c2926a6d55942581079f638893aac185f0d802fdeb9cf07
Jun 30 06:08:13 pol bor[123821]: WARN [06-30|06:08:13.890] unable to handle whitelist milestone     err="missing blocks"
Jun 30 06:08:25 pol bor[123821]: INFO [06-30|06:08:25.891] Got new milestone from heimdall          start=58,777,986 end=58,778,001 hash=0xc37d0a4fbea55fa63c2926a6d55942581079f638893aac185f0d802fdeb9cf07
Jun 30 06:08:25 pol bor[123821]: WARN [06-30|06:08:25.891] unable to handle whitelist milestone     err="missing blocks"
Jun 30 06:08:37 pol bor[123821]: INFO [06-30|06:08:37.890] Got new milestone from heimdall          start=58,777,986 end=58,778,001 hash=0xc37d0a4fbea55fa63c2926a6d55942581079f638893aac185f0d802fdeb9cf07
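
One way to confirm from logs like these that the node is stuck on a single milestone is to count distinct milestones against the number of "missing blocks" warnings. The following is a minimal Python sketch, assuming the log format shown in the excerpt; the function name `summarize_milestones` is hypothetical and not part of Bor.

```python
import re

# Matches the "Got new milestone" lines as they appear in the excerpt above.
MILESTONE_RE = re.compile(
    r"Got new milestone from heimdall\s+"
    r"start=([\d,]+) end=([\d,]+) hash=(0x[0-9a-fA-F]+)"
)


def summarize_milestones(log_text):
    """Return (set of distinct milestones, number of 'missing blocks' warnings)."""
    milestones = set()
    missing = 0
    for line in log_text.splitlines():
        match = MILESTONE_RE.search(line)
        if match:
            # Block numbers are logged with thousands separators.
            start = int(match.group(1).replace(",", ""))
            end = int(match.group(2).replace(",", ""))
            milestones.add((start, end, match.group(3)))
        elif 'err="missing blocks"' in line:
            missing += 1
    return milestones, missing
```

If the set holds a single milestone while the warning count keeps growing, the node is probably not importing the blocks that milestone refers to.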

System Details:

Bor version: 1.3.3
Heimdall version: 1.0.7

When I run the command:

curl -s -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}' http://127.0.0.1:8545

I get the following response:

{"jsonrpc":"2.0","id":1,"result":"0x10"}
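
For readers unfamiliar with the encoding: the `result` field of `net_peerCount` is a hex-encoded quantity, so `0x10` means 16 connected peers. Below is a minimal Python sketch of the same check, assuming the HTTP endpoint from the curl command; the helper names `parse_peer_count` and `fetch_peer_count` are hypothetical.

```python
import json
import urllib.request


def parse_peer_count(response_body):
    """Decode the hex quantity in a net_peerCount JSON-RPC response."""
    return int(json.loads(response_body)["result"], 16)


def fetch_peer_count(url="http://127.0.0.1:8545"):
    """Send the same net_peerCount request as the curl command above."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "method": "net_peerCount", "params": [], "id": 1}
    ).encode()
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return parse_peer_count(response.read())
```

Calling `parse_peer_count` on the response above gives 16, so the node does have peers; the peer count alone does not explain the missing blocks.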

Here is my config.toml configuration:

chain = "mainnet"
# identity = "node_name"
# verbosity = 3
# vmdebug = false
datadir = "/home/ubuntu/pol"
# ancient = ""
# "db.engine" = "leveldb"
# keystore = ""
# "rpc.batchlimit" = 100
# "rpc.returndatalimit" = 10000
syncmode = "full"
# gcmode = "full"
# snapshot = true
# ethstats = ""
# devfakeauthor = false

# ["eth.requiredblocks"]

# [log]
#    vmodule = ""
#    json = false
#    backtrace = ""
#    debug = true
#    enable-block-tracking = false

[p2p]
    maxpeers = 400
    port = 30303
    # maxpendpeers = 50
    # bind = "0.0.0.0"
    # nodiscover = false
    # nat = "any"
    # netrestrict = ""
    # nodekey = ""
    # nodekeyhex = ""
    # txarrivalwait = "500ms"
    [p2p.discovery]
        # v4disc = true
        # v5disc = false
        bootnodes = ["enode://b8f1cc9c5d4403703fbf377116469667d2b1823c0daf16b7250aa576bacf399e42c3930ccfcb02c5df6879565a2b8931335565f0e8d3f8e72385ecf4a4bf160a@3.36.224.80:30303", "enode://8729e0c825f3d9cad382555f3e46dcff21af323e89025a0e6312df541f4a9e73abfa562d64906f5e59c51fe6f0501b3e61b07979606c56329c020ed739910759@54.194.245.5:30303"]
        # bootnodesv4 = []
        # bootnodesv5 = []
        # static-nodes = []
        # dns = []

# [heimdall]
#    url = "http://localhost:1317"
#    "bor.without" = false
#    grpc-address = ""

[txpool]
    nolocals = true
    pricelimit = 30000000000
    accountslots = 16
    globalslots = 32768
    accountqueue = 16
    globalqueue = 32768
    lifetime = "1h30m0s"
    # locals = []
    # journal = ""
    # rejournal = "1h0m0s"
    # pricebump = 10

[miner]
    gaslimit = 30000000
    gasprice = "30000000000"
    # mine = false
    # etherbase = ""
    # extradata = ""
    # recommit = "2m5s"
    # commitinterrupt = true

[jsonrpc]
    ipcpath = "/var/lib/bor/bor.ipc"
    # ipcdisable = false
    # gascap = 50000000
    # evmtimeout = "5s"
    # txfeecap = 5.0
    # allow-unprotected-txs = false
    # enabledeprecatedpersonal = false
    [jsonrpc.http]
        enabled = true
        port = 8545
        host = "0.0.0.0"
        api = ["eth", "net", "web3", "txpool", "bor"]
        vhosts = ["*"]
        corsdomain = ["*"]
        # prefix = "/"
        # ep-size = 40
        # ep-requesttimeout = "0s"
    # [jsonrpc.ws]
    #     enabled = false
    #     port = 8546
    #     prefix = ""
    #     host = "localhost"
    #     api = ["web3", "net"]
    #     origins = ["*"]
    #     ep-size = 40
    #     ep-requesttimeout = "0s"
    # [jsonrpc.graphql]
    #     enabled = false
    #     port = 0
    #     prefix = ""
    #     host = ""
    #     vhosts = ["*"]
    #     corsdomain = ["*"]
    # [jsonrpc.auth]
    #     jwtsecret = ""
    #     addr = "localhost"
    #     port = 8551
    #     vhosts = ["localhost"]
    # [jsonrpc.timeouts]
    #     read = "10s"
    #     write = "30s"
    #     idle = "2m0s"

[gpo]
    # blocks = 20
    # percentile = 60
    # maxheaderhistory = 1024
    # maxblockhistory = 1024
    # maxprice = "5000000000000"
    ignoreprice = "30000000000"

[telemetry]
    metrics = true
    # expensive = false
    # prometheus-addr = ""
    # opencollector-endpoint = ""
    # [telemetry.influx]
    #     influxdb = false
    #     endpoint = ""
    #     database = ""
    #     username = ""
    #     password = ""
    #     influxdbv2 = false
    #     token = ""
    #     bucket = ""
    #     organization = ""
    # [telemetry.influx.tags]

[cache]
    cache = 5096
    # gc = 25
    # snapshot = 10
    # database = 50
    # trie = 15
    # noprefetch = false
    # preimages = false
    # txlookuplimit = 2350000
    # blocklogs = 32
    # timeout = "1h0m0s"
    # fdlimit = 0

# [accounts]
#    unlock = []
#    password = ""
#    allow-insecure-unlock = false
#    lightkdf = false
#    disable-bor-wallet = false

# [grpc]
#    addr = ":3131"

# [developer]
#    dev = false
#    period = 0
#    gaslimit = 11500000

# [pprof]
#   pprof = false
#   port = 6060
#   addr = "127.0.0.1"
#   memprofilerate = 524288
#   blockprofilerate = 0

Could anyone help me diagnose and resolve this issue? Any guidance or suggestions would be greatly appreciated!

OldBorrow1488 commented 3 months ago

The same parameters and config, and the same problem

tennnessse commented 3 months ago

I have the same problem.

feld commented 3 months ago

^^ do not follow that scam link

Can we get a survey of where the servers are hosted? Mine, hosted on AWS, have the syncing issue too. I've been struggling with this since late March, when 1.3.0 was released; before that we had no problem getting 200 peers.

EronTo commented 3 months ago

I never had 200 peers connected; the most I ever had was around 40. For now, I have got syncing working normally again by adding peers.

hdiass commented 2 months ago

The problem still exists on every Bor node. When can you fix this issue?

piccadil commented 2 months ago

Same issue here, on the same versions of Bor and Heimdall.

darkhorse-spb commented 2 months ago

same problem here

valamidev commented 2 months ago

We are still using 1.2.X, to avoid this issue.

OldBorrow1488 commented 2 months ago

Which versions do you use to avoid this problem? Can you tell us more?

valamidev commented 2 months ago

Bor 1.2.8 + Heimdall 1.0.4, using the last snapshot provided by the Polygon team. It took about 3 weeks to sync.

piccadil commented 2 months ago

I downgraded a node to Bor 1.2.8 + Heimdall 1.0.4. It synced for about 10-15 minutes, and then I got the same error: err="missing blocks"

valamidev commented 2 months ago

Are you working from a snapshot or syncing from 0?

valamidev commented 2 months ago

This err="missing blocks" reminds me of a common issue on other chains: if your node is killed or not shut down gracefully, your chaindata can get corrupted. There is no other sign of this, except that the node never syncs again.

piccadil commented 2 months ago

I'm working from a snapshot. The node was on version 1.3.3 and synced for a couple of weeks, then stopped syncing. After a restart, it syncs for 5-10 minutes, and then I get this error.

ericcheng201168 commented 2 months ago

I got the same issue

Brindrajsinh-Chauhan commented 2 months ago

Facing the same issue with the same version and config. Any solution/fixes?

sduchesneau commented 2 months ago

This fixed it for me (it seems these 3 issues are duplicates): https://github.com/maticnetwork/bor/issues/1239#issuecomment-2235835196

My 1.3.3 node stopped syncing as soon as it reached HEAD (it seems it could only sync historical blocks). What fixed it was deleting nodekey and nodes/, then starting with bootnodes (I took a few bootnodes from a working node running 1.2.8).
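
The reset described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper `set_aside_peer_state` is hypothetical, it renames the files instead of deleting them so the step can be undone, and the directory that holds `nodekey` and `nodes/` depends on your datadir layout, so check your own installation. Stop Bor before running anything like this.

```python
import shutil
import time
from pathlib import Path


def set_aside_peer_state(instance_dir, names=("nodekey", "nodes")):
    """Rename nodekey and nodes/ under instance_dir, returning the backup paths.

    instance_dir is an assumption: pass whichever directory in your
    datadir actually contains these entries.
    """
    base = Path(instance_dir)
    suffix = time.strftime(".bak-%Y%m%d-%H%M%S")
    moved = []
    for name in names:
        source = base / name
        if source.exists():
            # Keep a timestamped backup rather than deleting outright.
            backup = source.with_name(source.name + suffix)
            shutil.move(str(source), str(backup))
            moved.append(backup)
    return moved
```

After that, the node would be restarted with a `bootnodes` list in the `[p2p.discovery]` section of config.toml, as in the comment above.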

OldBorrow1488 commented 2 months ago

Fix: https://github.com/maticnetwork/bor/issues/1239#issuecomment-2247507683

anshalshukla commented 2 months ago

The recent release seems to have fixed the above-mentioned issues. Feel free to reopen this issue if the problem persists.