erigontech / erigon

Ethereum implementation on the efficiency frontier https://erigon.gitbook.io
GNU Lesser General Public License v3.0

Erigon startup error due to not being able to connect to checkpoint sync endpoints #6329

Closed enriquemanuel closed 1 year ago

enriquemanuel commented 1 year ago

System information

Upgraded from: 2.28.1
Upgraded to: 2.30.0

OS: Debian
Erigon is in Docker

Expected behaviour

Actual behaviour

Steps to reproduce the behaviour

Run Erigon behind a network gateway that blocks connectivity to non-approved sites, so that the checkpoint sync endpoints are blocked there. Erigon won't start.

Before, in 2.28.1, Erigon would start with

" [INFO] [12-15|21:11:57.539] [1/16 Snapshots] Fetching torrent files metadata "
" [INFO] [12-15|21:11:59.072] [txpool] Started "
" [INFO] [12-15|21:11:59.703] [Snapshots] Blocks Stat                  blocks=15000k indices=15000k alloc=2.5GB sys=2.7GB"
" [INFO] [12-15|21:11:59.703] [2/16 Headers] Waiting for Consensus Layer... "

and after the upgrade with the same set of CLI args it fails with

" [INFO] [12-15|18:50:55.447] new subscription to logs established "
" [INFO] [12-15|18:50:55.461] Sentinel started                         enr=enr:-Ly4QFl6bRBBm06Vt3Arr024NBG5GzKUkBxjF8Yf7Dp1-t5XOhzhPVRgWgmShj1LVnUfLtJ8zHdDdqIrLpm6xTgKm_kBh2F0dG5ldHOIAAAAAAAAAACEZXRoMpBKJsWLAgAAAP__________gmlkgnY0gmlwhH8AAAGJc2VjcDI1NmsxoQJ_cM7XDTHvkn82gT9ktr7t4umohXHWDwiQXBGR-vm1OIhzeW5jbmV0cwCDdGNwgg-hg3VkcIIPoA"
" [INFO] [12-15|18:50:55.461] [Checkpoint Sync] Requesting beacon state uri=https://mainnet.checkpoint.sigp.io/eth/v2/debug/beacon/states/finalized"
" [EROR] [12-15|18:50:55.466] Erigon startup                           err=\"Get \\\"https://mainnet.checkpoint.sigp.io/eth/v2/debug/beacon/states/finalized\\\": EOF\""

Backtrace

The issue was introduced when this change was added https://github.com/ledgerwatch/erigon/blob/5c3245d4e034606732573044a59ce793be1ceeea/cl/clparams/config.go#L189-L209 in PR https://github.com/ledgerwatch/erigon/pull/5761

Command Line Arguments:

--http --snapshots=false --http.addr=0.0.0.0 --http.port=8546 --authrpc.port=8551 --http.vhosts=* --http.corsdomain=* --datadir=/var/lib/erigon/.ethereum --metrics --healthcheck --authrpc.jwtsecret=/secret/path/to/jwt/token --http.api=eth,net,web3,debug,admin,engine,erigon,trace --log.console.verbosity=info --ws --chain=mainnet --maxpeers 100

More info

The release notes at https://github.com/ledgerwatch/erigon/releases don't state that Erigon now needs to connect to those external sites by default, nor do they mention PR 5761 as a requirement. Can that be bypassed? For context, no domain in that list is reachable from my production hosts unless I add a firewall rule, which is problematic since that list is not exhaustive and can change without any announcement. This alters the default behavior we had before, but it wasn't announced.

Reverting to 2.28.1 allows us to start and run.
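
For anyone who wants to check a host before upgrading, here is a minimal Go sketch of a standalone reachability probe. It is not part of Erigon. The two URLs are simply the ones that appear in the logs in this thread (the actual list is in cl/clparams/config.go and may differ), and the helper names `uniqueHosts`, `classify` and `probe` are made up for illustration. It prints the hostnames a firewall allowlist would have to cover and a coarse cause for each failed request.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/url"
	"os"
	"sort"
	"time"
)

// Endpoints copied from the log lines in this thread. They are examples only;
// the list Erigon actually uses is defined in cl/clparams/config.go.
var endpoints = []string{
	"https://mainnet.checkpoint.sigp.io/eth/v2/debug/beacon/states/finalized",
	"https://sync.invis.tools/eth/v2/debug/beacon/states/finalized",
}

// uniqueHosts returns the sorted, de-duplicated hostnames of the given URLs,
// i.e. the set of names a firewall allowlist would need to cover.
func uniqueHosts(rawURLs []string) ([]string, error) {
	seen := map[string]bool{}
	var hosts []string
	for _, raw := range rawURLs {
		u, err := url.Parse(raw)
		if err != nil {
			return nil, fmt.Errorf("parse %q: %w", raw, err)
		}
		h := u.Hostname()
		if h == "" {
			return nil, fmt.Errorf("no host in %q", raw)
		}
		if !seen[h] {
			seen[h] = true
			hosts = append(hosts, h)
		}
	}
	sort.Strings(hosts)
	return hosts, nil
}

// classify maps a request error to a coarse cause. DNS failures are checked
// first because *net.DNSError also implements net.Error.
func classify(err error) string {
	var dnsErr *net.DNSError
	var netErr net.Error
	switch {
	case err == nil:
		return "ok"
	case errors.As(err, &dnsErr):
		return "dns"
	case errors.Is(err, io.EOF):
		return "eof"
	case errors.As(err, &netErr) && netErr.Timeout():
		return "timeout"
	default:
		return "other"
	}
}

// probe issues a GET and closes the body without reading it: only
// connectivity matters here, not the (large) beacon state payload.
func probe(ctx context.Context, client *http.Client, rawURL string) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, rawURL, nil)
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

func main() {
	hosts, err := uniqueHosts(endpoints)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("hosts to allow:", hosts)

	client := &http.Client{Timeout: 10 * time.Second}
	for _, e := range endpoints {
		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
		err := probe(ctx, client, e)
		cancel()
		fmt.Printf("%s -> %s (%v)\n", e, classify(err), err)
	}
}
```

An "eof" result would match the startup error above (the gateway cutting the connection), while "dns" would point at name resolution rather than the firewall.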

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 40 days with no activity. Remove stale label or comment, or this will be closed in 7 days.

github-actions[bot] commented 1 year ago

This issue was closed because it has been stalled for 7 days with no activity.

bioharz commented 1 year ago

Please reopen this.

sbrass commented 1 year ago

I would like to add my observations on this kind of error message, which in my case starts with:

"https://sync.invis.tools/eth/v2/debug/beacon/states/finalized": dial tcp: lookup sync.invis.tools: device or resource busy

OS: Ubuntu 22.04.2 LTS (Jammy Jellyfish)
Architecture: aarch64
Official images:

As the error message already states, the underlying issue isn't directly related to Erigon itself. I can only guess that it may be an issue with the underlying Go libraries.

On my side, I could solve the issue by building the container myself for every version since v2.32.1, using:

docker build -f Dockerfile -t thorax/erigon:v<current version> .

and trying out the resulting image, in every case built with Go version 1.19.7. The error message disappears in all cases. Using the image from Docker Hub produces the above error message.

zhiming137 commented 1 year ago

It seems like you're facing a similar issue related to Go DNS resolution. The problem can be solved by setting an environment variable.

Here's the command you can use to set the GODEBUG environment variable:

GODEBUG="netdns=go" ./bin/erigon --config ./config.toml

This sets the netdns value to go, which makes the Go runtime use its own pure-Go DNS resolver instead of the system (cgo) resolver. This should help resolve the issue you're facing.
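
To see whether the two resolvers really behave differently on a given host, a small Go program can run the same lookup through both. This is a diagnostic sketch, not Erigon code: `netdnsSetting` and `lookup` are hypothetical helpers, and the default hostname is just the one from the error message above. It uses `net.Resolver` with `PreferGo: true`, which is the per-resolver way to ask for the pure-Go path.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"strings"
	"time"
)

// netdnsSetting extracts the netdns value from a GODEBUG string such as
// "netdns=go,http2debug=1". It returns "" when netdns is not set.
func netdnsSetting(godebug string) string {
	for _, kv := range strings.Split(godebug, ",") {
		k, v, ok := strings.Cut(strings.TrimSpace(kv), "=")
		if ok && k == "netdns" {
			return v
		}
	}
	return ""
}

// lookup resolves host with the given resolver under a 5 second timeout.
func lookup(r *net.Resolver, host string) ([]string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return r.LookupHost(ctx, host)
}

func main() {
	// Hostname taken from the error message in this thread; pass another
	// one as the first argument to test a different endpoint.
	host := "sync.invis.tools"
	if len(os.Args) > 1 {
		host = os.Args[1]
	}
	fmt.Printf("GODEBUG netdns=%q\n", netdnsSetting(os.Getenv("GODEBUG")))

	// Note: when the binary is built without cgo, or GODEBUG=netdns=go is
	// set, the default resolver already takes the pure-Go path, so both
	// lines will show the same behaviour.
	resolvers := []struct {
		name string
		r    *net.Resolver
	}{
		{"default", net.DefaultResolver},
		{"pure Go", &net.Resolver{PreferGo: true}},
	}
	for _, res := range resolvers {
		addrs, err := lookup(res.r, host)
		if err != nil {
			fmt.Printf("%-8s lookup %s failed: %v\n", res.name, host, err)
			continue
		}
		fmt.Printf("%-8s lookup %s -> %v\n", res.name, host, addrs)
	}
}
```

If the "default" line fails inside the container while the "pure Go" line succeeds, that would be consistent with the GODEBUG workaround helping.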