erigontech / erigon

Ethereum implementation on the efficiency frontier https://erigon.gitbook.io
GNU Lesser General Public License v3.0

Refused connection to downloader in docker-compose #5422

Open mb-gvidon opened 2 years ago

mb-gvidon commented 2 years ago

System information

OS & Version: Ubuntu

Commit hash: 9fb8a190bc4bb436afd3dcf52b3b94baef462743 (latest at the time of writing this issue)

Issue

The containers start, but the logs show errors. What do these errors mean, and can they slow down the Erigon node?

Actual behaviour

erigon_1      | [INFO] [09-18|15:02:53.096] Starting metrics server                  addr=http://0.0.0.0:6060/debug/metrics/prometheus
erigon_1      | [INFO] [09-18|15:02:53.096] Starting pprof server                    cpu="go tool pprof -lines -http=: http://0.0.0.0:6061/debug/pprof/profile?seconds=20" heap="go tool pprof -lines -http=: http://0.0.0.0:6061/debug/pprof/heap"
erigon_1      | [INFO] [09-18|15:02:53.096] Build info                               git_branch=devel git_tag=v2021.10.03-1874-g9fb8a190b-dirty git_commit=9fb8a190bc4bb436afd3dcf52b3b94baef462743
erigon_1      | [INFO] [09-18|15:02:53.096] Starting Erigon on Ethereum mainnet... 
erigon_1      | [INFO] [09-18|15:02:53.102] Maximum peer count                       ETH=20 total=20
erigon_1      | [INFO] [09-18|15:02:53.102] starting HTTP APIs                       APIs=eth,debug,net,trace,web3,erigon,engine
erigon_1      | [INFO] [09-18|15:02:53.102] Set global gas cap                       cap=50000000
erigon_1      | [INFO] [09-18|15:02:53.142] Opening Database                         label=chaindata path=/home/erigon/.local/share/erigon/chaindata
erigon_1      | [INFO] [09-18|15:02:53.164] Initialised chain configuration          config="{ChainID: 1, Homestead: 1150000, DAO: 1920000, DAO Support: true, Tangerine Whistle: 2463000, Spurious Dragon: 2675000, Byzantium: 4370000, Constantinople: 7280000, Petersburg: 7280000, Istanbul: 9069000, Muir Glacier: 9200000, Berlin: 12244000, London: 12965000, Arrow Glacier: 13773000, Gray Glacier: 15050000, Terminal Total Difficulty: 58750000000000000000000, Merge Netsplit: <nil>, Shanghai: <nil>, Cancun: <nil>, Engine: ethash}" genesis=0xd4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
erigon_1      | [INFO] [09-18|15:02:53.164] Effective                                prune_flags= snapshot_flags="--snapshots=true" history.v2=false
erigon_1      | [INFO] [09-18|15:02:53.164] Initialising Ethereum protocol           network=1
erigon_1      | [INFO] [09-18|15:02:53.165] Disk storage enabled for ethash DAGs     dir=/home/erigon/.local/share/erigon/ethash-dags count=2
erigon_1      | [INFO] [09-18|15:02:53.502] Starting private RPC server              on=0.0.0.0:9090
erigon_1      | [INFO] [09-18|15:02:53.502] new subscription to logs established 
erigon_1      | [INFO] [09-18|15:02:53.503] rpc filters: subscribing to Erigon events 
erigon_1      | [INFO] [09-18|15:02:53.503] new subscription to newHeaders established 
erigon_1      | [INFO] [09-18|15:02:53.503] Reading JWT secret                       path=/home/erigon/.local/share/erigon/jwt.hex
erigon_1      | [INFO] [09-18|15:02:53.503] HTTP endpoint opened for Engine API      url=0.0.0.0:8551 ws=true ws.compression=true
erigon_1      | [INFO] [09-18|15:02:53.504] HTTP endpoint opened                     url=localhost:8545 ws=false ws.compression=true grpc=false
erigon_1      | [INFO] [09-18|15:02:53.527] [1/16 Snapshots] Fetching torrent files metadata 
erigon_1      | [EROR] [09-18|15:02:53.527] [1/16 Snapshots] call downloader         err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 172.24.0.5:9093: connect: connection refused\""
erigon_1      | [WARN] [09-18|15:02:56.575] Served                                   conn=172.24.0.1:33394 method=eth_getLogs reqid=4 t=773.402µs err="block not found 12913913"
erigon_1      | [INFO] [09-18|15:02:57.473] new subscription to newHeaders established 
erigon_1      | [INFO] [09-18|15:03:06.123] [Snapshots] Stat                         blocks=15000k indices=15000k alloc=2.2GB sys=2.4GB
erigon_1      | [INFO] [09-18|15:03:06.124] Timings (slower than 50ms)               Snapshots=12.609s
erigon_1      | [INFO] [09-18|15:03:06.125] RPC Daemon notified of new headers       from=15537393 to=15537394 header sending=6.783µs log sending=330ns
erigon_1      | [INFO] [09-18|15:03:06.125] [2/16 Headers] Waiting for Consensus Layer...

There are also some warnings from rpcdaemon:

rpcdaemon_1   | [INFO] [09-18|15:02:54.268] DB schemas compatible                    reader=6.0.0 database=6.0.0
rpcdaemon_1   | [INFO] [09-18|15:02:57.472] [Snapshots] Stat                         blocks=15000k indices=15000k alloc=2.1GB sys=2.3GB
rpcdaemon_1   | [INFO] [09-18|15:02:57.472] rpc filters: subscribing to Erigon events 
rpcdaemon_1   | [WARN] [09-18|15:02:57.472] [Snapshots] reopen                       err="rpc error: code = Internal desc = runtime error: invalid memory address or nil pointer dereference"
rpcdaemon_1   | [INFO] [09-18|15:02:57.473] HTTP endpoint opened                     url=0.0.0.0:8545 ws=true ws.compression=false grpc=false
rpcdaemon_1   | [INFO] [09-18|15:02:57.473] interfaces compatible                    remote_db= client=6.0.0 server=6.0.0
rpcdaemon_1   | [INFO] [09-18|15:02:57.473] interfaces compatible                    remote_service=eth_backend client=3.1.0 server=3.1.0
rpcdaemon_1   | [WARN] [09-18|15:02:57.473] [Snapshots] reopen                       err="rpc error: code = Internal desc = runtime error: invalid memory address or nil pointer dereference"
rpcdaemon_1   | [INFO] [09-18|15:02:57.473] interfaces compatible                    remote_service=mining client=1.0.0 server=1.0.0
rpcdaemon_1   | [INFO] [09-18|15:02:57.474] interfaces compatible                    remote_service=tx_pool client=1.0.0 server=1.0.0
rpcdaemon_1   | [INFO] [09-18|15:03:06.124] [Snapshots] Stat                         blocks=15000k indices=15000k alloc=2.1GB sys=2.3GB
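
To pull the warnings and errors out of the INFO lines in output like the logs above, a small filter may help. This is a minimal sketch that assumes the line layout shown here (`service | [LEVEL] [timestamp] message`); the function name `problems` is hypothetical and not part of Erigon or Docker.

```python
import re

# Matches docker-compose output such as:
#   erigon_1      | [EROR] [09-18|15:02:53.527] [1/16 Snapshots] call downloader   err="..."
LINE_RE = re.compile(
    r"^(?P<service>\S+)\s*\|\s*"   # container name, then the compose separator
    r"\[(?P<level>[A-Z]+)\]\s*"    # log level, e.g. INFO, WARN, EROR
    r"\[(?P<ts>[^\]]+)\]\s*"       # timestamp, e.g. 09-18|15:02:53.527
    r"(?P<rest>.*)$"               # message and key=value pairs
)


def problems(lines, levels=("WARN", "EROR")):
    """Return (service, level, timestamp, message) for lines at the given levels."""
    found = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level") in levels:
            found.append(
                (
                    match.group("service"),
                    match.group("level"),
                    match.group("ts"),
                    match.group("rest").strip(),
                )
            )
    return found


if __name__ == "__main__":
    import sys

    # Example: docker-compose logs --no-color | python filter_logs.py
    for service, level, ts, message in problems(sys.stdin):
        print(f"{ts} {service} {level} {message}")
```

Lines that do not match the assumed layout are skipped rather than reported.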

Steps to reproduce the behaviour

Here is my docker-compose file:

version: '2.2'

# Basic erigon's service
x-erigon-service: &default-erigon-service
  image: thorax/erigon:${TAG:-latest}
  pid: service:erigon # Use erigon's PID namespace. It's required to open Erigon's DB from another process (RPCDaemon local-mode)
  volumes_from: [ erigon ]
  restart: unless-stopped
  mem_swappiness: 0
  user: ${DOCKER_UID:-1000}:${DOCKER_GID:-1000}

services:
  erigon:
    image: thorax/erigon:${TAG:-latest}
    build:
      args:
        UID: ${DOCKER_UID:-1000}
        GID: ${DOCKER_GID:-1000}
      context: .
    command: |
      erigon ${ERIGON_FLAGS-} --private.api.addr=0.0.0.0:9090 --http.api=eth,debug,net,trace,web3,erigon,engine
      --sentry.api.addr=sentry:9091 --downloader.api.addr=downloader:9093 --txpool.disable
      --metrics --metrics.addr=0.0.0.0 --metrics.port=6060 --pprof --pprof.addr=0.0.0.0 --pprof.port=6061
      --datadir=/home/erigon/.local/share/erigon --authrpc.jwtsecret=/home/erigon/.local/share/erigon/jwt.hex --authrpc.addr 0.0.0.0
    ports: [ "8551:8551" ]
    volumes:
      # It's ok to mount sub-dirs of "datadir" to different drives
      - ${XDG_DATA_HOME:-~/.local/share}/erigon:/home/erigon/.local/share/erigon
    restart: unless-stopped
    mem_swappiness: 0

  sentry:
    <<: *default-erigon-service
    command: sentry ${SENTRY_FLAGS-} --sentry.api.addr=0.0.0.0:9091 --datadir=/home/erigon/.local/share/erigon
    ports: [ "30303:30303/tcp", "30303:30303/udp" ]

  downloader:
    <<: *default-erigon-service
    command: downloader ${DOWNLOADER_FLAGS-} --downloader.api.addr=0.0.0.0:9093 --datadir=/home/erigon/.local/share/erigon
    ports: [ "42069:42069/tcp", "42069:42069/udp" ]

  txpool:
    <<: *default-erigon-service
    command: txpool ${TXPOOL_FLAGS-} --private.api.addr=erigon:9090 --txpool.api.addr=0.0.0.0:9094 --sentry.api.addr=sentry:9091 --datadir=/home/erigon/.local/share/erigon

  rpcdaemon:
    <<: *default-erigon-service
    command: |
      rpcdaemon ${RPCDAEMON_FLAGS-} --http.addr=0.0.0.0 --http.vhosts=* --http.corsdomain=* --ws
      --private.api.addr=erigon:9090 --txpool.api.addr=txpool:9094 --datadir=/home/erigon/.local/share/erigon
    ports: [ "8545:8545" ]

  prometheus:
    image: prom/prometheus:v2.37.0
    user: ${DOCKER_UID:-1000}:${DOCKER_GID:-1000} # Uses erigon user from Dockerfile
    command: --log.level=warn --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/prometheus --storage.tsdb.retention.time=150d --web.console.libraries=/usr/share/prometheus/console_libraries --web.console.templates=/usr/share/prometheus/consoles
    ports: [ "9090:9090" ]
    volumes:
      - ${ERIGON_PROMETHEUS_CONFIG:-./cmd/prometheus/prometheus.yml}:/etc/prometheus/prometheus.yml
      - ${XDG_DATA_HOME:-~/.local/share}/erigon-prometheus:/prometheus
    restart: unless-stopped

  grafana:
    image: grafana/grafana:9.0.3
    user: "472:0" # required for grafana version >= 7.3
    ports: [ "3000:3000" ]
    volumes:
      - ${ERIGON_GRAFANA_CONFIG:-./cmd/prometheus/grafana.ini}:/etc/grafana/grafana.ini
      - ./cmd/prometheus/datasources:/etc/grafana/provisioning/datasources
      - ./cmd/prometheus/dashboards:/etc/grafana/provisioning/dashboards
      - ${XDG_DATA_HOME:-~/.local/share}/erigon-grafana:/var/lib/grafana
    restart: unless-stopped

mb-gvidon commented 2 years ago

It seems the node has started executing PoS blocks:

erigon_1      | [INFO] [09-18|18:00:58.502] [7/16 Execution] Executed blocks         number=15548996 blk/s=4.2 tx/s=818.4 Mgas/s=63.8 gasState=0.32 estimated duration=2h25m34.696s batch=355.2MB alloc=7.4GB sys=9.3GB

This comment helped: https://github.com/ledgerwatch/erigon/issues/5370#issuecomment-1247829438

But the process is quite slow (63.8 Mgas/s). Is it okay?

However, the downloader still throws the error.
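
As a rough sanity check on the figures in the execution log line above, the two throughput numbers can be compared with each other. The sketch below divides Mgas/s by blk/s to get the average gas per block, which comes out at about 15.2 Mgas. That is close to the 15M gas target of post-London mainnet blocks, so the reported rates are at least self-consistent. The second figure assumes the "estimated duration" is a simple linear extrapolation at the current block rate. That is an assumption, not something this thread confirms about how Erigon computes it.

```python
import re


def parse_duration_seconds(text):
    """Convert a duration such as '2h25m34.696s' to seconds (h, m and s units only)."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:([\d.]+)s)?", text)
    if not match or not text:
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


# Figures copied from the "[7/16 Execution] Executed blocks" log line.
blocks_per_s = 4.2
mgas_per_s = 63.8
remaining_s = parse_duration_seconds("2h25m34.696s")

# Average gas consumed per executed block.
mgas_per_block = mgas_per_s / blocks_per_s

# Blocks left to execute, ASSUMING the estimate is linear in the block rate.
blocks_left = blocks_per_s * remaining_s

print(f"average gas per block: {mgas_per_block:.1f} Mgas")
print(f"blocks implied by the estimate: {blocks_left:,.0f}")
```

Whether 4.2 blk/s counts as slow depends on the disk and CPU, which this thread does not describe.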

victorelec14 commented 2 years ago

I have the same problem; it looks as if erigon does not pick up the parameter correctly.

Docker compose commands:

command:
      - erigon
      - --chain=mainnet
      #- --datadir=/erigon
      - --http
      - --http.addr=0.0.0.0
      - --http.port=8545
      - --http.vhosts=*
      - --http.corsdomain=*
      - --http.api=eth,net,web3,engine,admin,erigon,debug,trace,txpool
      - --private.api.addr=0.0.0.0:9090
      - --authrpc.jwtsecret=/jwt.hex
      - --authrpc.addr=0.0.0.0
      - --authrpc.port=8551
      - --authrpc.vhosts=*
      - --metrics
      - --metrics.addr=0.0.0.0
      - --metrics.port=6060   
      - --ws
      - --nat=any

Response in the VM:


root@erigon:~# curl -X POST http://localhost:8551/  -H "Content-Type: application/json"  --data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'
curl: (56) Recv failure: Connection reset by peer
root@erigon:~# curl -X POST http://localhost:8545/  -H "Content-Type: application/json"  --data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'
curl: (56) Recv failure: Connection reset by peer

thanks!
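
For repeating the `eth_syncing` check above from a script instead of curl, a minimal sketch using only the Python standard library might look like the following. The helper names `rpc_request` and `call` are hypothetical; the URL, method, and payload are the ones used in the curl commands. Note that a "connection reset by peer" happens at the TCP level, so this client would fail the same way until the endpoint itself accepts connections.

```python
import json
import urllib.request


def rpc_request(url, method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 POST request equivalent to the curl call."""
    payload = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": request_id,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(url, method, params=None, timeout=5.0):
    """Send the request and return the decoded JSON response."""
    request = rpc_request(url, method, params)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    # Raises URLError or a connection error if the endpoint is unavailable.
    print(call("http://localhost:8545/", "eth_syncing"))
```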

revitteth commented 2 years ago

At a quick look, this seems to be a port mapping issue. The downloader is started with --downloader.api.addr=0.0.0.0:9093, but the ports specified on the downloader container do not include 9093. My guess is that 0.0.0.0:9093 is therefore closed, which would explain the "connection reset by peer".
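
One way to test this hypothesis is to probe the port directly. In Docker Compose, `ports:` only publishes a port to the host. Containers on the same Compose network can reach each other by service name on any port the process is actually listening on. A "connection refused" therefore usually means nothing was listening at that address when the call was made. The sketch below is a hypothetical helper (`port_state` is not part of Erigon or Docker) that separates the cases. It assumes it is run from a container on the same Compose network that has Python available.

```python
import socket


def port_state(host, port, timeout=2.0):
    """Classify a TCP endpoint as 'open', 'refused', or 'unreachable'."""
    try:
        # A completed TCP handshake means some process is listening there.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no process is listening on that port.
        return "refused"
    except OSError:
        # Covers timeouts, DNS failures, and routing errors.
        return "unreachable"


if __name__ == "__main__":
    # Service name and port taken from the docker-compose file in this issue.
    print(port_state("downloader", 9093))
```

If this reports "open" from another container while the host cannot connect, the listener is fine and only host publishing is missing, which erigon itself does not need in order to reach the downloader.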

dreadedhamish commented 1 year ago

@revitteth What was the resolution? As of yesterday the erigon-downloader still doesn't have 9093 open.

dreadedhamish commented 1 year ago

Actually, even though the port isn't explicitly opened and doesn't appear open from within docker exec, the downloader worked just fine once I had resolved other issues.