erigontech / erigon

Ethereum implementation on the efficiency frontier https://erigon.gitbook.io
GNU Lesser General Public License v3.0

OtterSync Not Downloading Header-Chain Looks like Stuck #11578

Closed munsayac13 closed 2 weeks ago

munsayac13 commented 1 month ago

System information

Erigon version: ./erigon --version erigon version 3.00.0-alpha2-6124a58f

OS & Version: Windows/Linux/OSX Linux in Kubernetes

Commit hash:

Erigon Command (with flags/config): spec: containers:

Consensus Layer:

Consensus Layer Command (with flags/config): "--internalcl=false"

Chain/Network: mainnet

Expected behaviour

I expect the header-chain and snapshots to start downloading.

Actual behaviour

Erigon keeps waiting for torrent metadata, and the logs show the following:

[INFO] [08-12|20:53:56.481] [OtterSync] Starting Ottersync
[INFO] [08-12|20:53:56.481] [1/6 OtterSync] Waiting for torrents metadata: 0/134
[INFO] [08-12|20:53:56.482] [1/6 OtterSync] downloading header-chain progress="0.00% 0B/0B" time-left=999hrs:99m total-time=20s download=0B/s flush=0B/s hash=0B/s complete=0B/s upload=0B/s peers=0 files=134 metadata=0/134 connections=0 alloc=53.1MB sys=91.0MB
[INFO] [08-12|20:54:16.482] [1/6 OtterSync] Waiting for torrents metadata: 0/134
[INFO] [08-12|20:54:16.482] [1/6 OtterSync] downloading header-chain progress="0.00% 0B/0B" time-left=999hrs:99m total-time=40s download=0B/s flush=0B/s hash=0B/s complete=0B/s upload=0B/s peers=0 files=134 metadata=0/134 connections=0 alloc=37.4MB sys=100.9MB
[INFO] [08-12|20:54:36.655] [1/6 OtterSync] Waiting for torrents metadata: 0/134
[INFO] [08-12|20:54:36.655] [1/6 OtterSync] downloading header-chain progress="0.00% 0B/0B" time-left=999hrs:99m total-time=1m0s download=0B/s flush=0B/s hash=0B/s complete=0B/s upload=0B/s peers=0 files=134 metadata=0/134 connections=0 alloc=31.8MB sys=102.2MB
[INFO] [08-12|20:54:56.481] [1/6 OtterSync] Waiting for torrents metadata: 0/134
[INFO] [08-12|20:54:56.482] [1/6 OtterSync] downloading header-chain progress="0.00% 0B/0B" time-left=999hrs:99m total-time=1m20s download=0B/s flush=0B/s hash=0B/s complete=0B/s upload=0B/s peers=0 files=134 metadata=0/134 connections=0 alloc=32.0MB sys=102.3MB
[INFO] [08-12|20:55:16.482] [1/6 OtterSync] Waiting for torrents metadata: 0/134
[INFO] [08-12|20:55:16.482] [1/6 OtterSync] downloading header-chain progress="0.00% 0B/0B" time-left=999hrs:99m total-time=1m40s download=0B/s flush=0B/s hash=0B/s complete=0B/s upload=0B/s peers=0 files=134 metadata=0/134 connections=0 alloc=48.7MB sys=102.6MB
[INFO] [08-12|20:55:36.482] [1/6 OtterSync] Waiting for torrents metadata: 0/134
[INFO] [08-12|20:55:36.482] [1/6 OtterSync] downloading header-chain progress="0.00% 0B/0B" time-left=999hrs:99m total-time=2m0s download=0B/s flush=0B/s hash=0B/s complete=0B/s upload=0B/s peers=0 files=134 metadata=0/134 connections=0 alloc=50.2MB sys=103.1MB
[INFO] [08-12|20:55:56.482] [1/6 OtterSync] Waiting for torrents metadata: 0/134
[INFO] [08-12|20:55:56.482] [1/6 OtterSync] downloading header-chain progress="0.00% 0B/0B" time-left=999hrs:99m total-time=2m20s download=0B/s flush=0B/s hash=0B/s complete=0B/s upload=0B/s peers=0 files=134 metadata=0/134 connections=0 alloc=58.5MB sys=103.3MB
[INFO] [08-12|20:56:16.481] [1/6 OtterSync] Waiting for torrents metadata: 0/134
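
For anyone who wants to spot this state in a long log automatically, here is a minimal sketch that splits one of the progress lines above into its key=value fields and flags the "nothing is happening" pattern. The helper names and the stall rule (zero peers, zero connections, no metadata fetched) are my own assumptions for illustration; this is not how Erigon itself decides anything.

```python
import re

# Matches key=value pairs; a value may be double-quoted, e.g. progress="0.00% 0B/0B".
PAIR = re.compile(r'(\w[\w.-]*)=("[^"]*"|\S+)')


def parse_fields(line: str) -> dict:
    """Return the key=value fields of one log line as a dict of strings."""
    return {key: value.strip('"') for key, value in PAIR.findall(line)}


def looks_stalled(line: str) -> bool:
    """Hypothetical heuristic: no peers, no connections, no metadata fetched."""
    fields = parse_fields(line)
    fetched, _, total = fields.get("metadata", "0/0").partition("/")
    return (
        fields.get("peers") == "0"
        and fields.get("connections") == "0"
        and fetched == "0"
        and total not in ("", "0")
    )
```

Feeding it any of the "downloading header-chain" lines above should flag the line, since they all report peers=0, connections=0 and metadata=0/134.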

Steps to reproduce the behaviour

Use the Erigon flags specified above.

Backtrace

[backtrace]

Giulio2002 commented 1 month ago

Is your network configured correctly? Can you run with --log.console.verbosity=4?

munsayac13 commented 1 month ago

Yeah, I was just using 2.60.2 and sync works fine there. We just wanted to use the 3.0.0 image. Here is what I'm seeing after raising the verbosity:

[INFO] [08-14|22:13:59.578] [1/6 OtterSync] Waiting for torrents metadata: 0/134
[INFO] [08-14|22:13:59.578] [1/6 OtterSync] downloading header-chain progress="0.00% 0B/0B" time-left=999hrs:99m total-time=1m0s download=0B/s flush=0B/s hash=0B/s complete=0B/s upload=0B/s peers=0 files=134 metadata=0/134 connections=0 alloc=57.9MB sys=106.6MB
[DBUG] [08-14|22:14:00.330] [txpool.fetch] Handling incoming message reqID=POOLED_TRANSACTIONS_66 err="txn rlp too big"
[DBUG] [08-14|22:14:00.352] [txpool.fetch] Handling incoming message reqID=POOLED_TRANSACTIONS_66 err="txn rlp too big"
[DBUG] [08-14|22:14:05.303] [txpool.fetch] Handling incoming message reqID=POOLED_TRANSACTIONS_66 err="txn rlp too big"
[DBUG] [08-14|22:14:07.916] [txpool] Commit written_kb=0 in=2.255024ms
[DBUG] [08-14|22:14:09.444] [txpool.fetch] Handling incoming message reqID=POOLED_TRANSACTIONS_66 err="txn rlp too big"
[DBUG] [08-14|22:14:14.383] [txpool.fetch] Handling incoming message reqID=POOLED_TRANSACTIONS_66 err="txn rlp too big"
[DBUG] [08-14|22:14:18.947] [txpool.fetch] Handling incoming message reqID=POOLED_TRANSACTIONS_66 err="txn rlp too big"
[INFO] [08-14|22:14:19.578] [1/6 OtterSync] Waiting for torrents metadata: 0/134
[INFO] [08-14|22:14:19.579] [1/6 OtterSync] downloading header-chain progress="0.00% 0B/0B" time-left=999hrs:99m total-time=1m20s download=0B/s flush=0B/s hash=0B/s complete=0B/s upload=0B/s peers=0 files=134 metadata=0/134 connections=0 alloc=33.2MB sys=111.9MB
[DBUG] [08-14|22:14:23.625] [txpool] Commit written_kb=0 in=2.110401ms
[DBUG] [08-14|22:14:27.469] [txpool.fetch] Handling incoming message reqID=POOLED_TRANSACTIONS_66 err="txn rlp too big"

Giulio2002 commented 1 month ago

Can you show more logs (roughly the first 1000 lines from startup)? This does not tell me much. Maybe you can give me a pastebin? At first glance, it looks like some ports are not being exposed to the public from the cluster.
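
One rough way to test the port theory from a machine outside the cluster is a plain TCP connect probe. This is only a sketch: the function name is made up for illustration, and the host and port you pass in are whatever your own service exposes. It says nothing about UDP reachability, which a connect probe cannot confirm.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling it as tcp_port_open("<node-external-ip>", <torrent-port>) from outside the cluster returns False if the connection is refused or times out, which would be consistent with the port not being exposed; both arguments here are placeholders.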

munsayac13 commented 4 weeks ago

It appears that the issue is not with Erigon but with the downloader: it neither downloads nor seeds anything. Also, I don't believe it has to do with our network. Like I mentioned, we are still using 2.60.2; in fact, I just upgraded our apps to 2.60.6 and it's working fine. What I have used for 3.0.0-alpha2 is pretty much the same as what I have for 2.60.6.

Here is my StatefulSet config for the downloader:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: erigon-downloader
  namespace: default
spec:
.......
  template:
......
    spec:
      containers:
      - command:
        - downloader
        - --datadir=/home/erigon/.local/share/erigon3
        - --pprof
        - --metrics
        - --pprof.addr=0.0.0.0
        - --metrics.addr=0.0.0.0
        - --metrics.port=6063
        - --pprof.port=16063
        - --torrent.port=42068
        - --torrent.download.slots=16
        - --torrent.download.rate=500mb
        - --torrent.upload.rate=50mb
        - --chain=mainnet
        - --log.console.verbosity=4
        image: thorax/erigon:v3.0.0-alpha2
        imagePullPolicy: Always
        name: erigon-downloader
        ports:
        - containerPort: 42068
          name: torrent
          protocol: UDP
        resources:
          limits:
            cpu: 2048m
            memory: 8Gi
          requests:
            cpu: 64m
            memory: 512Mi
        securityContext:
          allowPrivilegeEscalation: true
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /home/erigon/.local/share/erigon3
          name: data
      dnsPolicy: ClusterFirst
      initContainers:
      - command:
        - sh
        - -c
        - chown 1000 -R /home/erigon/.local/share/erigon3
        image: busybox
        imagePullPolicy: Always
        name: chown-datadir
        resources: {}
        securityContext:
          capabilities:
            add:
            - CHOWN
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /home/erigon/.local/share/erigon3
          name: data

And here are the logs for the downloader:

[INFO] [08-15|19:32:36.821] logging to file system                   log dir=/home/erigon/.local/share/erigon3/logs file prefix=downloader log level=info json=false
[INFO] [08-15|19:32:36.821] Enabling metrics export to prometheus    path=http://0.0.0.0:6063/debug/metrics/prometheus
[INFO] [08-15|19:32:36.821] Starting pprof server                    cpu="go tool pprof -lines -http=: http://0.0.0.0:16063/debug/pprof/profile?seconds=20" heap="go tool pprof -lines -http=: http://0.0.0.0:16063/debug/pprof/heap"
[INFO] [08-15|19:32:36.821] Build info                               git_branch=heads/v3.0.0-alpha2 git_tag=v3.0.0-alpha2-dirty git_commit=6124a58f7f6560641e25181e05e74b3bbfdaa95a
[INFO] [08-15|19:32:36.828] [db] open                                label=chaindata sizeLimit=4882812504KB pageSize=8192
[INFO] [08-15|19:32:36.829] [snapshots] cli flags                    chain=mainnet addr=127.0.0.1:9093 datadir=/home/erigon/.local/share/erigon3 ipv6-enabled=true ipv4-enabled=true download.rate=500MB upload.rate=50MB webseed=
[DBUG] [08-15|19:32:36.858] [db] open                                label=downloader sizeLimit=16GB pageSize=4096
[INFO] [08-15|19:32:36.865] [snapshots] Start bittorrent server      my_peer_id=2d4754303030332d86a823c9ff45c119cfff7685
[INFO] [08-15|19:32:36.865] Started gRPC server                      on=127.0.0.1:9093
[INFO] [08-15|19:32:56.866] [snapshots] Downloading                  progress="0.00% 0B/0B" downloading=0 download=0B/s upload=0B/s peers=0 conns=0 files=0 alloc=18.7MB sys=36.1MB
[INFO] [08-15|19:33:16.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=21.8MB sys=36.3MB
[INFO] [08-15|19:33:36.865] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=22.4MB sys=36.3MB
[INFO] [08-15|19:33:56.865] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=22.9MB sys=36.3MB
[INFO] [08-15|19:34:16.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=18.6MB sys=45.3MB
[INFO] [08-15|19:34:36.865] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=19.1MB sys=45.3MB
[INFO] [08-15|19:34:56.865] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=19.6MB sys=45.3MB
[INFO] [08-15|19:35:16.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=22.5MB sys=45.3MB
[INFO] [08-15|19:35:36.827] [mem] memory stats                       Rss=271.2MB Size=0B Pss=271.2MB SharedClean=4.0KB SharedDirty=0B PrivateClean=27.7MB PrivateDirty=243.5MB Referenced=271.2MB Anonymous=243.5MB Swap=0B alloc=23.1MB sys=45.3MB
[INFO] [08-15|19:35:36.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=23.1MB sys=45.3MB
[INFO] [08-15|19:35:56.865] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=23.6MB sys=45.3MB
[INFO] [08-15|19:36:16.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=18.6MB sys=45.8MB
[INFO] [08-15|19:36:36.865] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=19.1MB sys=45.8MB
[INFO] [08-15|19:36:56.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=19.6MB sys=45.8MB
[INFO] [08-15|19:37:16.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=21.8MB sys=45.8MB
[INFO] [08-15|19:37:36.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=22.3MB sys=45.8MB
[INFO] [08-15|19:37:56.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=22.8MB sys=45.8MB
[INFO] [08-15|19:38:16.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=18.4MB sys=46.1MB
[INFO] [08-15|19:38:36.828] [mem] memory stats                       Rss=272.2MB Size=0B Pss=272.2MB SharedClean=4.0KB SharedDirty=0B PrivateClean=27.7MB PrivateDirty=244.5MB Referenced=272.2MB Anonymous=244.5MB Swap=0B alloc=19.0MB sys=46.1MB
[INFO] [08-15|19:38:36.865] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=19.0MB sys=46.1MB
[INFO] [08-15|19:38:56.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=19.5MB sys=46.1MB
[INFO] [08-15|19:39:16.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=23.3MB sys=50.1MB
[INFO] [08-15|19:39:36.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=23.8MB sys=50.1MB
[INFO] [08-15|19:39:56.866] [snapshots] Seeding                      up=0B/s peers=0 conns=0 files=0 alloc=24.4MB sys=50.1MB

Giulio2002 commented 4 weeks ago

Ooooh, can you try running Erigon with the downloader embedded? I think that might be part of the issue; maybe we accidentally lost support for the separate downloader.

munsayac13 commented 4 weeks ago

Same issue. I have Erigon and the downloader under a new StatefulSet:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: erigon
  namespace: default
spec:
.......
  template:
......
    spec:
      containers:
      - command:
        - erigon
        - --http=false
        - --private.api.addr=0.0.0.0:9090
        - --authrpc.addr=0.0.0.0
        - --authrpc.port=8551
        - --authrpc.vhosts=*
        #- --txpool.api.addr=127.0.0.1:9094
        #- --sentry.api.addr=127.0.0.1:9091
        - --downloader.api.addr=127.0.0.1:9093
        - --chain=mainnet
        - --datadir=/home/erigon/.local/share/erigon3
        - --pprof
        - --metrics
        - --pprof.addr=0.0.0.0
        - --metrics.addr=0.0.0.0
        - --metrics.port=6060
        - --pprof.port=16060
        - --db.size.limit=5000000000000
        #- --log.console.verbosity=4
        - --externalcl
        image: thorax/erigon:v3.0.0-alpha2
        imagePullPolicy: Always
        name: erigon-core
        ports:
        - containerPort: 9090
          name: private
          protocol: TCP
        - containerPort: 8551
          name: engine
          protocol: TCP
        resources:
          limits:
            cpu: "4"
            memory: 40Gi
          requests:
            cpu: "4"
            memory: 8Gi
        securityContext:
          allowPrivilegeEscalation: true
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /home/erigon/.local/share/erigon3
          name: data
      - command:
        - downloader
        - --datadir=/home/erigon/.local/share/erigon3
        #- --pprof
        #- --metrics
        #- --pprof.addr=0.0.0.0
        #- --metrics.addr=0.0.0.0
        #- --metrics.port=6063
        #- --pprof.port=16063
        - --torrent.port=42068
        - --torrent.download.slots=16
        - --torrent.download.rate=500mb
        - --torrent.upload.rate=50mb
        - --chain=mainnet
        #- --log.console.verbosity=4
        image: thorax/erigon:v3.0.0-alpha2
        imagePullPolicy: Always
        name: erigon-downloader
        ports:
        - containerPort: 42068
          name: torrent
          protocol: UDP
        resources:
          limits:
            cpu: 2048m
            memory: 8Gi
          requests:
            cpu: 64m
            memory: 512Mi
        securityContext:
          allowPrivilegeEscalation: true
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /home/erigon/.local/share/erigon3
          name: data
      dnsPolicy: ClusterFirst
      initContainers:
      - command:
        - sh
        - -c
        - chown 1000 -R /home/erigon/.local/share/erigon3
        image: busybox
        imagePullPolicy: Always
        name: chown-datadir
        resources: {}
        securityContext:
          capabilities:
            add:
            - CHOWN
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /home/erigon/.local/share/erigon3
          name: data

Giulio2002 commented 4 weeks ago

I meant: do NOT run the downloader as a separate command.

munsayac13 commented 4 weeks ago

Thanks. Looks like it's working without the separate downloader. Did Erigon stop supporting a separate downloader? If not, how can I get it to work the way we run it in our system?

[DBUG] [08-15|20:58:53.064] [snapshots] bittorrent peers             file=idx/v1-logaddrs.1216-1280.ef
[DBUG] [08-15|20:58:53.064] [snapshots] progress                     file=v1-020430-020440-transactions.seg progress=75.30% peers=0 webseeds=1
[DBUG] [08-15|20:58:53.065] [snapshots] webseed peers                file=v1-020430-020440-transactions.seg erigon3-v1-snapshots-mainnet.erigon.network=1.4MB/s
[DBUG] [08-15|20:58:53.065] [snapshots] bittorrent peers             file=v1-020430-020440-transactions.seg
[DBUG] [08-15|20:58:53.065] [snapshots] progress                     file=domain/v1-storage.1600-1602.kv progress=1.29% peers=0 webseeds=1
[DBUG] [08-15|20:58:53.065] [snapshots] webseed peers                file=domain/v1-storage.1600-1602.kv erigon3-v1-snapshots-mainnet.erigon.network/domain=13.1KB/s
[DBUG] [08-15|20:58:53.065] [snapshots] bittorrent peers             file=domain/v1-storage.1600-1602.kv
[DBUG] [08-15|20:58:53.066] [snapshots] progress                     file=history/v1-accounts.1472-1536.v progress=0.06% peers=0 webseeds=3
[DBUG] [08-15|20:58:53.066] [snapshots] webseed peers                file=history/v1-accounts.1472-1536.v erigon3-v1-snapshots-mainnet.erigon.network/history=13.3KB/s erigon3-v3-snapshots-mainnet.erigon.network/history=13.3KB/s erigon3-v3-snapshots-mainnet.erigon.network/v2/history=13.4KB/s
[DBUG] [08-15|20:58:53.066] [snapshots] bittorrent peers             file=history/v1-accounts.1472-1536.v
[DBUG] [08-15|20:58:53.068] [snapshots] progress                     file=v1-003500-004000-transactions.seg progress=75.42% peers=0 webseeds=5
[DBUG] [08-15|20:58:53.068] [snapshots] webseed peers                file=v1-003500-004000-transactions.seg erigon3-v3-snapshots-mainnet.erigon.network/v2=1.5MB/s erigon2-v1-snapshots-mainnet.erigon.network=1.5MB/s erigon2-v3-snapshots-mainnet.erigon.network=1.4MB/s erigon3-v1-snapshots-mainnet.erigon.network=2.6MB/s erigon3-v3-snapshots-mainnet.erigon.network=1.6MB/s
[DBUG] [08-15|20:58:53.068] [snapshots] bittorrent peers             file=v1-003500-004000-transactions.seg
[DBUG] [08-15|20:58:53.069] [snapshots] progress                     file=history/v1-code.1088-1152.v progress=0.49% peers=0 webseeds=3
[DBUG] [08-15|20:58:53.069] [snapshots] webseed peers                file=history/v1-code.1088-1152.v erigon3-v1-snapshots-mainnet.erigon.network/history=0B/s erigon3-v3-snapshots-mainnet.erigon.network/history=65B/s erigon3-v3-snapshots-mainnet.erigon.network/v2/history=0B/s
[DBUG] [08-15|20:58:53.069] [snapshots] bittorrent peers             file=history/v1-code.1088-1152.v
[DBUG] [08-15|20:58:53.074] [snapshots] progress                     file=v1-019900-020000-transactions.seg progress=0.09% peers=0 webseeds=2
[DBUG] [08-15|20:58:53.074] [snapshots] webseed peers                file=v1-019900-020000-transactions.seg erigon3-v1-snapshots-mainnet.erigon.network=11.4KB/s erigon3-v3-snapshots-mainnet.erigon.network/v2=11.6KB/s
[DBUG] [08-15|20:58:53.074] [snapshots] bittorrent peers             file=v1-019900-020000-transactions.seg

Giulio2002 commented 4 weeks ago

No, it's just that we introduced OtterSync and that might have disrupted the flow :D. I will create another ticket tomorrow to better track this issue.

munsayac13 commented 4 weeks ago

Since version 3 is still at alpha2, perhaps it will be fixed in the future? Or deprecated?

AskAlexSharov commented 4 weeks ago

I don't see it being stuck: file=v1-003500-004000-transactions.seg erigon3-v3-snapshots-mainnet.erigon.network/v2=1.5MB/s shows that it is downloading files.
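
To check this across many lines rather than by eye, a small sketch like the one below can total the per-webseed rates on each "[snapshots] webseed peers" debug line. The function name is hypothetical, and treating KB/MB/GB as binary multiples is an assumption about how the rates are printed.

```python
import re

# A webseed entry looks like "<host/path>=<number><unit>/s",
# e.g. "erigon3-v3-snapshots-mainnet.erigon.network/v2=1.5MB/s".
RATE = re.compile(r'(\S+)=([\d.]+)(B|KB|MB|GB)/s')

# Assumption: units are binary multiples (1 KB = 1024 B).
UNIT = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def webseed_rate(line: str) -> float:
    """Sum the webseed rates (in bytes per second) reported on one log line."""
    return sum(float(num) * UNIT[unit] for _, num, unit in RATE.findall(line))
```

A non-zero total for a file means at least one webseed is delivering data for it, even when the line reports peers=0 for BitTorrent.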