erigontech / erigon

Ethereum implementation on the efficiency frontier https://erigon.gitbook.io
GNU Lesser General Public License v3.0

Erigon is killed by oom several times a day #11315

Open aglumov opened 3 months ago

aglumov commented 3 months ago

System information

Erigon version: 2.58.1

OS & Version: Ubuntu 20.04.6

Hardware details: 32 GB RAM, 8 vCPUs (Intel Xeon Gold 6240R CPU @ 2.40GHz), 5 TB SSD

Commit hash: f12e451

Erigon Command (with flags/config):

/data1/erigon/erigon-new/erigon --datadir="/data1/erigon/eth-data/erigon" --chain=mainnet --port=30333 --http.port=8555 --http.addr=0.0.0.0 --torrent.port=42069 --private.api.addr=127.0.0.1:9090 --http --ws --http.api=eth,debug,net,trace,web3,erigon --private.api.ratelimit=1024 --rpc.batch.concurrency=50 --rpc.batch.limit=10000 --pprof --authrpc.port=8551 --authrpc.addr=0.0.0.0  --authrpc.jwtsecret=/data1/prysm/jwt.hex --internalcl --db.size.limit=8TB

I start Erigon using a systemd unit:

[Unit]
After=network-online.target
Wants=network-online.target

[Service]
WorkingDirectory=/data1/erigon
User=ethereum
ExecStart=/data1/erigon/erigon-new/erigon --datadir="/data1/erigon/eth-data/erigon" --chain=mainnet --port=30333 --http.port=8555 --http.addr=0.0.0.0 --torrent.port=42069 --private.api.addr>
Restart=always
RestartSec=5s

[Install]
WantedBy=multi-user.target

Consensus Layer: Caplin

Consensus Layer Command (with flags/config): --internalcl

Chain/Network: Mainnet

Expected behaviour

Erigon is not killed by the OOM killer.

Actual behaviour

Several times a day Erigon is killed by the OOM killer:

Jul 21 17:33:32 pb2-eth-node kernel: Out of memory: Killed process 594748 (erigon) total-vm:13111818316kB, anon-rss:21907580kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:7717348kB oom_score_adj:0
Jul 21 22:42:09 pb2-eth-node kernel: Out of memory: Killed process 595026 (erigon) total-vm:13112313816kB, anon-rss:22187200kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:6858340kB oom_score_adj:0
Jul 22 02:47:16 pb2-eth-node kernel: Out of memory: Killed process 595400 (erigon) total-vm:13111874016kB, anon-rss:21992692kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:6515312kB oom_score_adj:0
Jul 22 07:52:21 pb2-eth-node kernel: Out of memory: Killed process 595703 (erigon) total-vm:13112835560kB, anon-rss:21884692kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:7800276kB oom_score_adj:0
Jul 23 16:18:31 pb2-eth-node kernel: Out of memory: Killed process 599212 (erigon) total-vm:13113593832kB, anon-rss:21997328kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:7561968kB oom_score_adj:0
Jul 23 21:33:43 pb2-eth-node kernel: Out of memory: Killed process 599504 (erigon) total-vm:13114050032kB, anon-rss:22396812kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:6683428kB oom_score_adj:0
Jul 24 03:19:17 pb2-eth-node kernel: Out of memory: Killed process 599955 (erigon) total-vm:13113321764kB, anon-rss:22177924kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:6763196kB oom_score_adj:0
Jul 24 06:56:31 pb2-eth-node kernel: Out of memory: Killed process 600398 (erigon) total-vm:13112147296kB, anon-rss:21896528kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:7619416kB oom_score_adj:0
Jul 24 10:11:01 pb2-eth-node kernel: Out of memory: Killed process 600702 (erigon) total-vm:13112162736kB, anon-rss:21734744kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:7571756kB oom_score_adj:0
Jul 24 15:44:19 pb2-eth-node kernel: Out of memory: Killed process 600918 (erigon) total-vm:13113219868kB, anon-rss:22290816kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:7742184kB oom_score_adj:0
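
A rough way to read these lines: each memory field is reported in kB, so adding anon-rss and pgtables shows how much of the host's RAM the process held when it was killed. The following is a minimal Python sketch, assuming the log format shown above. The names LOG, parse_oom_line and resident_gib are illustrative only (not part of Erigon or the kernel), and the kB values are assumed to be units of 1024 bytes.

```python
import re

# Two lines copied verbatim from the kernel log above.
LOG = """\
Jul 21 17:33:32 pb2-eth-node kernel: Out of memory: Killed process 594748 (erigon) total-vm:13111818316kB, anon-rss:21907580kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:7717348kB oom_score_adj:0
Jul 21 22:42:09 pb2-eth-node kernel: Out of memory: Killed process 595026 (erigon) total-vm:13112313816kB, anon-rss:22187200kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:6858340kB oom_score_adj:0
"""

# Matches the "name:<number>kB" memory fields of an OOM-killer line.
FIELD = re.compile(r"(total-vm|anon-rss|file-rss|shmem-rss|pgtables):(\d+)kB")

# Assumption: the kernel's "kB" means 1024 bytes.
KIB_PER_GIB = 1024 * 1024


def parse_oom_line(line):
    """Return the memory fields of one OOM-killer line, in kB as logged."""
    return {name: int(value) for name, value in FIELD.findall(line)}


def resident_gib(fields):
    """Anonymous RSS plus page tables, converted from kB to GiB."""
    return (fields["anon-rss"] + fields["pgtables"]) / KIB_PER_GIB


for line in LOG.splitlines():
    fields = parse_oom_line(line)
    print(
        f"anon-rss {fields['anon-rss'] / KIB_PER_GIB:.1f} GiB, "
        f"pgtables {fields['pgtables'] / KIB_PER_GIB:.1f} GiB, "
        f"sum {resident_gib(fields):.1f} GiB"
    )
```

For the first line this comes to roughly 21 GiB of anonymous memory plus about 7 GiB of page tables, which together account for most of the 32 GB on this host.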

Memory consumption from monitoring: [image]

Steps to reproduce the behaviour

Run Erigon as a systemd unit.

Backtrace

[backtrace]

AskAlexSharov commented 3 months ago

Try upgrading.

yperbasis commented 3 months ago

Please reopen if this happens with v2.60.3 or later.

aglumov commented 3 months ago

I've updated Erigon to version 2.60.3 (erigon version 2.60.3), but it is still killed by the OOM killer:

Jul 30 04:22:37 pb2-eth-node kernel: Out of memory: Killed process 612810 (erigon) total-vm:15281302600kB, anon-rss:21808336kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:8518816kB oom_score_adj:0
Jul 30 12:26:37 pb2-eth-node kernel: Out of memory: Killed process 613515 (erigon) total-vm:15280237428kB, anon-rss:21436236kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:8420324kB oom_score_adj:0
Jul 30 19:50:02 pb2-eth-node kernel: Out of memory: Killed process 614511 (erigon) total-vm:15280148940kB, anon-rss:21587412kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:8433700kB oom_score_adj:0
Jul 31 02:55:26 pb2-eth-node kernel: Out of memory: Killed process 614956 (erigon) total-vm:15282465020kB, anon-rss:21963860kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:8331140kB oom_score_adj:0
Jul 31 09:04:39 pb2-eth-node kernel: Out of memory: Killed process 615357 (erigon) total-vm:15280681452kB, anon-rss:21621124kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:8325604kB oom_score_adj:0
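
To put a number on "several times a day", the gaps between consecutive kills can be computed from the timestamps. This is a small Python sketch under stated assumptions: the kernel log lines carry no year, so 2024 is assumed purely to make the strings parseable, and KILLS and hours_between are illustrative names.

```python
from datetime import datetime

# Timestamps copied from the five log lines above.
KILLS = [
    "Jul 30 04:22:37",
    "Jul 30 12:26:37",
    "Jul 30 19:50:02",
    "Jul 31 02:55:26",
    "Jul 31 09:04:39",
]


def hours_between(stamps, year=2024):
    """Return the gap in hours between each pair of consecutive timestamps.

    The year is an assumption; the log format does not include one.
    """
    times = [datetime.strptime(f"{year} {s}", "%Y %b %d %H:%M:%S") for s in stamps]
    return [
        (later - earlier).total_seconds() / 3600
        for earlier, later in zip(times, times[1:])
    ]


for gap in hours_between(KILLS):
    print(f"{gap:.1f} h")
```

On these five entries every gap falls between 6 and 8.5 hours, so the process is killed about three times a day.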

Memory usage for 30-31 July: [image]

kate3941 commented 2 months ago

@yperbasis @AskAlexSharov could you recommend any solution please?

aglumov commented 1 month ago

I've updated Erigon to the latest version, 2.60.7. The problem persists.

Sep 14 09:40:45 pb2-eth-node kernel: Out of memory: Killed process 989878 (erigon) total-vm:15288657052kB, anon-rss:21819476kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:9015920kB oom_score_adj:0
Sep 14 14:43:42 pb2-eth-node kernel: Out of memory: Killed process 993603 (erigon) total-vm:15285948168kB, anon-rss:22674420kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:7251196kB oom_score_adj:0
Sep 14 18:15:13 pb2-eth-node kernel: Out of memory: Killed process 993823 (erigon) total-vm:15286800316kB, anon-rss:22494120kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:8297756kB oom_score_adj:0
Sep 15 00:35:16 pb2-eth-node kernel: Out of memory: Killed process 994103 (erigon) total-vm:15286080432kB, anon-rss:22314488kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:8566776kB oom_score_adj:0
Sep 16 00:48:25 pb2-eth-node kernel: Out of memory: Killed process 994484 (erigon) total-vm:15286059144kB, anon-rss:21832744kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:8961940kB oom_score_adj:0

image

Any ideas?