bLd75 opened this issue 9 months ago (status: Open)
It is resolved, right?
@ashutoshvarma we have an ongoing OOM case on the latest Astar version. @paradox-tt will provide details here.
We're still uplifting & catching up to the latest version.
But please do provide command, environment & logs if you have them.
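If it helps while gathering those, here is a minimal way to capture the relevant logs, assuming the node runs as a systemd unit (the unit name astar.service is an assumption):

journalctl -u astar.service --since "1 hour ago" -o short-iso > astar-warp.log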
Hey team,
Here are my flags, with the public address hidden:
ExecStart=/usr/local/bin/astar-collator \
--validator \
--rpc-cors all \
--name Dox-Astar-01 \
--execution wasm \
--state-cache-size 1 \
--chain astar \
--public-addr=/ip4/x.x.x.x/tcp/30330 \
--listen-addr=/ip4/172.19.12.15/tcp/30330 \
--bootnodes /ip4/20.93.150.146/tcp/30330/p2p/12D3KooWKZwcaofXPmXWHSSfnh34VFJ8zSRJScnNu9UA75x8kNXi \
--allow-private-ipv4 \
--discover-local \
--rpc-port=9110 \
--prometheus-external \
--prometheus-port=9702 \
--rpc-methods=Unsafe \
# --sync=warp \
--blocks-pruning=1000 \
--state-pruning=1000 \
--telemetry-url 'wss://telemetry-backend.w3f.community/submit/ 1' \
--telemetry-url 'wss://telemetry.polkadot.io/submit/ 1' \
# --relay-chain-rpc-urls "wss://rpc.ibp.network/polkadot" \
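Two of the flags above are stale on recent clients, judging by the deprecation warnings in the startup log below; a hedged cleanup sketch (whether the old byte-denominated value carries over unchanged to the new flag is an assumption on my side):

--trie-cache-size 1 \    # replaces the deprecated --state-cache-size
# --execution wasm       # drop: the CLI warns it has no effect anymore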
There's no error in the logs, except that warping continues until the server runs out of memory or the instance reboots:
Jun 12 11:23:56 doxastar astar-collator[52243]: 2024-06-12 11:23:56 [Parachain] ⏩ Warping, Downloading state, 406.43 Mib (22 peers), best: #0 (0x9eb7…29c6), finalized #0 (0x9eb7…29c6), ⬇ 0.7kiB/s ⬆ 0.4kiB/s
Jun 12 11:24:00 doxastar astar-collator[52243]: 2024-06-12 11:24:00 [Relaychain] ✨ Imported #21184086 (0x5200…14a5)
Jun 12 11:24:01 doxastar astar-collator[52243]: 2024-06-12 11:24:01 [Relaychain] 💤 Idle (15 peers), best: #21184086 (0x5200…14a5), finalized #21184083 (0xb341…3109), ⬇ 145.2kiB/s ⬆ 192.6kiB/s
Jun 12 11:24:02 doxastar astar-collator[52243]: 2024-06-12 11:24:01 [Parachain] ⏩ Warping, Downloading state, 409.57 Mib (22 peers), best: #0 (0x9eb7…29c6), finalized #0 (0x9eb7…29c6), ⬇ 272.5kiB/s ⬆ 0.9kiB/s
Jun 12 11:24:06 doxastar astar-collator[52243]: 2024-06-12 11:24:06 [Relaychain] ✨ Imported #21184087 (0x6345…6470)
-- Boot 5e8c89c8388a471daea298612802f1e0 --
Jun 12 11:27:01 doxastar systemd[1]: Started Astar Node.
Jun 12 11:27:01 doxastar astar-collator[738]: `--state-cache-size` was deprecated. Please switch to `--trie-cache-size`.
Jun 12 11:27:01 doxastar astar-collator[738]: CLI parameter `--execution` has no effect anymore and will be removed in the future!
Jun 12 11:27:01 doxastar astar-collator[738]: 2024-06-12 11:27:01 Astar Collator
Jun 12 11:27:01 doxastar astar-collator[738]: 2024-06-12 11:27:01 ✌️ version 5.39.1-111d18fbfba
Jun 12 11:27:01 doxastar astar-collator[738]: 2024-06-12 11:27:01 ❤️ by Stake Technologies <devops@stake.co.jp>, 2019-2024
Jun 12 11:27:01 doxastar astar-collator[738]: 2024-06-12 11:27:01 📋 Chain specification: Astar
Jun 12 11:27:01 doxastar astar-collator[738]: 2024-06-12 11:27:01 🏷 Node name: Dox-Astar-01
Jun 12 11:27:01 doxastar astar-collator[738]: 2024-06-12 11:27:01 👤 Role: AUTHORITY
Jun 12 11:27:01 doxastar astar-collator[738]: 2024-06-12 11:27:01 💾 Database: RocksDb at /home/astar_1/.local/share/astar-collator/ch
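For correlating the warp phase with memory growth while it runs, a minimal watch loop; the binary name comes from the logs above, and the 5-second interval is arbitrary:

watch -n 5 'free -m; ps -C astar-collator -o rss=,vsz=,args='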
Update on the pre-v5.42.0 client test: the issue is still the same.
In my tests with 32 GB RAM, the node always gets OOM-killed at the same point: importing state at 5762.42 MiB.
Once it arrives at this state size, memory suddenly fills up and bursts to 100% in less than 2 minutes.
I can't see any significant correlation with disk usage, meaning the problem is specifically RAM usage by warp sync.
More insights on memory usage over a short time frame: [graph omitted]
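Side note for anyone reproducing this: a sketch of a systemd memory cap so the kernel kills only the collator rather than the whole instance (the drop-in path, unit name, and 28G limit are all assumptions):

# /etc/systemd/system/astar.service.d/memory.conf
[Service]
MemoryMax=28G
# then: systemctl daemon-reload && systemctl restart astar.service

MemoryMax= is the cgroup-v2 hard limit; on cgroup v1 the older MemoryLimit= directive applies instead.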
Description
Warp sync is not operational on Astar in the latest versions: after downloading state (5.3+ GB), the state import triggers an OOM on the server.
Steps to Reproduce
Start an Astar node with the --sync warp option (a minimal command sketch follows).
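A minimal reproduction sketch, assuming the same binary and chain as in the config above (the node name is a placeholder):

/usr/local/bin/astar-collator \
--chain astar \
--sync warp \
--name warp-oom-repro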
Environment
Quite similar to this issue, but on the parachain side.
The issue will be solved after uplifting to Polkadot v1.0.0.
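Until then, a possible interim workaround sketch (an assumption on my side, not a confirmed fix) is to avoid warp on the parachain side and let it fast-sync instead:

/usr/local/bin/astar-collator --chain astar --sync fast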