Open KaiQiu9527 opened 5 months ago
The last minute log before OOM is attached: solana-rpc-last-1-minute.log
Get a machine with more memory
The recommended specification is 512G of memory, yet it ran well on a 64U256G machine on GCP. It's weird.
Don't trust the recommendation. The first time I used 512G of memory and could not catch up with the block height, and nothing I tried fixed it. Switching to 768G worked.
What is the memory usage now that you have changed to 768G?
With swap enabled, it is now at 65%.
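For anyone who wants to compare numbers like the 65% above, here is a minimal sketch that reads RAM and swap usage from /proc/meminfo on Linux. The function name used_percent is hypothetical, and the output format is only an illustration. It shows how full the machine is; it does not explain why the validator grows.

```shell
#!/bin/bash
# Sketch: percentage of RAM and swap in use, read from /proc/meminfo (Linux).
# used_percent is a hypothetical helper name, not part of any Solana tooling.
# Usage: used_percent TOTAL_KEY FREE_KEY [meminfo_file]
used_percent() {
  awk -v t="$1:" -v f="$2:" '
    $1 == t { total = $2 }   # e.g. "MemTotal:  1000 kB"
    $1 == f { free = $2 }    # e.g. "MemAvailable:  350 kB"
    END {
      if (total > 0) { printf "%d\n", (total - free) * 100 / total }
      else { print 0 }       # key missing or no swap configured
    }
  ' "${3:-/proc/meminfo}"
}

# Print one timestamped sample; run it from cron or a loop to see the trend.
if [ -r /proc/meminfo ]; then
  echo "$(date '+%F %T') ram_used=$(used_percent MemTotal MemAvailable)% swap_used=$(used_percent SwapTotal SwapFree)%"
fi
```

Logging one sample per minute before the OOM kill makes it easier to tell steady growth from a sudden spike.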
Hi bro, where are you from? Can we DM? I'd like to learn from your experience deploying Solana 😊
@KaiQiu9527 I'm sharing an experience I had recently...
Running the validator with the parameter --no-skip-initial-accounts-db-clean may help you avoid OOM on hardware with less memory.
It forces the accounts-db to be processed before syncing starts. The downside is that after any validator restart your node will be many slots behind and will take longer to catch up to the latest slots. So it's a trade-off.
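As a rough sketch of where that flag would go, here is a trimmed-down argument list based on the validator.sh in this issue. The paths are copied from that script, most of the other flags are left out for brevity, and the function name validator_cmd is hypothetical. It only prints the command so it can be reviewed before anything is started.

```shell
#!/bin/bash
# Sketch: add the suggested flag to a shortened version of the issue's
# validator.sh. validator_cmd is a hypothetical helper; it prints the
# command instead of running it.
validator_cmd() {
  set -- \
    --identity /home/sol/validator-keypair.json \
    --ledger /mnt/solana/ledger \
    --accounts /mnt/solana/accounts \
    --no-voting \
    --no-skip-initial-accounts-db-clean
  echo "solana-validator $*"
}

validator_cmd
```

Expect a longer startup after each restart with this flag, as described above.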
We have strict requirements on slot height. I looked into the parameter you suggested, but it cannot meet our requirements. Thank you all the same.
@KaiQiu9527
Yes, bro. My email address is bearichman66688@gmail.com, email me please.
Problem
Proposed Solution
Hi Dev Team: I'm running a Solana RPC node in the Hong Kong region on Huawei Cloud. It runs well for the first few hours, but then it falls behind and memory usage grows rapidly until the process is killed for OOM. I have tried different specifications, such as 32U256G and 64U512G, and both crashed with OOM. Is there any way to find out the reason?
Specification:
Machine: m7n.16xlarge.8 (64U512G, 3rd Generation Intel® Xeon® Scalable Processor)
Disk: 2T ESSD (I/O throughput up to 1000 MB/s)
Network: 10 Gbps
validator.sh is:
#!/bin/bash
exec solana-validator \
    --identity /home/sol/validator-keypair.json \
    --known-validator 7Np41oeYqPefeNQEHSv1UDhYrehxin3NStELsSKCT4K2 \
    --known-validator GdnSyH3YtwcxFvQrVVJMm1JhTS4QVX7MFsX56uJLUfiZ \
    --known-validator DE1bawNcRJB9rVm3buyMVfr8mBEoyyu73NBovf2oXJsJ \
    --known-validator CakcnaRDHka2gXyfbEd2d3xsvkJkqsLw2akB3zsN1D2S \
    --full-rpc-api \
    --no-voting \
    --ledger /mnt/solana/ledger \
    --accounts /mnt/solana/accounts \
    --log /home/sol/solana-rpc.log \
    --rpc-port 8899 \
    --rpc-bind-address 0.0.0.0 \
    --private-rpc \
    --dynamic-port-range 8000-8020 \
    --entrypoint entrypoint.mainnet-beta.solana.com:8001 \
    --entrypoint entrypoint2.mainnet-beta.solana.com:8001 \
    --entrypoint entrypoint3.mainnet-beta.solana.com:8001 \
    --entrypoint entrypoint4.mainnet-beta.solana.com:8001 \
    --entrypoint entrypoint5.mainnet-beta.solana.com:8001 \
    --expected-genesis-hash 5eykt4UsFv8P8NJdTREpY1vzqKqZKvdpKuc147dw2N9d \
    --wal-recovery-mode skip_any_corrupted_record \
    --limit-ledger-size \
    --maximum-local-snapshot-age 20000