solana-labs / solana

Web-Scale Blockchain for fast, secure, scalable, decentralized apps and marketplaces.
https://solanalabs.com
Apache License 2.0

Solana crashes when sysctl can't be accessed #35329

Open erik78se opened 8 months ago

erik78se commented 8 months ago

When I start Solana on my bare-metal server, which has been hardened to disallow access to sysctl (for security reasons), Solana crashes.

[2024-02-27T10:25:59.958543482Z WARN  solana_perf] CUDA is disabled
[2024-02-27T10:25:59.958580863Z INFO  solana_perf] AVX detected
[2024-02-27T10:25:59.958587003Z INFO  solana_perf] AVX2 detected
[2024-02-27T10:26:00.282695872Z INFO  solana_validator] obtained shred-version 35459 from 139.178.68.207:8001
[2024-02-27T10:26:00.283062450Z INFO  solana_metrics::metrics] metrics disabled: environment variable not found
[2024-02-27T10:26:00.283299354Z ERROR solana_core::system_monitor_service] Failed to query value for net.core.rmem_max: no such sysctl: net.core.rmem_max
[2024-02-27T10:26:00.283319252Z ERROR solana_core::system_monitor_service] Failed to query value for net.core.optmem_max: no such sysctl: net.core.optmem_max
[2024-02-27T10:26:00.283330789Z ERROR solana_core::system_monitor_service] Failed to query value for net.core.netdev_max_backlog: no such sysctl: net.core.netdev_max_backlog
[2024-02-27T10:26:00.283341876Z ERROR solana_core::system_monitor_service] Failed to query value for net.core.wmem_max: no such sysctl: net.core.wmem_max
[2024-02-27T10:26:00.283352640Z ERROR solana_core::system_monitor_service] Failed to query value for net.core.rmem_default: no such sysctl: net.core.rmem_default
[2024-02-27T10:26:00.283362472Z ERROR solana_core::system_monitor_service] Failed to query value for net.core.wmem_default: no such sysctl: net.core.wmem_default
[2024-02-27T10:26:00.283370299Z WARN  solana_core::system_monitor_service]   vm.max_map_count: recommended=1000000 current=262144, too small
[2024-02-27T10:26:00.283376815Z WARN  solana_core::system_monitor_service]   net.core.rmem_max: recommended=134217728 current=-1, too small
[2024-02-27T10:26:00.283384814Z WARN  solana_core::system_monitor_service]   net.core.optmem_max: recommended=0 current=-1, too small
[2024-02-27T10:26:00.283390150Z WARN  solana_core::system_monitor_service]   net.core.netdev_max_backlog: recommended=0 current=-1, too small
[2024-02-27T10:26:00.283397157Z WARN  solana_core::system_monitor_service]   net.core.wmem_max: recommended=134217728 current=-1, too small
[2024-02-27T10:26:00.283401841Z WARN  solana_core::system_monitor_service]   net.core.rmem_default: recommended=134217728 current=-1, too small
[2024-02-27T10:26:00.283406696Z WARN  solana_core::system_monitor_service]   net.core.wmem_default: recommended=134217728 current=-1, too small
OS network limit test failed. See: https://docs.solana.com/running-validator/validator-start#system-tuning

I would rather the detection of these values handled this gracefully instead of crashing. Perhaps I could submit a patch that warns instead of erroring out?

https://github.com/solana-labs/solana/blob/8ad125d0c0688aaf2b62bb95b535ff988ed7f9ac/core/src/system_monitor_service.rs#L433
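To illustrate the kind of graceful handling proposed above, here is a minimal sketch. It is not the code in system_monitor_service.rs: the names `LimitCheck`, `read_sysctl`, `check_limit`, and `limits_acceptable` are hypothetical, and it assumes values are read from a procfs-style directory tree (such as /proc/sys) rather than through whatever sysctl interface the validator actually uses. The idea is to keep "could not be read" separate from "too small", so that an unreadable value only produces a warning.

```rust
use std::fs;
use std::path::Path;

/// Outcome of comparing one kernel parameter against a recommended minimum.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum LimitCheck {
    Ok,
    TooSmall { current: i64, recommended: i64 },
    Unreadable(String),
}

/// Reads a value from a procfs-style tree, mapping a dotted key such as
/// `net.core.rmem_max` to `<root>/net/core/rmem_max`.
/// Assumption: the real validator may query sysctl differently.
fn read_sysctl(root: &Path, key: &str) -> Result<i64, String> {
    let path = root.join(key.replace('.', "/"));
    let raw = fs::read_to_string(&path)
        .map_err(|e| format!("{}: {}", path.display(), e))?;
    raw.trim()
        .parse::<i64>()
        .map_err(|e| format!("{}: {}", path.display(), e))
}

/// Classifies a single parameter without treating a read failure as "too small".
fn check_limit(root: &Path, key: &str, recommended: i64) -> LimitCheck {
    match read_sysctl(root, key) {
        Ok(current) if current >= recommended => LimitCheck::Ok,
        Ok(current) => LimitCheck::TooSmall { current, recommended },
        Err(reason) => LimitCheck::Unreadable(reason),
    }
}

/// Returns false only when a value was read and is below its recommendation.
/// Unreadable values are reported as warnings and otherwise skipped.
fn limits_acceptable(root: &Path, limits: &[(&str, i64)]) -> bool {
    let mut ok = true;
    for (key, recommended) in limits {
        match check_limit(root, key, *recommended) {
            LimitCheck::Ok => {}
            LimitCheck::TooSmall { current, recommended } => {
                eprintln!("  {key}: recommended={recommended} current={current}, too small");
                ok = false;
            }
            LimitCheck::Unreadable(reason) => {
                eprintln!("  {key}: could not be read ({reason}); skipping check");
            }
        }
    }
    ok
}
```

With this shape, a call such as `limits_acceptable(Path::new("/proc/sys"), &[("net.core.rmem_max", 134217728)])` would still fail on a hardened host only if a value is readable and too low. Whether silently skipping unreadable values is acceptable is a policy question for the maintainers.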

steviez commented 8 months ago

For the general case, this check is helpful and provides an immediate failure / useful error message. That being said, if you're setting up your system in a particular manner that disallows the check from working and you're confident that you're tuning it appropriately, you can bypass the check with:

--no-os-network-limits-test

https://github.com/solana-labs/solana/blob/8ad125d0c0688aaf2b62bb95b535ff988ed7f9ac/validator/src/main.rs#L1719-L1726
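As a rough illustration of how a bypass flag like this can gate a startup check, here is a small sketch. Only the flag string comes from this thread; `limits_test_enabled` and `startup_gate` are hypothetical names, and the plain slice scan stands in for the validator's real argument parsing at the link above.

```rust
/// Flag quoted in this thread; everything else in this sketch is hypothetical.
const SKIP_FLAG: &str = "--no-os-network-limits-test";

/// The check runs unless the bypass flag appears among the arguments.
fn limits_test_enabled(args: &[&str]) -> bool {
    !args.iter().any(|a| *a == SKIP_FLAG)
}

/// Refuses to start only when the check is enabled and it failed.
fn startup_gate(args: &[&str], limits_ok: bool) -> Result<(), &'static str> {
    if limits_test_enabled(args) && !limits_ok {
        return Err("OS network limit test failed");
    }
    Ok(())
}
```

In this sketch, passing the flag makes `startup_gate` return `Ok(())` even when `limits_ok` is false, which leaves the operator responsible for having tuned the system correctly.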