scylladb / scylla-operator

The Kubernetes Operator for ScyllaDB
https://operator.docs.scylladb.com/
Apache License 2.0

Set open file limit for ScyllaDB processes #2160

Closed zimnx closed 1 month ago

zimnx commented 1 month ago

Description of your changes:

During regular operation, ScyllaDB may need to keep millions of files open because of the nature of its workload and architecture. The open file limit (rlimit) for containers is inherited from the CRI and systemd, both of which tend to set conservative limits because high limits can cause other programs to misbehave. Simply setting fs.nr_open through the ScyllaCluster sysctls API is not enough to raise the limit for the ScyllaDB process.

To automate this, the Scylla Operator NodeConfig container optimization was extended with an additional Job that discovers the maximum possible limit and sets it on the main process of ScyllaDB containers. ScyllaDB Pods wait until the limit is changed before starting the ScyllaDB process. Any forks (sidecar starter or hypervisor) should inherit the limit.
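For illustration only, here is a minimal Python sketch of the underlying Linux mechanism, not the operator's actual implementation. It assumes the "maximum possible limit" is the node's fs.nr_open value and that the caller holds CAP_SYS_RESOURCE when raising the hard limit of another process. The helper names `read_nr_open` and `set_nofile_limit` are hypothetical.

```python
import os
import resource

# Assumption: fs.nr_open is the kernel-wide ceiling for RLIMIT_NOFILE.
NR_OPEN_PATH = "/proc/sys/fs/nr_open"


def read_nr_open(path=NR_OPEN_PATH):
    """Return the value of the fs.nr_open sysctl as an integer."""
    with open(path) as f:
        return int(f.read().strip())


def set_nofile_limit(pid, limit):
    """Set the soft and hard RLIMIT_NOFILE of `pid` to `limit`.

    Returns the (soft, hard) pair read back after the change.
    Raising the hard limit requires CAP_SYS_RESOURCE; raising only the
    soft limit up to the existing hard limit does not.
    """
    resource.prlimit(pid, resource.RLIMIT_NOFILE, (limit, limit))
    return resource.prlimit(pid, resource.RLIMIT_NOFILE)
```

A privileged caller could then run `set_nofile_limit(target_pid, read_nr_open())`, where `target_pid` is the PID of the container's main process. Processes forked afterwards inherit the new limit.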

Users should increase fs.nr_open to at least the value recommended by ScyllaDB, because the defaults of popular container runtimes are ~1024 times lower. Sysctls can currently be changed via the scylladbcluster.spec.sysctls field. Note that this tuning is applied only on Nodes matching the selector of a deployed NodeConfig.
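One way to check whether the tuning took effect on a given Node is to read the limits of the running process from `/proc/<pid>/limits`. The sketch below is an illustration under that assumption. The helper names and the `minimum` threshold are hypothetical; the value recommended by ScyllaDB is not stated here and has to be supplied by the caller.

```python
def parse_max_open_files(limits_text):
    """Extract (soft, hard) open file limits from /proc/<pid>/limits content.

    A value of "unlimited" is returned as None.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            soft, hard = line[len(prefix):].split()[:2]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' line found")


def limit_is_raised(pid, minimum):
    """Return True if the soft open file limit of `pid` is at least `minimum`."""
    with open(f"/proc/{pid}/limits") as f:
        soft, _ = parse_max_open_files(f.read())
    return soft is None or soft >= minimum
```

For example, `limit_is_raised(pid, minimum)` can be called with the PID of the ScyllaDB process and the caller's chosen threshold. A `False` result on a Node suggests that the Node does not match any NodeConfig selector or that the Job has not run yet.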

Which issue is resolved by this Pull Request: Resolves #2131

zimnx commented 1 month ago

Manager flake - https://github.com/scylladb/scylla-operator/issues/2061#issuecomment-2426814613

/retest

zimnx commented 1 month ago

Flake - https://github.com/scylladb/scylla-operator/issues/2096#issuecomment-2429903073

/retest

tnozicka commented 1 month ago

/approve

/assign @rzetelskik (I'll be on PTO till Tuesday)

scylla-operator-bot[bot] commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tnozicka, zimnx

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
- ~~[OWNERS](https://github.com/scylladb/scylla-operator/blob/master/OWNERS)~~ [tnozicka,zimnx]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

rzetelskik commented 1 month ago

/lgtm thanks