authzed / spicedb-operator

Kubernetes controller for managing instances of SpiceDB
Apache License 2.0

Add default resource request/limits for SpiceDB Cluster #247

Open jawnsy opened 11 months ago

jawnsy commented 11 months ago

Summary

Add CPU and memory request/limits for SpiceDB Cluster deployment

Details

The SpiceDB Deployment manifests are missing resource specs, which means that the resulting pod will have BestEffort quality of service. Choosing reasonable default values may be non-trivial because resource usage differs across user workloads.

For completeness, adding a request would be useful to force the cluster autoscaler to make room for the operator (guaranteeing that some forward progress will be made) and limits would be useful to make issues like leaks more apparent.

Proposal

Add a small request and reasonably high limit for the deployment pod spec based on current usage. This will vary based on usage, so having an option in the CRD to override it seems prudent.

As a starting point, something like this seems suitable:

    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
      limits:
        memory: "1Gi"
        cpu: "2000m"

ecordell commented 11 months ago

This is possible today via the patches API:

spec:
  patches:
  - kind: Deployment
    patch:
      spec:
        template:
          spec:
            containers:
            - name: spicedb
              resources:
                requests:
                  memory: "256Mi"
                  cpu: "250m"
                limits:
                  memory: "1Gi"
                  cpu: "2000m"

Although, as you note, it's usually better to run SpiceDB in the Guaranteed QoS class, and for production clusters we tend to use static CPU allocation as well.
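
For illustration, here is a sketch of how the same patches API might be used to land in the Guaranteed QoS class. Kubernetes assigns Guaranteed only when every container in the pod sets CPU and memory requests equal to its limits, so this assumes spicedb is the only container in the pod. The values are placeholders, not recommendations, and should be sized to the actual workload.

```yaml
# Sketch only: values are illustrative assumptions, not tested defaults.
spec:
  patches:
  - kind: Deployment
    patch:
      spec:
        template:
          spec:
            containers:
            - name: spicedb
              resources:
                # requests == limits for both CPU and memory -> Guaranteed QoS
                requests:
                  memory: "1Gi"
                  cpu: "2"   # whole number of CPUs, required for exclusive cores
                limits:
                  memory: "1Gi"
                  cpu: "2"
```

The integer CPU value matters for static CPU allocation: the kubelet's static CPU manager policy only pins exclusive cores for Guaranteed pods with whole-number CPU requests. That policy is configured on the nodes themselves and is not something this manifest can enable.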

I have considered that it could be useful to have something like this instead just to remove some nesting:

spec:
  patches:
  - container: spicedb
    patch:
      resources:
        requests:
          memory: "256Mi"
          cpu: "250m"
        limits:
          memory: "1Gi"
          cpu: "2000m"

but the current example is only a couple more nested levels than the Deployment API itself, which folks generally have no problem writing.

n0rthernstar commented 9 months ago

Hello @ecordell, could you please clarify whether the example above needs to be added to the YAML of kind: SpiceDBCluster:

apiVersion: authzed.com/v1alpha1
kind: SpiceDBCluster
metadata:
  name: spicedb-cluster
  namespace: spicedb
spec:
  patches:
    - container: spicedb
      patch:
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
  config:
    replicas: 1
    datastoreEngine: "postgres"
    telemetryEndpoint: ""
  secretName: spicedb-config

or in another?

ecordell commented 9 months ago

Yep, it's in the SpiceDBCluster API.

Full example:

apiVersion: authzed.com/v1alpha1
kind: SpiceDBCluster
metadata:
  name: spicedb-cluster
  namespace: spicedb
spec:
  config:
    replicas: 1
    datastoreEngine: "postgres"
    telemetryEndpoint: ""
  secretName: spicedb-config
  patches:
  - kind: Deployment
    patch:
      spec:
        template:
          spec:
            containers:
            - name: spicedb
              resources:
                requests:
                  memory: "256Mi"
                  cpu: "250m"
                limits:
                  memory: "1Gi"
                  cpu: "2000m"