cal-itp / data-infra

Cal-ITP data infrastructure
https://docs.calitp.org/data-infra
GNU Affero General Public License v3.0

fix(sentry): limit redis to 1 replica and add resource limits #3462

Closed: themightychris closed this 2 months ago

themightychris commented 2 months ago

Description

Redis was configured to run 3 replicas + 1 master, which is the default for the upstream Redis chart that the Sentry chart uses. Versions of the Sentry chart newer than the one we're currently running override that default to 1 replica.

Further, by default no resource requests or limits are applied to the Redis instances.

As a result, we have 4 Redis instances running with no memory limit, and each gradually consumes as much memory as it can. Our cluster is configured to autoscale between 3 and 6 nodes. The 4 Redis instances would fight to eat all the memory on the 3 nodes. Eventually GCP's autoscaler would kick in, add a 4th node, and move one of the Redis instances there. The fresh Redis instance would start off with low memory usage, so the autoscaler would then remove the 4th node and shove everything back onto 3 nodes. The cycle continues.

I suspect we don't really need 3 Redis replicas running for our Sentry instance, so I'm going to try setting it to 1 replica, per the Sentry Helm chart's new default. If we see performance issues with Sentry we can scale it back up. I'm also applying a 2Gi memory request and a 3Gi memory limit to the Redis replicas.
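For illustration, the change described above might be expressed as a Helm values override along these lines. This is only a sketch: the key paths (`redis.replica.replicaCount` and `redis.replica.resources`) are assumed from the Bitnami Redis chart's conventions when it is used as a subchart of the Sentry chart, and the actual values file is not shown in this pull request. The numbers are taken from the generated diff in the bot comment.

```yaml
# Hypothetical values override for the Sentry chart's Redis subchart.
# Key paths are an assumption based on Bitnami Redis chart conventions.
redis:
  replica:
    # Run a single replica instead of the upstream default of 3.
    replicaCount: 1
    resources:
      # Cap memory so a replica cannot grow until it fills a node.
      limits:
        cpu: 2
        memory: 3Gi
      # Requests tell the scheduler how much capacity to reserve.
      requests:
        cpu: 2
        memory: 2Gi
```

Setting the memory request below the limit lets the scheduler reserve 2Gi per replica while still letting it use up to 3Gi before the container is OOM-killed.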

Type of change

How has this been tested?

Opening this pull request will generate a diff

Post-merge follow-ups

github-actions[bot] commented 2 months ago

The following changes will be applied to the production Kubernetes cluster upon merge.

BE AWARE this may not reveal changes that have been manually applied to the cluster getting undone—applying manual changes to the cluster should be avoided.

sentry, sentry-sentry-redis-replicas, StatefulSet (apps) has changed:
...
      helm.sh/chart: redis-17.11.3
      app.kubernetes.io/instance: sentry
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/component: replica
  spec:
-   replicas: 3
+   replicas: 1
    selector:
      matchLabels:
        app.kubernetes.io/name: sentry-redis
        app.kubernetes.io/instance: sentry
        app.kubernetes.io/component: replica
...
                command:
                  - sh
                  - -c
                  - /health/ping_readiness_local_and_master.sh 1
            resources:
-             limits: {}
-             requests: {}
+             limits:
+               cpu: 2
+               memory: 3Gi
+             requests:
+               cpu: 2
+               memory: 2Gi
            volumeMounts:
              - name: start-scripts
                mountPath: /opt/bitnami/scripts/start-scripts
              - name: health
                mountPath: /health
...