redis / redis

Redis is an in-memory database that persists on disk. The data model is key-value, but many different kind of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLogs, Bitmaps.
http://redis.io
Other
66.95k stars 23.8k forks source link

BUG: Redis Sentinel (7.2.4) liveness or readiness probe timeout #13636

Open s0uky opened 4 days ago

s0uky commented 4 days ago

We’re encountering an issue with Redis in Sentinel mode. Everything works as expected, but the Redis and Sentinel containers in the pod are reporting timeouts on probes. Fortunately, the containers are not restarting. Somebody also reported similar issue in the https://github.com/bitnami/charts/issues/29985

If I exec into the pod and run some of the probes directly (e.g. /health/redis_liveness.sh) everything works as expected and response is PONG and RC is correctly 0.

Events:
  Type     Reason     Age                      From     Message
  ----     ------     ----                     ----     -------
  Warning  Unhealthy  37m (x619 over 2d18h)    kubelet  Liveness probe failed: command "sh -c /health/redis_liveness.sh" timed out
  Warning  Unhealthy  22m (x638 over 2d18h)    kubelet  Liveness probe failed: command "sh -c /health/sentinel_liveness.sh" timed out
  Warning  Unhealthy  12m (x664 over 2d18h)    kubelet  Readiness probe failed: command "sh -c /health/redis_readiness.sh" timed out
  Warning  Unhealthy  2m48s (x652 over 2d18h)  kubelet  Readiness probe failed: command "sh -c /health/sentinel_liveness.sh" timed out
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - /health/redis_liveness.sh
          failureThreshold: 5
          initialDelaySeconds: 30
          periodSeconds: 20
          successThreshold: 1
          timeoutSeconds: 40

        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - /health/redis_readiness.sh
          failureThreshold: 5
          initialDelaySeconds: 30
          periodSeconds: 20
          successThreshold: 1
          timeoutSeconds: 40

To reproduce

Run Sentinel mode of Redis i 3 replicas.

Expected behavior

No timeouts on probes.

Additional information Kubernetes 1.30.5 OS: Ubuntu 22.04.5 LTS
Kernel: 5.15.0-122-generic
CNI: Cilium 1.16.2 containerd://1.6.30

lukastopiarz commented 4 days ago

Another finding: If you exec into the container and run the same liveness/readiness probe shell script in an infinite loop, it never fails or timeouts.