Does sentinel listen to SIGKILL?

igoooor commented 5 months ago

Expected behaviour

I thought that if the sentinel pod receives a SIGKILL, it would mark itself "not ready" so that the load balancer service stops sending requests to it.
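A note on the signal itself, as a rough sketch rather than anything taken from the operator's code: on POSIX systems a process cannot install a handler for SIGKILL, so no program can react to it. When Kubernetes deletes a pod, the container normally receives SIGTERM first, and SIGKILL follows only if it is still running after the termination grace period. The helper name `can_install_handler` below is made up for illustration; it only checks which signals the OS allows a process to handle.

```python
import signal


def can_install_handler(signum):
    """Return True if the OS lets this process install a handler for signum."""
    try:
        # Try to register a no-op handler; the OS rejects this for SIGKILL.
        previous = signal.signal(signum, lambda received, frame: None)
    except (OSError, ValueError):
        return False
    # Put the earlier handler back so the check has no lasting effect.
    if previous is not None:
        signal.signal(signum, previous)
    return True


if __name__ == "__main__":
    print("SIGTERM can be handled:", can_install_handler(signal.SIGTERM))
    print("SIGKILL can be handled:", can_install_handler(signal.SIGKILL))
```

So the expectation above can only apply to SIGTERM: any "mark not ready" step would have to run on SIGTERM or in a preStop hook, before SIGKILL arrives.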

Actual behaviour

My pod was stopped because of a node scale-down (so the pod was moved to a different node). A request still went to that sentinel pod and ended in a "redis went away" error.

Environment

How are the pieces configured?

Redis Operator version: v1.2.1
Kubernetes version: v1.25.15-gke.1115000
Kubernetes configuration used: not sure what to answer here sorry

Logs

Container logs at the time of the event:

INFO 2024-02-02T10:34:45.033200246Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:34:45.033 # +set master mymaster 10.2.85.14 6379 failover-timeout 3000
INFO 2024-02-02T10:34:45.038454080Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:34:45.038 * Sentinel new configuration saved on disk
INFO 2024-02-02T10:35:15.528001493Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:35:15.527 # +set master mymaster 10.2.85.14 6379 down-after-milliseconds 1500
INFO 2024-02-02T10:35:15.532660946Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:35:15.532 * Sentinel new configuration saved on disk
INFO 2024-02-02T10:35:15.533390024Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:35:15.532 # +set master mymaster 10.2.85.14 6379 failover-timeout 3000
INFO 2024-02-02T10:35:15.538118999Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:35:15.537 * Sentinel new configuration saved on disk
INFO 2024-02-02T10:35:45.424395751Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:35:45.424 # +set master mymaster 10.2.85.14 6379 down-after-milliseconds 1500
INFO 2024-02-02T10:35:45.430492822Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:35:45.430 * Sentinel new configuration saved on disk
INFO 2024-02-02T10:35:45.430856163Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:35:45.430 # +set master mymaster 10.2.85.14 6379 failover-timeout 3000
INFO 2024-02-02T10:35:45.435810640Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:35:45.435 * Sentinel new configuration saved on disk
INFO 2024-02-02T10:36:15.728698109Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:36:15.727 # +set master mymaster 10.2.85.14 6379 down-after-milliseconds 1500
INFO 2024-02-02T10:36:15.733848405Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:36:15.733 * Sentinel new configuration saved on disk
INFO 2024-02-02T10:36:15.734494313Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:36:15.734 # +set master mymaster 10.2.85.14 6379 failover-timeout 3000
INFO 2024-02-02T10:36:15.740620714Z [resource.labels.containerName: sentinel] 1:X 02 Feb 2024 10:36:15.740 * Sentinel new configuration saved on disk

I see nothing special there. Pod logs at the time of the event:

INFO 2024-02-02T10:36:38Z [resource.labels.podName: rfs-conteo-prod-redis-7fc8844655-2qbcw] deleting pod for node scale down
INFO 2024-02-02T10:36:39Z [resource.labels.podName: rfs-conteo-prod-redis-7fc8844655-2qbcw] Stopping container sentinel
WARNING 2024-02-02T10:36:40Z [resource.labels.podName: rfs-conteo-prod-redis-7fc8844655-2qbcw] Readiness probe errored: rpc error: code = NotFound desc = failed to exec in container: failed to load task: no running task found: task f2ca421fd292a37ed46ea1dab6ac5116b2820a8051078765b1a1004a438e4455 not found: not found
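To make the timing in these pod events easier to read, here is a small sketch that turns the three timestamps above into offsets from the first event. The timestamps are copied from the log; the messages are shortened, and the names `EVENTS`, `parse` and `offsets_from_first` are made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the pod events above; messages shortened.
EVENTS = [
    ("2024-02-02T10:36:38Z", "deleting pod for node scale down"),
    ("2024-02-02T10:36:39Z", "Stopping container sentinel"),
    ("2024-02-02T10:36:40Z", "Readiness probe errored"),
]


def parse(timestamp):
    """Parse a second-resolution UTC timestamp such as 2024-02-02T10:36:38Z."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")


def offsets_from_first(events):
    """Return (seconds since the first event, message) for each event."""
    start = parse(events[0][0])
    return [
        (int((parse(ts) - start).total_seconds()), message)
        for ts, message in events
    ]


if __name__ == "__main__":
    for seconds, message in offsets_from_first(EVENTS):
        print(f"+{seconds}s  {message}")
```

At this one-second resolution, the readiness probe error is logged two seconds after the pod deletion event and one second after the container stop. The probe errors because the container is already gone, not because sentinel reported itself as not ready.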

github-actions[bot] commented 3 months ago

This issue is stale because it has been open for 45 days with no activity.

github-actions[bot] commented 3 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.

spotahome / redis-operator