Hi,
I see in the screenshot that it is adding more details, like "command ...", but I cannot read them. Could you share more details on these kubelet logs?
Thanks @javsalgar for looking into it.
k8s-node-vm-logs-redis-sentinel-restart.txt
Attached the kubelet logs. Note that the liveness probes were failing from 20:55, and the pod restarted around 20:57 after 5 liveness probe failures.
The only thing that comes to mind, given the context deadline exceeded error, is some sort of connection issue caused by networking. I'm afraid that going further is a bit beyond the support we can offer, but let's see if someone from the community can provide some insight into what could be happening.
If these failures are transient, maybe you could try increasing the tolerance of the probes, e.g. along the lines of the sketch below.
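For illustration, a minimal sketch of what loosening the probe tolerance could look like. The release name is a placeholder, and the sentinel.livenessProbe parameter names assume the chart's standard probe block (check helm show values bitnami/redis for the actual values your version exposes):

```
# Sketch: raise the sentinel liveness probe tolerance so that transient
# "context deadline exceeded" errors do not immediately trigger a restart.
# "my-redis" is a placeholder release name; the parameter names assume
# the chart's standard sentinel.livenessProbe block.
helm upgrade my-redis bitnami/redis \
  --reuse-values \
  --set sentinel.livenessProbe.timeoutSeconds=10 \
  --set sentinel.livenessProbe.periodSeconds=20 \
  --set sentinel.livenessProbe.failureThreshold=10
```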
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.
Keeping this open for other engineers to comment.
Name and Version
bitnami/redis 18.18.0
What architecture are you using?
None
What steps will reproduce the bug?
Are you using any custom parameters or values?
Deploy the Helm chart with sentinel enabled and the number of replicas set to 3, for example:
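A minimal reproduction sketch; the release name and namespace are placeholders, and sentinel.enabled / replica.replicaCount follow the chart's documented parameter names:

```
# Deploy bitnami/redis 18.18.0 with sentinel enabled and 3 replicas.
# "my-redis" and the "redis" namespace are placeholders.
helm install my-redis bitnami/redis \
  --version 18.18.0 \
  --namespace redis --create-namespace \
  --set sentinel.enabled=true \
  --set replica.replicaCount=3
```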
What is the expected behavior?
No Sentinel container restarts with liveness probe failures.
What do you see instead?
Kubelet logs:
Memory usage: under 1/10 of the memory limit (500 Mi) and 1/5 of the memory request (250 Mi)
CPU throttling:
Container resources at runtime:
Additional information
Outstanding question: why does the liveness probe fail?
Sometimes it fails with exit code 137; sometimes with exit code 0 (purposely stopped), together with "ExecSync cmd from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded".
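For anyone triaging the same pattern, the last termination state and the probe-failure events can be inspected with standard kubectl. The pod and container names below are placeholders based on the chart's usual naming:

```
# Exit code and reason of the sentinel container's last termination.
kubectl get pod my-redis-node-0 -n redis \
  -o jsonpath='{.status.containerStatuses[?(@.name=="sentinel")].lastState.terminated}'

# Recent events for the pod, including "Liveness probe failed" entries.
kubectl get events -n redis \
  --field-selector involvedObject.name=my-redis-node-0
```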
What we have validated:
- No resource issues in the container (memory and CPU)
- No resource issues in the other containers in the pod (redis, metrics, fluentbit)
- No resource issues on the node
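A sketch of how those checks can be reproduced, assuming metrics-server is installed for kubectl top; pod, container, and node names are placeholders:

```
# Per-container CPU and memory usage (requires metrics-server).
kubectl top pod my-redis-node-0 -n redis --containers

# Node-level requests/limits vs. capacity.
kubectl describe node <node-name>

# CPU throttling counters from inside the sentinel container
# (cgroup v1 path; under cgroup v2 it is /sys/fs/cgroup/cpu.stat).
kubectl exec my-redis-node-0 -n redis -c sentinel -- \
  cat /sys/fs/cgroup/cpu/cpu.stat
```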