work4DevLMLOps closed this issue 2 years ago.
Hi @ping2kpm, thank you for reporting this issue.
I think that I know what may be happening.
It looks like, under certain circumstances, the Redis service may not be available after a server reboot, probably because all 3 pods started at the same time and none of them was recognized as a master by the others.
This causes each node to think it is the master, resulting in a split-brain scenario where the quorum condition will never be fulfilled:
+monitor master mymaster redis-node-1.redis-headless.redis.svc.cluster.local 6379 quorum 2
+monitor master mymaster redis-node-0.redis-headless.redis.svc.cluster.local 6379 quorum 2
+monitor master mymaster redis-node-2.redis-headless.redis.svc.cluster.local 6379 quorum 2
Caused by:
https://github.com/bitnami/charts/blob/dcec32e41d1e608f4e1f784e8d15ac3f6853d296/bitnami/redis/templates/scripts-configmap.yaml#L81-L90
https://github.com/bitnami/charts/blob/dcec32e41d1e608f4e1f784e8d15ac3f6853d296/bitnami/redis/templates/scripts-configmap.yaml#L101-L108
When restarted normally, Redis will perform a RollingUpdate (unless you override the updateStrategy value), and nodes 1 and 2 won't start until the previous node is initialized and ready. However, when Kubernetes itself is rebooted, it may restart all of the pods simultaneously, although I'm not 100% sure about that.
I have created an internal task to take a deeper look at this race condition and see what can be done to protect Redis against this scenario.
At the moment, I think deleting the Redis pods should resolve this: Kubernetes will recreate them in order, so they can assume roles other than master and start working. A minimal recovery sketch follows below.
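For illustration only, a minimal recovery sketch, assuming the release is named "redis" in the "redis" namespace (matching the pod description shared later in this thread):

# Delete all Redis pods and let the StatefulSet recreate them one by one
# in order, so a single master is elected before the replicas join.
kubectl -n redis delete pod -l app.kubernetes.io/name=redis
# Wait until the StatefulSet reports all replicas ready again.
kubectl -n redis rollout status statefulset/redis-node --timeout=10m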
Hi @ping2kpm,
Could you test removing the timeout 5 bit from these two lines (83 and 85)?
get_sentinel_master_info() {
    if is_boolean_yes "$REDIS_SENTINEL_TLS_ENABLED"; then
        sentinel_info_command="{{- if and .Values.auth.enabled .Values.auth.sentinel }}REDISCLI_AUTH="\$REDIS_PASSWORD" {{ end }}timeout 5 redis-cli -h $REDIS_SERVICE -p $SENTINEL_SERVICE_PORT --tls --cert ${REDIS_SENTINEL_TLS_CERT_FILE} --key ${REDIS_SENTINEL_TLS_KEY_FILE} --cacert ${REDIS_SENTINEL_TLS_CA_FILE} sentinel get-master-addr-by-name {{ .Values.sentinel.masterSet }}"
    else
        sentinel_info_command="{{- if and .Values.auth.enabled .Values.auth.sentinel }}REDISCLI_AUTH="\$REDIS_PASSWORD" {{ end }}timeout 5 redis-cli -h $REDIS_SERVICE -p $SENTINEL_SERVICE_PORT sentinel get-master-addr-by-name {{ .Values.sentinel.masterSet }}"
    fi
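If re-rendering the chart is inconvenient, one hypothetical way to test this is to patch the rendered scripts ConfigMap in place (the ConfigMap name redis-scripts appears in the pod description below) and restart the pods:

# Strip the "timeout 5" prefix from the rendered start scripts,
# re-apply the ConfigMap, then recreate the pods to pick up the change.
kubectl -n redis get configmap redis-scripts -o yaml \
  | sed 's/timeout 5 redis-cli/redis-cli/g' \
  | kubectl apply -f -
kubectl -n redis delete pod -l app.kubernetes.io/name=redis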
No luck

127.0.0.1:26379> sentinel master mymaster
 1) "name"
 2) "mymaster"
 3) "ip"
 4) "redis-node-0.redis-headless.redis.svc.cluster.local"
 5) "port"
 6) "6379"
 7) "runid"
 8) "efdbedd3d138e20483f11ea302263fb005ef164f"
 9) "flags"
10) "master"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "158"
19) "last-ping-reply"
20) "158"
21) "down-after-milliseconds"
22) "60000"
23) "info-refresh"
24) "757"
25) "role-reported"
26) "master"
27) "role-reported-time"
28) "602794"
29) "config-epoch"
30) "0"
31) "num-slaves"
32) "0"
33) "num-other-sentinels"
34) "0"
35) "quorum"
36) "2"
37) "failover-timeout"
38) "18000"
39) "parallel-syncs"
40) "1"
127.0.0.1:26379> sentinel slaves mymaster
(empty array)
127.0.0.1:26379>
Can confirm - just encountered this today as we restarted all nodes in a cluster at the same time.
Hi,
I'm adding the 'on-hold' label so the stale-bot does not close this issue while we investigate this.
@ping2kpm, does the issue persist after the pods are deleted and recreated one by one? It would also help us if you could share the output of kubectl describe pod redis-node-0.
No, it doesn't persist. Once the pods were deleted and recreated after the server reboot, the cluster formed again without any error.
Pod described:
[root@server1 ~]# kubectl describe -n redis pods redis-node-0
Name: redis-node-0
Namespace: redis
Priority: 0
Node: server1.dev.wkelms.com/10.234.82.180
Start Time: Fri, 04 Feb 2022 18:08:33 +0000
Labels: app.kubernetes.io/component=node
app.kubernetes.io/instance=redis
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=redis
controller-revision-hash=redis-node-6f7cd45777
helm.sh/chart=redis-16.1.0
statefulset.kubernetes.io/pod-name=redis-node-0
Annotations: checksum/configmap: e5c6cdd414b147061893ad54903d32b99a8a6918ffc7c4688e8e6fc88d205738
checksum/health: 22f6a8a9b9adb4e72cc8a50f9fe62e0db9e2cca2e43657c039333c911a709210
checksum/scripts: dc8c62f1901eba5a8368920144ed4ebcd2936094eec063cbbb9a4ddd70dcd279
checksum/secret: db2bacaf687eeffa3ff3a3d1da2b2d69836b8e2045647ec7bacb2ffdc4a42b6b
prometheus.io/port: 9121
prometheus.io/scrape: true
Status: Running
IP: 10.42.0.131
IPs:
IP: 10.42.0.131
Controlled By: StatefulSet/redis-node
Containers:
redis:
Container ID: containerd://337728d497a8d180f7a21d3ca088a97e696f267c490555fa228ca37faa48ee38
Image: docker.io/bitnami/redis:6.2.6-debian-10-r103
Image ID: docker.io/bitnami/redis@sha256:3d6055b1addad726b590df6d75a538a64d29f0d44c0dcf39c855173c0a3eb2da
Port: 6379/TCP
Host Port: 0/TCP
Command:
/bin/bash
Args:
-c
/opt/bitnami/scripts/start-scripts/start-node.sh
State: Running
Started: Mon, 07 Feb 2022 16:02:51 +0000
Last State: Terminated
Reason: Unknown
Exit Code: 255
Started: Fri, 04 Feb 2022 18:08:34 +0000
Finished: Mon, 07 Feb 2022 16:01:52 +0000
Ready: True
Restart Count: 1
Liveness: exec [sh -c /health/ping_liveness_local.sh 5] delay=20s timeout=5s period=5s #success=1 #failure=5
Readiness: exec [sh -c /health/ping_readiness_local.sh 5] delay=20s timeout=1s period=5s #success=1 #failure=5
Environment:
BITNAMI_DEBUG: false
REDIS_MASTER_PORT_NUMBER: 6379
ALLOW_EMPTY_PASSWORD: no
REDIS_PASSWORD: <set to the key 'redis-password' in secret 'redis'> Optional: false
REDIS_MASTER_PASSWORD: <set to the key 'redis-password' in secret 'redis'> Optional: false
REDIS_TLS_ENABLED: no
REDIS_PORT: 6379
REDIS_DATA_DIR: /data
Mounts:
/data from redis-data (rw)
/health from health (rw)
/opt/bitnami/redis/etc from redis-tmp-conf (rw)
/opt/bitnami/redis/mounted-etc from config (rw)
/opt/bitnami/scripts/start-scripts from start-scripts (rw)
/tmp from tmp (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-sc754 (ro)
sentinel:
Container ID: containerd://e324bd7bc468903c2787a9de34204f34f200dee2ed32d76a32854e9b26457af9
Image: docker.io/bitnami/redis-sentinel:6.2.6-debian-10-r100
Image ID: docker.io/bitnami/redis-sentinel@sha256:af140136548ce0359e595ccd7b24a435b00549135ef77d38818601e2f17f90c7
Port: 26379/TCP
Host Port: 0/TCP
Command:
/bin/bash
Args:
-c
/opt/bitnami/scripts/start-scripts/start-sentinel.sh
State: Running
Started: Mon, 07 Feb 2022 16:02:53 +0000
Last State: Terminated
Reason: Unknown
Exit Code: 255
Started: Fri, 04 Feb 2022 18:08:34 +0000
Finished: Mon, 07 Feb 2022 16:01:53 +0000
Ready: True
Restart Count: 1
Liveness: exec [sh -c /health/ping_sentinel.sh 5] delay=20s timeout=5s period=5s #success=1 #failure=5
Readiness: exec [sh -c /health/ping_sentinel.sh 5] delay=20s timeout=1s period=5s #success=1 #failure=5
Environment:
BITNAMI_DEBUG: false
REDIS_PASSWORD: <set to the key 'redis-password' in secret 'redis'> Optional: false
REDIS_SENTINEL_TLS_ENABLED: no
REDIS_SENTINEL_PORT: 26379
Mounts:
/data from redis-data (rw)
/health from health (rw)
/opt/bitnami/redis-sentinel/etc from sentinel-tmp-conf (rw)
/opt/bitnami/redis-sentinel/mounted-etc from config (rw)
/opt/bitnami/scripts/start-scripts from start-scripts (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-sc754 (ro)
metrics:
Container ID: containerd://29f87b012ebab2413aba1a9cb6209f1142aa7eda10e7a4e5f128c0105322417c
Image: docker.io/bitnami/redis-exporter:1.33.0-debian-10-r27
Image ID: docker.io/bitnami/redis-exporter@sha256:a828ccc45a0542cf6066bf7487d168acdabac829a79d6d3e1aa95ca19b1fcfa0
Port: 9121/TCP
Host Port: 0/TCP
Command:
/bin/bash
-c
if [[ -f '/secrets/redis-password' ]]; then
export REDIS_PASSWORD=$(cat /secrets/redis-password)
fi
redis_exporter
State: Running
Started: Mon, 07 Feb 2022 16:03:01 +0000
Last State: Terminated
Reason: Unknown
Exit Code: 255
Started: Fri, 04 Feb 2022 18:08:34 +0000
Finished: Mon, 07 Feb 2022 16:01:51 +0000
Ready: True
Restart Count: 1
Environment:
REDIS_ALIAS: redis
REDIS_USER: default
REDIS_PASSWORD: <set to the key 'redis-password' in secret 'redis'> Optional: false
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-sc754 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
redis-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: redis-data-redis-node-0
ReadOnly: false
start-scripts:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: redis-scripts
Optional: false
health:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: redis-health
Optional: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: redis-configuration
Optional: false
sentinel-tmp-conf:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
redis-tmp-conf:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
tmp:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-sc754:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
[root@server1 ~]#
Thank you for your feedback @ping2kpm, that may confirm my suspicion that the issue is caused by all pods starting simultaneously after Kubernetes is rebooted.
Since Redis is not a cloud-native application, the chart implements a method to choose a single master when using several replicas, but requires that a master is already up when the slave nodes initialize. We may need to make some changes to that method to prevent this scenario.
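For context, a simplified sketch of that election flow (not the chart's exact code; the environment variables and master-set name follow the snippet quoted earlier in this thread):

# Ask Sentinel for the current master; if nobody answers, only the first
# pod (ordinal 0) may bootstrap itself as master, everyone else waits.
master_addr=$(redis-cli -h "$REDIS_SERVICE" -p "$SENTINEL_SERVICE_PORT" \
    sentinel get-master-addr-by-name mymaster | head -n1)
if [ -z "$master_addr" ]; then
    if [ "$HOSTNAME" = "redis-node-0" ]; then
        echo "No master found: bootstrapping as master"
    else
        echo "No master found: waiting for the master to come up"
    fi
else
    echo "Master at $master_addr: starting as replica"
fi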
You're welcome! For the time being, I have created a script to auto-fix it after a server reboot; a cronjob is required (see the example after the script).
nodes=redis-node-0,redis-node-1,redis-node-2
for i in ${nodes//,/ }
do
    MASTER_STATUS=$(kubectl -n redis exec -it $i -c redis -- bash -c 'redis-cli --no-auth-warning --raw -h $HOSTNAME -p $REDIS_SERVICE_PORT_TCP_REDIS -a $REDIS_PASSWORD info|grep role|cut -c1-11|cut -c6-'|dos2unix)
    export MASTER_STATUS
    if [ "master" = "$MASTER_STATUS" ]; then
        echo "Found the $MASTER_STATUS & node name is $i, checking the replica count"
        REPLICAS_COUNT=$(kubectl -n redis exec -it $i -c redis -- bash -c 'redis-cli --no-auth-warning -p 6379 -a $REDIS_PASSWORD info|grep connected_slaves|cut -d: -f2|cut -c1-1'|dos2unix)
        export REPLICAS_COUNT
        if [ "$REPLICAS_COUNT" -lt 2 ]; then
            echo "The slave replica count is $REPLICAS_COUNT, proceeding to re-form the cluster"
            kubectl -n redis delete pods redis-node-0; sleep 60
        else
            echo "The slave replica count is $REPLICAS_COUNT = 2, REDIS CLUSTER WORKING AS EXPECTED, no action required, exiting"
            exit 0
        fi
    else
        echo "It is a slave node and node name is $i"
    fi
done
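For reference, a hypothetical crontab entry to run the check periodically (the script path and interval are assumptions):

# Run the split-brain check every 5 minutes and append its output to a log.
*/5 * * * * /usr/local/bin/redis-splitbrain-check.sh >> /var/log/redis-splitbrain-check.log 2>&1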
Thanks & Regards.
Since Redis is not a cloud-native application, the chart implements a method to choose a single master when using several replicas, but requires that a master is already up when the slave nodes initialize. We may need to make some changes to that method to prevent this scenario.
I'm not really sure what this means, as "cloud-native" is just marketing fluff, but the way we do this in our homegrown setup is to use a native k8s lock when initializing a new leader. The followers who lost the leader race then block until the leader is initialized. There are well-trodden k8s APIs for this.
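For what it's worth, a rough sketch of that lock idea, assuming an init container with kubectl and RBAC permission to create ConfigMaps (all names here are hypothetical, not our actual code):

# ConfigMap creation is atomic in Kubernetes, so exactly one pod wins the
# leader race; the losers block until the winner's pod reports Ready.
LOCK_NAME=redis-master-lock
NAMESPACE=redis
if kubectl -n "$NAMESPACE" create configmap "$LOCK_NAME" \
      --from-literal=holder="$HOSTNAME" 2>/dev/null; then
    echo "Lock acquired by $HOSTNAME: initializing as leader"
else
    holder=$(kubectl -n "$NAMESPACE" get configmap "$LOCK_NAME" \
        -o jsonpath='{.data.holder}')
    echo "Lock held by $holder: waiting for the leader to become Ready"
    kubectl -n "$NAMESPACE" wait --for=condition=Ready "pod/$holder" --timeout=10m
fi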
I would strongly advise people not to use this in production, or at all. After having issues with this chart previously, we began to re-evaluate it to see if things had improved. I have the utmost respect for the Bitnami team, but this chart has never been functional and really should not be published.
Hi @qeternity,
I'm sorry your experience with the Redis chart was not positive. Our team and the users who contribute to this chart, either by reporting issues or submitting PRs, try to continuously improve it.
Cloud-native may be used as marketing fluff most of the time, but in this case, I used it to refer to the issues we encounter because Redis is not designed to work in containerized environments.
To fix those issues, we have to create workarounds either in the chart or in the container logic: for example, manually updating the cluster and shard balance each time the cluster is restarted, among other things.
The impact of these design decisions may be greatly reduced when working in a VM cluster, where operations are performed manually by the cluster administrator, or where some scenarios, such as network changes or the simultaneous restart of several VMs, are very infrequent.
Please be aware that these issues may exist when using the chart, and we appreciate you reporting them when possible. For those who like to get their hands dirty and would like to contribute, we will be very happy to review their PRs.
The feedback and contributions help us make this chart more stable, and of course, all the feedback shared with the Redis community will help replace custom workarounds with built-in features.
Hi @migruiz4,
Thanks for the reply, and thanks for all of Bitnami's work in general (we use a few other charts that we are very happy with and grateful for, and certainly none that we feel entitled to). I have just pulled my hair out with this chart a handful of times, so if I seem exasperated, it's only because I would very much like to migrate our hand-rolled system to something like this.
Re: cloud native, totally fair and you're absolutely right - the Sentinel config is not conducive to service definitions and virtual ips. Part of our frustration in general with Redis is that HA and clustering have been developed as afterthoughts, and that becomes painfully evident in these circumstances. Memcache stands head and shoulders above in these regards.
I will definitely spend some more time putting this chart through its paces, and will hopefully have some fixes to upstream in the next month or so.
@qeternity we have the same problem as it seems and it really surprised our team that this is running so unstable. Did you find a better solution for this whole class of problems?
@h0jeZvgoxFepBQ2C we are pinned to an older version of the chart that we found to be stable, combined with an init container that takes a k8s lock to coordinate startups.
@qeternity could you share what the logic for the kubernetes lock is? We would like to evaluate it as it may make sense to add it to the chart, at least as an experimental feature.
Hi,
I created PR https://github.com/bitnami/charts/pull/9282, which adds experimental support for persisting the sentinel.conf file. Thanks to this, the issue could be mitigated. I did not enable it by default, as we would like to get initial feedback from the community.
All input is appreciated!
@javsalgar just had a look over the PR - I think this is a better approach than our init container lock. I will deploy it to our dev cluster for some testing.
And how did it work out?
Which chart: https://github.com/bitnami/charts/tree/master/bitnami/redis
Describe the bug
To Reproduce
I have used the above-mentioned chart and configured Redis + Sentinel successfully; however, whenever I reboot my server, I get different errors.
POD:
Scenario 1:
In the case below, the slaves are not connected after the server reboot.
Scenario 2:
In the case below, the slaves are not able to connect after the server reboot.
redis.redis.svc.cluster.local:26379: Connection refused
Could anyone help me?
Thanks & Regards, Keshaba Mahapatra.