Pride1st1 opened 5 months ago
+1
I've hit this too.
I found that after we added a descheduler (https://github.com/kubernetes-sigs/descheduler) to the stack to balance nodes automatically, this kind of issue would take the Redis service down frequently.
Can the master allocation be done with kubernetes lease locks? https://kubernetes.io/docs/concepts/architecture/leases/
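For reference, a coordination Lease is just a small object whose holderIdentity the elected pod keeps renewing until it dies; candidates take over once the lease expires unrenewed. A minimal sketch of the object (the lease name and holder are hypothetical, not anything this chart creates):

```yaml
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: redis-ha-master          # hypothetical lease name for master election
  namespace: default
spec:
  holderIdentity: redis-ha-server-0   # the pod currently holding mastership
  leaseDurationSeconds: 15            # others may claim it once this expires unrenewed
```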
@tschirmer I'm trying to work out why this would happen unless the podManagementPolicy of the STS is set to Parallel. Is this happening in either of your cases? @tschirmer ??
Because in theory, on first rollout, the first pod should start up and become master well before -1/-2 start.
@DandyDeveloper Hi, I'm having a problem when my network becomes a bit unstable (for example, pods are unable to reach each other for a second) and my Redis pods can't see each other.
> @tschirmer I'm trying to work out why this would happen unless the podManagementPolicy of the STS is set to Parallel? Is this happening in either of your cases? @tschirmer ??
> Because in theory, on first rollout, the first pod should start up and become master, way before -1/-2 start.
Haven't set it to Parallel. I suspect it's something like: when a pod is evicted, it isn't completing trigger-failover-if-master.sh. We're running it with sentinel, which might add some complexity here. I haven't debugged it yet.
So far we're getting a load of issues with the liveness probe not picking up the SENTINELAUTH env from the secret, even though it's clearly defined in the spec; a restart of the pod fixes it. It's happening very frequently, though, so I'm wondering if a grace period needs to be defined on startup and shutdown to prevent both of these things from happening.
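For what it's worth, that symptom usually means the env var never made it into the container spec the probe runs in. A hedged sketch of what the sentinel container would need for the probe script to see SENTINELAUTH (the secret name and key here are assumptions, not the chart's actual values):

```yaml
containers:
  - name: sentinel
    env:
      - name: SENTINELAUTH
        valueFrom:
          secretKeyRef:
            name: redis-auth          # assumption: secret holding the sentinel password
            key: sentinel-password    # assumption: key name inside that secret
```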
I think having separate StatefulSets for the Redis servers and the sentinels would make this chart more stable and manageable: create two StatefulSets and point the sentinel monitor config at an external host.
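The sentinel side of that split could then monitor the Redis StatefulSet through its service rather than a sibling container. A minimal sentinel.conf sketch; the hostname, quorum, and timeouts are assumptions for illustration:

```
sentinel resolve-hostnames yes   # needed (Redis 6.2+) when monitoring by hostname
sentinel monitor mymaster redis-server.default.svc.cluster.local 6379 2
sentinel down-after-milliseconds mymaster 10000
sentinel failover-timeout mymaster 180000
```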
I like the idea of separate StatefulSets; I've been thinking of doing that and making a PR.
I suspect this is from preStop hooks not firing and completing successfully. trigger-failover-if-master.sh occasionally doesn't run as expected. When we had the descheduler running, it was ~2 min between turning each pod off and on, and every now and again that would fail. The rate of failure is low, so it's unlikely to occur unless you're hammering it (we haven't had an issue with the cluster since we turned off the descheduler).
I wanted to make a PR too, but there are a lot of configs that this change would have to propagate through.
I found that there were a couple of things wrong with my setup.
The permissions were the killer, because nothing was failing over on shutdown.
I'm halfway through writing a leader elector in golang for this based on k8s leases. Got it claiming the lease already. I'm not sure it's totally necessary once these other issues are solved, though.
Specifically, in the StatefulSet, the volume definitions changed from:
volumes:
  - configMap:
      defaultMode: 420   # THIS ONE meant the preStop hooks didn't have the permissions to run; changed it to 430
      name: redis-session-configmap
    name: config
  - hostPath:
      path: /sys
      type: ''
    name: host-sys
  - configMap:
      defaultMode: 493
      name: redis-session-health-configmap
    name: health
to:
volumes:
  - configMap:
      defaultMode: 430   # changed from 420 so the preStop hooks have permission to run
      name: redis-session-configmap
    name: config
  - hostPath:
      path: /sys
      type: ''
    name: host-sys
  - configMap:
      defaultMode: 493
      name: redis-session-health-configmap
    name: health
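One gotcha worth noting here: defaultMode in a volume spec is written in decimal, not octal, so 420 is really 0644 (no execute bit) and 493 is 0755. A quick shell check of the conversions (shell arithmetic treats a leading 0 as octal, so printf converts both ways):

```shell
# Kubernetes defaultMode values are decimal renderings of octal file modes.
printf '%d\n' 0644   # -> 420  (the mode that left the hook scripts non-executable)
printf '%d\n' 0755   # -> 493  (the mode used for the health configmap)
printf '%o\n' 430    # -> 656  (what the 430 above means in octal)
```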
Also found that the preStop hook /readonly-config/..data/trigger-failover-if-master.sh requires SENTINELAUTH, but it's not defined in the env for the redis container.
echo "[K8S PreStop Hook] Start Failover."
get_redis_role() {
  is_master=$(
    redis-cli \
      -a "${AUTH}" --no-auth-warning \
      -h localhost \
      -p 6379 \
      info | grep -c 'role:master' || true
  )
}
get_redis_role
echo "[K8S PreStop Hook] Got redis role."
if [[ "$is_master" -eq 1 ]]; then
  echo "[K8S PreStop Hook] This node is currently master, we trigger a failover."
  response=$(
    redis-cli \
      -a "${SENTINELAUTH}" --no-auth-warning \
      -h 127.0.0.1 \
      -p 26379 \
      SENTINEL failover mymaster
  )
  if [[ "$response" != "OK" ]]; then
    echo "[K8S PreStop Hook] Failover failed"
    echo "$response"
    exit 1
  fi
  timeout=30
  while [[ "$is_master" -eq 1 && $timeout -gt 0 ]]; do
    sleep 1
    get_redis_role
    timeout=$((timeout - 1))
  done
  echo "[K8S PreStop Hook] Failover successful"
else
  echo "[K8S PreStop Hook] This node is currently replica, no failover needed."
fi
I'd modified the above so I could get some debug data, along with this in the StatefulSet:
lifecycle:
  preStop:
    exec:
      command:
        - /bin/sh
        - '-c'
        - >-
          echo "running preStop" >> /proc/1/fd/1 &&
          /readonly-config/trigger-failover-if-master.sh | tee >> /proc/1/fd/1 &&
          echo "finished preStop" >> /proc/1/fd/1
The >> /proc/1/fd/1 forces this output into the container log in k8s.
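That redirection works because /proc/<pid>/fd/1 is a symlink to that process's stdout, and PID 1 in a container is the main process whose stdout Kubernetes captures. A quick local illustration of the same trick, using the shell's own PID instead of 1:

```shell
# Writing to /proc/$$/fd/1 is the same as writing to this shell's stdout,
# which is exactly how `>> /proc/1/fd/1` lands hook output in the container log.
sh -c 'echo "hello via procfs" >> /proc/$$/fd/1'
```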
Found that running preStops would consistently fail.
running preStop
[K8S PreStop Hook] Start Failover.
[K8S PreStop Hook] Got redis role.
[K8S PreStop Hook] This node is currently master, we trigger a failover.
[K8S PreStop Hook] Failover failed
finished preStop
Found that the Sentinel container had shut down before the command could be executed on localhost, so it kept getting a failed failover. Changed the sentinel preStop to add a 10-second delay to keep it alive while this happened, and it seems to work every time now.
lifecycle:
  preStop:
    exec:
      command:
        - /bin/sh
        - '-c'
        - >-
          sleep 10
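If anyone copies this, it's worth making sure the pod's termination grace period still covers the extra sleep plus the failover loop, or the kubelet will kill the containers mid-hook anyway. A hedged sketch (the 40s figure is my assumption, not a chart default):

```yaml
spec:
  terminationGracePeriodSeconds: 40   # assumption: headroom for the 10s sleep plus the ~30s failover wait
  containers:
    - name: sentinel
      lifecycle:
        preStop:
          exec:
            command: ['/bin/sh', '-c', 'sleep 10']
```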
While this "might" work, it "may not" be consistent; I suggest taking a look at my solution instead, here: https://github.com/DandyDeveloper/charts/issues/207#issuecomment-1827134022
Describe the bug
I deployed the chart with default values. During operation we hit a condition where redis-0 and redis-2 are replicas of redis-1, and redis-1 is a replica of redis-0. The split-brain-fix container wasn't able to fix the problem.
172.20.75.109 - redis-0
172.20.181.236 - redis-1
172.20.198.17 - redis-2
redis-0:
redis-1 (sentinel tries to restart it):
sentinel-1 (leader)
split-brain-fix-1
split-brain-fix-0
To Reproduce
I tried node/pod deletion and redis-cli replicaof, with no success reproducing this bug.
Expected behavior
The split-brain-fix container should fix even this rare case.
Additional context
The script's logic was broken by sentinel's inability to fail over. Maybe the script should have an additional condition to check the role of the potential default master. I would really appreciate any help with this. Please let me know if you need any additional logs/checks.
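On the "additional condition" idea: since the broken state is a cycle in the replicaof graph (redis-1 -> redis-0 -> redis-1), one hedged sketch is to walk each node's reported master and flag a cycle before trusting any node as the default master. This is illustrative only; detect_loop and its input format ("node master" pairs, as you might scrape from redis-cli info replication) are my own invention, not part of the chart:

```shell
# Reads "node master" pairs on stdin and follows master pointers starting
# from the first node; prints LOOP if the walk returns to the start, else OK.
detect_loop() {
  awk '{ m[$1] = $2; if (start == "") start = $1 }
       END {
         n = start; steps = 0
         while ((n in m) && steps < 10) {
           n = m[n]; steps++
           if (n == start) { print "LOOP"; exit }
         }
         print "OK"
       }'
}

# The topology from this issue: redis-1 and redis-0 replicate each other.
printf 'redis-1 redis-0\nredis-0 redis-1\nredis-2 redis-1\n' | detect_loop   # -> LOOP
```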