bitnami / charts

Bitnami Helm Charts
https://bitnami.com

No route to host error after Redis master re-spawned #1544

Closed sstaw closed 4 years ago

sstaw commented 4 years ago

Hi there,

I am having some problems when using the Redis chart with Sentinel mode enabled. The startup of the chart works like a charm. However, the Redis slaves are not able to reconnect to the newly created Redis master after the Redis master crashed and re-spawned.

From the logs, I can see that the Redis slaves are still trying to connect to the Redis master using the IP address of the old pod instead of the new pod. I have looked through all the configs again and cannot see anything relevant in this area.

The README states the slaves would be able to reconnect to the master. Would you be able to explain how exactly that works? How would the Redis slave pods know the IP address of the new Redis master pod?

Cheers, Alex

alemorcuq commented 4 years ago

Hi, Alex.

On each slave we define the environment variable REDIS_MASTER_HOST, which tells the slave where the master is. In the slave StatefulSet manifest you can see that it is set to the master pod of the master StatefulSet:

https://github.com/bitnami/charts/blob/c647a50ab115fe8f644c628b6efc434d20dd0ed6/upstreamed/redis/templates/redis-slave-statefulset.yaml#L129

Because it is a StatefulSet, the network identifier is always the same, so when a new master pod is created, it will have the same network identifier.
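
For illustration, here is a simplified sketch of that env entry as it appears in the rendered slave StatefulSet (the release and namespace names follow the example below and are only assumptions; the linked template above is the source of truth):

# Simplified excerpt of the slave container env (illustrative values)
- name: REDIS_MASTER_HOST
  value: amorenoredis-master-0.amorenoredis-headless.amorenoredis.svc.cluster.local

Because the value is a DNS name rather than a pod IP, a re-created master-0 pod keeps the same address from the slaves' point of view.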

I've tried to reproduce your issue but I wasn't able to. This is what I did:

  1. Deploy the chart with cluster and sentinel enabled:
$ helm install stable/redis --name amorenoredis --namespace amorenoredis --set cluster.enabled=true,sentinel.enabled=true
  2. Delete the master pod so another one gets created:
$ kubectl delete pod/amorenoredis-master-0 -n amorenoredis

Looking at the logs of one of the slaves, you can see that the connection is lost and then recovered:

$ kubectl logs -n amorenoredis amorenoredis-slave-1 -c amorenoredis
 12:20:57.19 INFO  ==> ** Starting Redis **
1:C 30 Oct 2019 12:20:57.204 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 30 Oct 2019 12:20:57.204 # Redis version=5.0.5, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 30 Oct 2019 12:20:57.204 # Configuration loaded
1:S 30 Oct 2019 12:20:57.211 * Running mode=standalone, port=6379.
1:S 30 Oct 2019 12:20:57.221 # Server initialized
1:S 30 Oct 2019 12:20:57.221 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:S 30 Oct 2019 12:20:57.221 * Ready to accept connections
1:S 30 Oct 2019 12:20:57.222 * Connecting to MASTER amorenoredis-master-0.amorenoredis-headless.amorenoredis.svc.cluster.local:6379
1:S 30 Oct 2019 12:20:57.229 * MASTER <-> REPLICA sync started
1:S 30 Oct 2019 12:20:57.229 * Non blocking connect for SYNC fired the event.
1:S 30 Oct 2019 12:20:57.229 * Master replied to PING, replication can continue...
1:S 30 Oct 2019 12:20:57.235 * Partial resynchronization not possible (no cached master)
1:S 30 Oct 2019 12:20:57.236 * Full resync from master: 1c596d0fda9aab52a36d59ccba6fbcc11abb3846:4314
1:S 30 Oct 2019 12:20:57.407 * MASTER <-> REPLICA sync: receiving 176 bytes from master
1:S 30 Oct 2019 12:20:57.407 * MASTER <-> REPLICA sync: Flushing old data
1:S 30 Oct 2019 12:20:57.412 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 30 Oct 2019 12:20:57.412 * MASTER <-> REPLICA sync: Finished with success
1:S 30 Oct 2019 12:20:57.413 * Background append only file rewriting started by pid 16
1:S 30 Oct 2019 12:20:57.830 * AOF rewrite child asks to stop sending diffs.
16:C 30 Oct 2019 12:20:57.830 * Parent agreed to stop sending diffs. Finalizing AOF...
16:C 30 Oct 2019 12:20:57.831 * Concatenating 0.00 MB of AOF diff received from parent.
16:C 30 Oct 2019 12:20:57.831 * SYNC append only file rewrite performed
16:C 30 Oct 2019 12:20:57.831 * AOF rewrite: 0 MB of memory used by copy-on-write
1:S 30 Oct 2019 12:20:57.876 * Background AOF rewrite terminated with success
1:S 30 Oct 2019 12:20:57.876 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 30 Oct 2019 12:20:57.877 * Background AOF rewrite finished successfully
1:S 30 Oct 2019 12:21:18.019 # Connection with master lost.
1:S 30 Oct 2019 12:21:18.019 * Caching the disconnected master state.
1:S 30 Oct 2019 12:21:18.019 * REPLICAOF 10.30.1.248:6379 enabled (user request from 'id=4 addr=10.30.1.250:42659 fd=8 name=sentinel-69ead495-cmd age=20 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=151 qbuf-free=32617 obl=36 oll=0 omem=0 events=r cmd=exec')
1:S 30 Oct 2019 12:21:18.020 # CONFIG REWRITE executed with success.
1:S 30 Oct 2019 12:21:19.043 * Connecting to MASTER 10.30.1.248:6379
1:S 30 Oct 2019 12:21:19.045 * MASTER <-> REPLICA sync started
1:S 30 Oct 2019 12:21:19.045 * Non blocking connect for SYNC fired the event.
1:S 30 Oct 2019 12:21:19.047 * Master replied to PING, replication can continue...
1:S 30 Oct 2019 12:21:19.052 * Trying a partial resynchronization (request 1c596d0fda9aab52a36d59ccba6fbcc11abb3846:8738).
1:S 30 Oct 2019 12:21:19.053 * Successful partial resynchronization with master.
1:S 30 Oct 2019 12:21:19.053 * MASTER <-> REPLICA sync: Master accepted a Partial Resynchronization.
1:S 30 Oct 2019 12:23:28.089 # Connection with master lost.
1:S 30 Oct 2019 12:23:28.089 * Caching the disconnected master state.
1:S 30 Oct 2019 12:23:28.537 * Connecting to MASTER 10.30.1.248:6379
1:S 30 Oct 2019 12:23:28.537 * MASTER <-> REPLICA sync started
1:M 30 Oct 2019 12:24:28.504 # Setting secondary replication ID to 1c596d0fda9aab52a36d59ccba6fbcc11abb3846, valid up to offset: 34887. New replication ID is 74c371c84bfa9cc9d5b6d01892cd83e2281c1aa0
1:M 30 Oct 2019 12:24:28.505 * Discarding previously cached master state.
1:M 30 Oct 2019 12:24:28.505 * MASTER MODE enabled (user request from 'id=13 addr=10.30.3.68:58937 fd=11 name=sentinel-9ebeefd1-cmd age=191 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=140 qbuf-free=32628 obl=36 oll=0 omem=0 events=r cmd=exec')
1:M 30 Oct 2019 12:24:28.505 # CONFIG REWRITE executed with success.
1:M 30 Oct 2019 12:24:29.234 * Replica 10.30.3.68:6379 asks for synchronization
1:M 30 Oct 2019 12:24:29.234 * Partial resynchronization request from 10.30.3.68:6379 accepted. Sending 159 bytes of backlog starting from offset 34887.
1:M 30 Oct 2019 12:24:41.292 * Replica 10.30.6.78:6379 asks for synchronization
1:M 30 Oct 2019 12:24:41.292 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for 'be647759fa550655ac722fda2acc5e7630ec185f', my replication IDs are '74c371c84bfa9cc9d5b6d01892cd83e2281c1aa0' and '1c596d0fda9aab52a36d59ccba6fbcc11abb3846')
1:M 30 Oct 2019 12:24:41.292 * Starting BGSAVE for SYNC with target: disk
1:M 30 Oct 2019 12:24:41.293 * Background saving started by pid 352
352:C 30 Oct 2019 12:24:41.301 * DB saved on disk
352:C 30 Oct 2019 12:24:41.302 * RDB: 0 MB of memory used by copy-on-write
1:M 30 Oct 2019 12:24:41.393 * Background saving terminated with success
1:M 30 Oct 2019 12:24:41.393 * Synchronization with replica 10.30.6.78:6379 succeeded

Regards, Alejandro

stale[bot] commented 4 years ago

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

stale[bot] commented 4 years ago

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

dntosas commented 4 years ago

We got the same behaviour and are investigating ^^ @sstaw feel free to re-open the issue.

Failover works as expected, but there seems to be some race condition where Sentinel caches the old endpoint IP and doesn't communicate with the other Sentinels to get a refreshed IP mapping.

dntosas commented 4 years ago

@alemorcuq the problem is that when a master fails, the newly elected master can be a pod from the slave StatefulSet.

alemorcuq commented 4 years ago

Hi, @dntosas.

Could you share more details about your investigation and the issue itself? Unfortunately I'm not a Redis expert so that would help me understand the problem.

Regards, Alejandro

dntosas commented 4 years ago

Hello @alemorcuq,

I think this issue is related to #19059, which was resolved recently.

alemorcuq commented 4 years ago

Hello @alemorcuq,

I think this issue is related to #19059, which was resolved recently.

Does enabling the sentinel.staticID parameter fix the issue for you?
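
In case it helps, this is roughly how that parameter would be enabled in a values file (a minimal sketch; it can equally be passed on the command line with --set sentinel.staticID=true):

# Minimal values sketch, using the parameters named in this thread
sentinel:
  enabled: true
  staticID: true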

dntosas commented 4 years ago

@alemorcuq it seems so; 5 days now and still can't reproduce this error ^

alemorcuq commented 4 years ago

Thank you very much for letting us know, @dntosas !

stale[bot] commented 4 years ago

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

alemorcuq commented 4 years ago

Closing as solved.

elucidsoft commented 4 years ago

This is still an issue; I just had this happen. It's very rare, but when it happens it's catastrophic, and it doesn't recover on its own.

dani8art commented 4 years ago

Hi @elucidsoft, could you share with us what you did so we can try to reproduce the issue?

weisdd commented 3 years ago

@dani8art I can easily reproduce it in a microk8s installation: when I restart my laptop, redis-master-0 gets assigned another IP address, and the slaves keep on trying to connect to the old one. Setting a staticID (https://github.com/helm/charts/pull/19059) did not help. The only way to recover is to delete the slave pods.

Chart 10.7.12, Redis 6.0.6-debian-10-r9.

(microk8s:redis)➜  ~ k logs redis-slave-0 -c redis | tail -n 10
1:S 10 Sep 2020 13:31:32.223 * MASTER <-> REPLICA sync started
1:S 10 Sep 2020 13:31:34.942 # Error condition on socket for SYNC: Operation now in progress
1:S 10 Sep 2020 13:31:35.232 * Connecting to MASTER 10.1.77.110:6379
1:S 10 Sep 2020 13:31:35.232 * MASTER <-> REPLICA sync started
1:S 10 Sep 2020 13:31:38.014 # Error condition on socket for SYNC: Resource temporarily unavailable
1:S 10 Sep 2020 13:31:38.240 * Connecting to MASTER 10.1.77.110:6379
1:S 10 Sep 2020 13:31:38.240 * MASTER <-> REPLICA sync started
1:S 10 Sep 2020 13:31:41.086 # Error condition on socket for SYNC: Resource temporarily unavailable
1:S 10 Sep 2020 13:31:41.248 * Connecting to MASTER 10.1.77.110:6379
1:S 10 Sep 2020 13:31:41.248 * MASTER <-> REPLICA sync started
(microk8s:redis)➜  ~ k logs redis-slave-1 -c redis | tail -n 10
1:S 10 Sep 2020 13:31:34.213 * MASTER <-> REPLICA sync started
1:S 10 Sep 2020 13:31:36.894 # Error condition on socket for SYNC: Resource temporarily unavailable
1:S 10 Sep 2020 13:31:37.220 * Connecting to MASTER 10.1.77.110:6379
1:S 10 Sep 2020 13:31:37.220 * MASTER <-> REPLICA sync started
1:S 10 Sep 2020 13:31:39.966 # Error condition on socket for SYNC: Operation now in progress
1:S 10 Sep 2020 13:31:40.229 * Connecting to MASTER 10.1.77.110:6379
1:S 10 Sep 2020 13:31:40.229 * MASTER <-> REPLICA sync started
1:S 10 Sep 2020 13:31:43.038 # Error condition on socket for SYNC: Resource temporarily unavailable
1:S 10 Sep 2020 13:31:43.238 * Connecting to MASTER 10.1.77.110:6379
1:S 10 Sep 2020 13:31:43.238 * MASTER <-> REPLICA sync started
(microk8s:redis)➜  ~ k get po -o wide
NAME             READY   STATUS    RESTARTS   AGE     IP            NODE       NOMINATED NODE   READINESS GATES
redis-master-0   3/3     Running   0          9m37s   10.1.77.139   weis-rmb   <none>           <none>
redis-slave-0    3/3     Running   3          20m     10.1.77.118   weis-rmb   <none>           <none>
redis-slave-1    3/3     Running   5          20m     10.1.77.132   weis-rmb   <none>           <none>

Before slave-0 restart (kubectl delete pod):

I have no name!@redis-master-0:/opt/bitnami/redis-sentinel/etc$ cat sentinel.conf 
dir "/tmp"
bind 0.0.0.0
port 26379
sentinel monitor mymaster redis-master-0.redis-headless.redis.svc.cluster.local 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 18000
sentinel parallel-syncs mymaster 1
sentinel auth-pass mymaster sRxCaaCDx379Rx7n
sentinel myid 23b3777dfee5a0ffc1ffbf21eb6c2fb606a0f99d
sentinel known-sentinel mymaster 10.1.77.110 26379 23b3777dfee5a0ffc1ffbf21eb6c2fb606a0f99d
sentinel known-sentinel mymaster 10.1.77.112 26379 032076aa2dd653c733d24b41a26df5fb59af4f9e

I have no name!@redis-slave-0:/opt/bitnami/redis-sentinel/etc$ cat sentinel.conf 
dir "/tmp"
bind 0.0.0.0
port 26379
sentinel myid 032076aa2dd653c733d24b41a26df5fb59af4f9e
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 10.1.77.110 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 18000
sentinel auth-pass mymaster sRxCaaCDx379Rx7n
# Generated by CONFIG REWRITE
protected-mode no
user default on nopass ~* +@all
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
sentinel known-replica mymaster 10.1.77.112 6379
sentinel known-replica mymaster 10.1.77.111 6379
sentinel known-sentinel mymaster 10.1.77.110 26379 23b3777dfee5a0ffc1ffbf21eb6c2fb606a0f99d
sentinel known-sentinel mymaster 10.1.77.111 26379 ee6c12bd3a626c188f4239d2f9af1cdca9f2c315
sentinel current-epoch 0

After restart:

I have no name!@redis-master-0:/opt/bitnami/redis-sentinel/etc$ cat sentinel.conf 
dir "/tmp"
bind 0.0.0.0
port 26379
sentinel myid 23b3777dfee5a0ffc1ffbf21eb6c2fb606a0f99d
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 10.1.77.140 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 18000
sentinel auth-pass mymaster sRxCaaCDx379Rx7n
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
# Generated by CONFIG REWRITE
protected-mode no
user default on nopass ~* +@all
sentinel known-replica mymaster 10.1.77.141 6379
sentinel known-sentinel mymaster 10.1.77.141 26379 032076aa2dd653c733d24b41a26df5fb59af4f9e
sentinel known-sentinel mymaster 10.1.77.110 26379 23b3777dfee5a0ffc1ffbf21eb6c2fb606a0f99d
sentinel current-epoch 0

I have no name!@redis-slave-0:/$ cat /opt/bitnami/redis-sentinel/etc/sentinel.conf 
dir "/tmp"
bind 0.0.0.0
port 26379
sentinel myid 032076aa2dd653c733d24b41a26df5fb59af4f9e
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 10.1.77.140 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 18000
sentinel auth-pass mymaster sRxCaaCDx379Rx7n
# Generated by CONFIG REWRITE
protected-mode no
user default on nopass ~* +@all
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
sentinel known-replica mymaster 10.1.77.141 6379
sentinel known-sentinel mymaster 10.1.77.140 26379 23b3777dfee5a0ffc1ffbf21eb6c2fb606a0f99d
sentinel current-epoch 0

(microk8s:redis)➜  ~ k get po -o wide
NAME             READY   STATUS    RESTARTS   AGE    IP            NODE       NOMINATED NODE   READINESS GATES
redis-master-0   3/3     Running   0          19m    10.1.77.140   weis-rmb   <none>           <none>
redis-slave-0    3/3     Running   0          8m7s   10.1.77.141   weis-rmb   <none>           <none>
redis-slave-1    3/3     Running   5          44m    10.1.77.132   weis-rmb   <none>           <none>

dani8art commented 3 years ago

When I restart my laptop, redis-master-0 gets assigned another IP address, and slaves keep on trying to connect to the old one.

I've tried to reproduce it but I couldn't; even if my master pod gets assigned another IP, it still works:

$ kubectl get pods -o wide
NAME             READY   STATUS    RESTARTS   AGE     IP            NODE                                            NOMINATED NODE   READINESS GATES
redis-master-0   2/2     Running   0          98s     10.48.2.118   darteaga-tests-default-pool-c3dd3f10-xdfd   <none>           <none>
redis-slave-0    2/2     Running   2          7m9s    10.48.0.236   darteaga-tests-default-pool-c3dd3f10-pm16   <none>           <none>
redis-slave-1    2/2     Running   0          6m23s   10.48.3.85    darteaga-tests-default-pool-c3dd3f10-xf66   <none>           <none>

$ kubectl get pods -o wide
NAME             READY   STATUS    RESTARTS   AGE     IP            NODE                                            NOMINATED NODE   READINESS GATES
redis-master-0   2/2     Running   0          2m4s    10.48.2.119   darteaga-tests-default-pool-c3dd3f10-xdfd   <none>           <none>
redis-slave-0    2/2     Running   2          9m22s   10.48.0.236   darteaga-tests-default-pool-c3dd3f10-pm16   <none>           <none>
redis-slave-1    2/2     Running   0          8m36s   10.48.3.85    darteaga-tests-default-pool-c3dd3f10-xf66   <none>           <none>

weisdd commented 3 years ago

@dani8art Your environment is different :) If I just delete the master pod, the slaves will pick up the new IP just fine. But if I restart my laptop (meaning the whole cluster goes down, as I use single-node microk8s), the slaves will never recover unless I delete them. It happens every single time. I think it should be reproducible in a multi-node environment as well, just with different steps. These are the values I'm using:

image:
  tag: 6.0.6-debian-10-r9

sentinel:
  enabled: true
  usePassword: false
  staticID: true
  image:
    tag: 6.0.6-debian-10-r11

serviceAccount:
  create: true

rbac:
  create: true

dani8art commented 3 years ago

Hi @weisdd, I've also tried restarting all of them together, and it worked as well:

$ kubectl delete pod redis-master-0 redis-slave-0 redis-slave-1
pod "redis-master-0" deleted
pod "redis-slave-0" deleted
pod "redis-slave-1" deleted
$ kubectl get pods -w
NAME             READY   STATUS              RESTARTS   AGE
redis-master-0   0/2     ContainerCreating   0          3s
redis-slave-0    0/2     ContainerCreating   0          3s
redis-slave-0    0/2     Error               0          11s
redis-slave-0    0/2     Error               1          12s
redis-master-0   0/2     Running             0          14s
redis-slave-0    0/2     CrashLoopBackOff    1          14s
redis-master-0   1/2     Running             0          19s
redis-master-0   2/2     Running             0          20s
redis-slave-0    1/2     CrashLoopBackOff    1          24s
redis-slave-0    1/2     Running             2          29s
redis-slave-0    2/2     Running             2          37s
redis-slave-1    0/2     Pending             0          0s
redis-slave-1    0/2     Pending             0          0s
redis-slave-1    0/2     ContainerCreating   0          0s
redis-slave-1    0/2     Running             0          11s
redis-slave-1    1/2     Running             0          17s
redis-slave-1    2/2     Running             0          20s
$ kubectl get pods
NAME             READY   STATUS    RESTARTS   AGE
redis-master-0   2/2     Running   0          67s
redis-slave-0    2/2     Running   2          67s
redis-slave-1    2/2     Running   0          30s

We recently released a new version of this chart; please try it with the latest one:

$ helm repo update

weisdd commented 3 years ago

Hi @dani8art,

I've just tried the latest chart and the behaviour is the same: slaves keep on trying to connect to the old master's IP. This time I did:

microk8s.stop
microk8s.start

(screenshot omitted)

weisdd commented 3 years ago

@dani8art Yesterday, we decided to use Redis with Gravitee APIM3. The latter doesn't support sentinel / cluster, so I was looking for a universal solution with HAProxy + Sentinel out of the box, and came across another chart: dandydev/redis-ha. What's interesting about that implementation is that the pods use ClusterIP services for synchronization (screenshot omitted). Since services retain their IP addresses until you recreate them, there's no such problem as a stale master IP: whatever happens to the pods and to the cluster, they recover every single time. I think it's a much better solution than static IDs. :)
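
To make the idea concrete, a per-pod ClusterIP Service can be sketched like this: it selects a single StatefulSet pod through the statefulset.kubernetes.io/pod-name label that Kubernetes sets automatically (all names here are illustrative and not taken from either chart):

# Hypothetical per-pod Service: its ClusterIP stays fixed even when the pod is
# rescheduled and receives a new pod IP, so replicas and sentinels can keep using it.
apiVersion: v1
kind: Service
metadata:
  name: redis-master-announce        # illustrative name
spec:
  type: ClusterIP
  selector:
    statefulset.kubernetes.io/pod-name: redis-master-0   # label added by the StatefulSet controller
  ports:
    - name: redis
      port: 6379
      targetPort: 6379
    - name: sentinel
      port: 26379
      targetPort: 26379

Since the Service's ClusterIP survives pod re-creation, anything pointing at it (or at its DNS name) never sees a stale pod IP.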

dani8art commented 3 years ago

Hi @weisdd, thanks for your suggestion.

BTW, we are making big changes in our Redis chart; please take a look at the PR, it might be interesting and could cover your requirements.

https://github.com/bitnami/charts/pull/3658

elucidsoft commented 3 years ago

I just had this happen again in production using the latest everything. I don't understand what's causing it, but it seems related to cluster restarts.

dani8art commented 3 years ago

Hi @elucidsoft, could you share info about your deployment, please?

elucidsoft commented 3 years ago

It's sentinel based, with staticID enabled. The rest is the default configuration. It's also not repeatable: I have attempted to force it to happen by deleting the pods manually. I tried this for hours and it always recovered, every single time. I tried restarting the nodes and it still recovered. I have gone 4 months without this happening, but I have also just had it happen twice in two weeks.

What's happening is EXACTLY what the original poster of this issue explains. I believe what's causing it is that I'm on GKE and Google automatically upgrades my nodes and cluster; that's when it happens. The nodes get rebooted in a rolling style: node 1, wait for it to come back fully, node 2, wait for it to come back fully, node 3. I have my master set to be on one node and both my slaves on their own nodes using taint policies, so each master and slave is always on a separate node.
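
For readers trying to reproduce a similar layout: the setup above uses taints, but a generic podAntiAffinity stanza is another way to get the same one-pod-per-node spread (a rough sketch; the exact values path and pod labels depend on the chart version):

# Illustrative pod-spec-level anti-affinity; label and values path are assumptions.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: redis              # assumes the chart labels its pods with app: redis
        topologyKey: kubernetes.io/hostname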

dani8art commented 3 years ago

Hi @elucidsoft, thanks for the info. It seems to be very related to https://github.com/bitnami/charts/issues/5181; we will check if we can reproduce it in our local environments and figure out the issue.

elucidsoft commented 3 years ago

I think that issue is what I'm experiencing, yes...

dani8art commented 3 years ago

Hi,

We have opened an internal task to investigate this error a bit more; unfortunately, we cannot give you an ETA.

BTW, any action or research results are welcome if you are performing more tests on this.

Thanks for the feedback!

rafariossaa commented 3 years ago

Hi, a new version of the chart was released. Could you give it a try and check if it fixes the issue for you?