Open shenkonghui opened 3 years ago
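I tested the failover feature and it works: the cluster recovers on its own. However, some data appears to have been lost. (Redis does not guarantee strong consistency, so writes can be lost if replication is lagging when a failover happens.) On further testing, it turned out that replication itself was broken.
After restarting the pod, it went back to normal. The log is as follows: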
1:M 24 Aug 2021 10:46:48.838 * DB loaded from disk: 0.000 seconds
1:M 24 Aug 2021 10:46:48.838 * Before turning into a replica, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
1:M 24 Aug 2021 10:46:48.838 * Ready to accept connections
1:S 24 Aug 2021 10:46:48.839 * Discarding previously cached master state.
1:S 24 Aug 2021 10:46:48.839 * Before turning into a replica, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
1:S 24 Aug 2021 10:46:48.839 # Cluster state changed: ok
1:S 24 Aug 2021 10:46:49.843 * Connecting to MASTER 10.244.102.165:6379
1:S 24 Aug 2021 10:46:49.844 * MASTER <-> REPLICA sync started
1:S 24 Aug 2021 10:46:49.844 * Non blocking connect for SYNC fired the event.
1:S 24 Aug 2021 10:46:49.845 * Master replied to PING, replication can continue...
1:S 24 Aug 2021 10:46:49.846 * Trying a partial resynchronization (request cf007b56204d7eb04bd5a7a00b27c395b25f5235:267).
1:S 24 Aug 2021 10:46:49.847 * Full resync from master: 68f72905a9d0105cb9fa1edf001c97e9bce64bcb:0
1:S 24 Aug 2021 10:46:49.847 * Discarding previously cached master state.
1:S 24 Aug 2021 10:46:49.917 * MASTER <-> REPLICA sync: receiving 175 bytes from master
1:S 24 Aug 2021 10:46:49.917 * MASTER <-> REPLICA sync: Flushing old data
1:S 24 Aug 2021 10:46:49.917 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 24 Aug 2021 10:46:49.917 * MASTER <-> REPLICA sync: Finished with success
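The cause is the same as #80: the master and the slave restarted at the same time. Running the meet command brings the cluster nodes state back to normal, but on some nodes info replication still reports an abnormal state.
On those nodes, info replication shows the replica pointing at the wrong master, with the link status down.
Yet cluster nodes shows every node as healthy.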
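
In case it helps anyone detect this state automatically, here is a rough sketch of a check that compares the two outputs. It only parses the text returned by `INFO replication` and `CLUSTER NODES` (using the standard `role`, `master_host`, `master_port` and `master_link_status` fields and the `myself` flag); the function names are made up for illustration and this is not what the operator does.

```python
def parse_info_replication(text):
    """Parse the key:value lines of `INFO replication` into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# Replication" section header.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def expected_master(cluster_nodes_text):
    """Return (host, port) of the master that `CLUSTER NODES` assigns to the
    node flagged `myself`, or None if that node is a master or is missing."""
    addresses = {}
    my_master_id = None
    for line in cluster_nodes_text.splitlines():
        parts = line.split()
        if len(parts) < 8:
            continue
        node_id, addr, flags, master_id = parts[0], parts[1], parts[2], parts[3]
        # addr looks like ip:port@cport; keep only ip:port.
        host, _, port = addr.split("@", 1)[0].rpartition(":")
        addresses[node_id] = (host, int(port))
        if "myself" in flags.split(","):
            my_master_id = master_id
    if my_master_id is None or my_master_id == "-":
        return None
    return addresses.get(my_master_id)


def check_replica(info_text, cluster_nodes_text):
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    info = parse_info_replication(info_text)
    if info.get("role") != "slave":
        return problems  # nothing to check on a master
    if info.get("master_link_status") != "up":
        problems.append("master_link_status is %s" % info.get("master_link_status"))
    actual = (info.get("master_host"), int(info.get("master_port", "0")))
    wanted = expected_master(cluster_nodes_text)
    if wanted is not None and actual != wanted:
        problems.append("replicating from %s:%d but cluster nodes says %s:%d"
                        % (actual + wanted))
    return problems
```

A non-empty result from `check_replica` on a node would correspond to the situation described above, where cluster nodes looks fine but the replica is still attached to a stale master address.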