redis / redis

Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLogs, Bitmaps.
http://redis.io

[BUG] sentinel report -failover-abort-not-elected with enough votes received #12865

Open Funkydream opened 11 months ago

Funkydream commented 11 months ago

Describe the bug

I've deployed 100 Redis groups running version 6.0.9 across 2 servers (1 master + 1 slave each) with 5 sentinels. When all the masters suddenly lost network connectivity (they were deployed in the same DC), the sentinel group began trying to fail over. I configured the following parameters for the sentinels:

failover-timeout: 30000
down-after-milliseconds: 30000
quorum: 3
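For reference, these settings correspond to per-master `sentinel.conf` directives; the master name and address below are taken from the logs in this report (`cluster_282` at `10.15.145.1:20564`), so this is just an illustrative fragment:

```
sentinel monitor cluster_282 10.15.145.1 20564 3
sentinel down-after-milliseconds cluster_282 30000
sentinel failover-timeout cluster_282 30000
```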

When I noticed that the recovery took longer than expected, I checked the Sentinel logs and found that some sentinels received a majority of votes but still reported `-failover-abort-not-elected`. Here is one sample.

log for 7c99908273ed894d926d86bf4fe998378e1a288d

```
29026:X 04 Dec 2023 13:21:25.883 # +odown master cluster_282 10.15.145.1 20564 #quorum 4/3
29026:X 04 Dec 2023 13:21:25.883 # +new-epoch 45
29026:X 04 Dec 2023 13:21:25.883 # +try-failover master cluster_282 10.15.145.1 20564
29026:X 04 Dec 2023 13:21:25.891 # +vote-for-leader 7c99908273ed894d926d86bf4fe998378e1a288d 45
29026:X 04 Dec 2023 13:21:25.990 # e83af7e16cb690c95d62ede6abe6515ecb5113de voted for 7c99908273ed894d926d86bf4fe998378e1a288d 45
29026:X 04 Dec 2023 13:21:26.265 # fdf9c299ac37b84d15c7be55d4d983c839905002 voted for 7c99908273ed894d926d86bf4fe998378e1a288d 45
29026:X 04 Dec 2023 13:21:37.594 # -failover-abort-not-elected master cluster_282 10.15.145.1 20564
29026:X 04 Dec 2023 13:21:37.594 # Next failover delay: I will not start a failover before Mon Dec 4 13:27:27 2023
```

To reproduce

I deliberately disconnected the master DC's network several times, and the problem reproduced each time.

Expected behavior

I can accept 2-3 failover attempts before a successful recovery, but a sentinel receiving enough votes and still declaring the election failed confuses me. The failover process should start once a sentinel has a majority.

Additional information

I suspect this is caused by the epoch growing rapidly over a short period of time, and I'm looking for evidence in the code.

moticless commented 10 months ago

Are you using containers? If so, maybe this fix will resolve your issue.

Funkydream commented 10 months ago

> Are you using containers? If so, maybe this fix will resolve your issue.

No, all Redis instances and sentinels run on virtual machines. I also reproduced this problem with redis-sentinel version 6.2.14.

Funkydream commented 10 months ago

> Are you using containers? If so, maybe this fix will resolve your issue.

I have a few ideas, please help me confirm them:

  1. A sentinel votes only once per epoch. According to `sentinelVoteLeader()`, if the voting results below are produced within the same epoch, no sentinel reaches a majority. Do all sentinels then have to wait for a newer epoch, or for the next round of voting after the election timeout (i.e. 2 * failover-timeout), in the following situations?

```
A voted for B 45
B voted for C 45
C voted for D 45
D voted for E 45
E voted for A 45
```

or

```
A voted for A 45
B voted for B 45
C voted for C 45
D voted for D 45
E voted for E 45
```

```c
char *sentinelVoteLeader(sentinelRedisInstance *master, uint64_t req_epoch, char *req_runid, uint64_t *leader_epoch) {
    if (req_epoch > sentinel.current_epoch) {
        sentinel.current_epoch = req_epoch;
        sentinelFlushConfig();
        sentinelEvent(LL_WARNING,"+new-epoch",master,"%llu",
            (unsigned long long) sentinel.current_epoch);
    }

    if (master->leader_epoch < req_epoch && sentinel.current_epoch <= req_epoch)
    {
        sdsfree(master->leader);
        master->leader = sdsnew(req_runid);
        master->leader_epoch = sentinel.current_epoch;
        sentinelFlushConfig();
        sentinelEvent(LL_WARNING,"+vote-for-leader",master,"%s %llu",
            master->leader, (unsigned long long) master->leader_epoch);
        /* If we did not voted for ourselves, set the master failover start
         * time to now, in order to force a delay before we can start a
         * failover for the same master. */
        if (strcasecmp(master->leader,sentinel.myid))
            master->failover_start_time = mstime()+rand()%SENTINEL_MAX_DESYNC;
    }

    *leader_epoch = master->leader_epoch;
    return master->leader ? sdsnew(master->leader) : NULL;
}
```
  2. Votes from an older epoch become invalid once current_epoch increases. After current_epoch increases, a sentinel will not count votes that were cast for it in the (older) epoch in which it started its election, even though those sentinels did vote for it. Is the following situation possible, where E is not elected even though it received enough votes?

```
E: master failover epoch = 45, current epoch = 100
A voted for E 45
B voted for E 45
C voted for E 45
D voted for E 45
```

```c
    /* Count other sentinels votes */
    di = dictGetIterator(master->sentinels);
    while((de = dictNext(di)) != NULL) {
        sentinelRedisInstance *ri = dictGetVal(de);
        if (ri->leader != NULL && ri->leader_epoch == sentinel.current_epoch)
            sentinelLeaderIncr(counters,ri->leader);
    }
```