redis / redis

Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLogs, Bitmaps.
http://redis.io

[BUG] Instance with RDB enabled will block and timeout any client requests when shutting down #9566

Open eduardobr opened 2 years ago

eduardobr commented 2 years ago

When an instance with RDB persistence enabled is shutting down, it blocks and every client request times out. A dataset of 1 to 2 GB is enough to keep the instance serving timeouts for about 14 seconds in a setup I have with pretty decent storage. That's hard to handle on the client side, because the client doesn't know it needs to reconnect to a different replica. A timeout could be a genuine one during the normal lifetime of the instance, or it could mean the instance is shutting down, in which case we know we want to move to another replica.

Health probes on the instance will detect that it's timing out, but not immediately, which means a period of unavailability.

Expected behavior

At least keep serving reads on read-only replicas, or kill all clients (and refuse reconnections) so that they try to reconnect to a different replica (not sure about a connected master or connected replicas).
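
To make the second option concrete, here is a minimal sketch of the ordering I have in mind: stop accepting connections, drop the existing clients, and only then run the blocking save. This is a toy model, not Redis code. `toy_server`, `toy_blocking_save()` and `toy_shutdown()` are hypothetical names made up for illustration, and the save is just a stand-in for the `rdbSave()` call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CLIENTS 8

/* Hypothetical toy model of a server. None of these names exist in Redis. */
typedef struct {
    bool accepting;               /* listening socket still accepts connections */
    bool connected[MAX_CLIENTS];  /* one slot per client connection */
    bool snapshot_written;        /* set once the final save has finished */
} toy_server;

/* Count the clients that are still connected. */
size_t toy_connected_clients(const toy_server *s) {
    assert(s != NULL);
    size_t n = 0;
    for (size_t i = 0; i < MAX_CLIENTS; i++)
        if (s->connected[i]) n++;
    return n;
}

/* Stand-in for the blocking final save. Returns how many clients were
 * still connected, and therefore left waiting, while the save ran. */
size_t toy_blocking_save(toy_server *s) {
    size_t stalled = toy_connected_clients(s);
    s->snapshot_written = true;
    return stalled;
}

/* Proposed order: refuse new connections, drop existing clients, then save.
 * Returns the number of clients that were stalled by the save. */
size_t toy_shutdown(toy_server *s) {
    s->accepting = false;                 /* no reconnection to this instance */
    for (size_t i = 0; i < MAX_CLIENTS; i++)
        s->connected[i] = false;          /* clients see a closed connection */
    return toy_blocking_save(s);
}
```

With this ordering a client sees a closed connection instead of a timeout, so it can move to another replica right away. In the toy model, `toy_shutdown()` returns 0 because nobody is left connected when the save starts.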

Additional information

In server.c, prepareForShutdown() shows this behavior:

    /* Create a new RDB file before exiting. */
    if ((server.saveparamslen > 0 && !nosave) || save) {
        serverLog(LL_NOTICE,"Saving the final RDB snapshot before exiting.");
        if (server.supervised_mode == SUPERVISED_SYSTEMD)
            redisCommunicateSystemd("STATUS=Saving the final RDB snapshot\n");
        /* Snapshotting. Perform a SYNC SAVE and exit */
        rdbSaveInfo rsi, *rsiptr;
        rsiptr = rdbPopulateSaveInfo(&rsi);
        if (rdbSave(server.rdb_filename,rsiptr) != C_OK) {
            /* Ooops.. error saving! The best we can do is to continue
             * operating. Note that if there was a background saving process,
             * in the next cron() Redis will be notified that the background
             * saving aborted, handling special stuff like slaves pending for
             * synchronization... */
            serverLog(LL_WARNING,"Error trying to save the DB, can't exit.");
            if (server.supervised_mode == SUPERVISED_SYSTEMD)
                redisCommunicateSystemd("STATUS=Error trying to save the DB, can't exit.\n");
            return C_ERR;
        }
    }

zuiderkwast commented 2 years ago

Related to #9693