how to handle restarts of redis server?

dcu commented 7 years ago

I'm using the pool package, and when the redis server is restarted the pool clients stop working. Is there a good way to handle disconnections?

If the pool doesn't handle this case, is there a way to remove the client and add a new one to the pool?

thanks

mediocregopher commented 7 years ago

Hey there! The radix pool is designed with an eye towards simplicity over features, so it doesn't do any active handling of situations like the server being restarted, but...

> If the pool doesn't handle this case, is there a way to remove the client and add a new one to the pool?

...each connection keeps track of whether or not it's returned a network error when a command was attempted, and the pool will look at that when you call Put. If you Put a connection that's had a network error the pool will automatically clean it up and discard it, and later make a new one when necessary. So in general there's nothing specific you need to do to handle connections dying or instances restarting either, if you're ok with a couple errors happening. The pool will discard and create new connections as it needs to.

If there are long periods of time between usages of the pool, you can do something like this just after the pool is created:

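go func() {
    for {
        // Cmd takes a connection from the pool, runs PING on it, and Puts it back.
        // A connection that hit a network error is discarded on Put.
        p.Cmd("PING")
        time.Sleep(1 * time.Second)
    }
}()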

This tests one connection in the pool per second, removing it if it's dead. If your pool has N connections, then N seconds after a server restart the connections should all be fresh. You can also shorten the period to less than a second for a faster refresh.

Another option, if you know when the redis server is going to restart, is to set up a signal handler in your app which will call Empty, so after the server has been restarted you can manually flush all of its connections.
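A rough sketch of the signal-handler idea, assuming the pool only needs to expose Empty. The `emptier` interface, `countingPool`, and `demo` are hypothetical names made up for this illustration, and SIGHUP is an arbitrary choice; `demo` fakes the signal delivery so the flow can be followed without a real redis restart.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// emptier is the only method this sketch needs from the pool.
type emptier interface {
	Empty()
}

// emptyOnSignal calls p.Empty once per value received on sigs
// and returns when sigs is closed.
func emptyOnSignal(p emptier, sigs <-chan os.Signal) {
	for range sigs {
		p.Empty()
	}
}

// countingPool is a hypothetical stand-in that just counts Empty calls.
type countingPool struct{ emptied int }

func (c *countingPool) Empty() { c.emptied++ }

// demo wires SIGHUP to Empty, fakes one delivery, and reports the call count.
func demo() int {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGHUP) // SIGHUP is an arbitrary choice here

	p := &countingPool{} // a real app would pass its pool instead
	done := make(chan struct{})
	go func() {
		defer close(done)
		emptyOnSignal(p, sigs)
	}()

	sigs <- syscall.SIGHUP // simulate the operator sending the signal
	signal.Stop(sigs)      // stop delivery so the channel can be closed safely
	close(sigs)
	<-done
	return p.emptied
}

func main() {
	fmt.Println("Empty calls:", demo())
}
```

In a long-running app you would keep the goroutine alive for the life of the process instead of stopping it, and send the signal yourself once redis is back up.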

And another option, depending on what commands you're running, would be to set up some retry logic on the redis commands in your app. It'd be pretty straightforward to write a wrapper around the Cmd method which would abstract this away.
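Here is a minimal sketch of what such retry logic could look like. It is generic, not part of radix: `retry` and `errNetwork` are hypothetical names, and the closure stands in for a call on the pool.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNetwork is a hypothetical stand-in for a connection-level failure.
var errNetwork = errors.New("network error")

// retry calls fn up to attempts times (attempts should be at least 1),
// sleeping delay between failed tries. It returns nil on the first
// success, or the last error if every try failed.
func retry(attempts int, delay time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return err
}

func main() {
	calls := 0
	// The closure stands in for something like: return p.Cmd("PING").Err
	err := retry(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNetwork // the first two tries hit dead connections
		}
		return nil
	})
	fmt.Println(calls, err)
}
```

Since each failed command causes one dead connection to be discarded, every retry also works through the stale connections. Blind retries are only safe for commands that can be repeated without side effects; something like INCR could be applied twice if the error happened after the server processed it.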

So there are a few options; it really depends on your use case and what trade-offs you want to make. In my code I usually just assume an error can happen at every command, and everything ends up working out.

Also I'm working on radix.v3 (literally at this moment) which will have better support for things like the Ping loop and retrying commands, but ultimately the options available will be more or less the same.

Hopefully this helps more than it confuses :P let me know if you need clarification on anything.

dcu commented 7 years ago

thanks for your prompt response @mediocregopher, really appreciated!

If you have a pool of 100 elements (for example), it takes a long time to recover with the pinging strategy. The retry strategy seems to have the same issue. How bad is the impact of Empty() on performance? Is it goroutine-safe? I'm thinking about doing something like this:

go func() {
    for {
        r := redisPool.Cmd("PING")
        if r.Err != nil {
            redisPool.Empty()
        }

        time.Sleep(1 * time.Second)
    }
}()

mediocregopher commented 7 years ago

So under the hood the pool is really just a buffered channel of clients. If the channel is empty when you go to Get a client, a new one is made on the spot, and if the channel is full when you Put the client (or if the client has had a network error) then the client is closed and discarded.

Empty is thread-safe, and all it does is pull all the clients out of the buffered channel and close them. That means the next time you call Get on the pool, a new client will be made on the spot.
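To make that description concrete, here is a toy model of a channel-backed pool. It is not the actual radix source: `conn`, `toyPool`, and `newToyPool` are hypothetical stand-ins, and the `broken` flag plays the role of the network error the real client keeps track of.

```go
package main

import "fmt"

// conn is a hypothetical stand-in for a redis client connection.
type conn struct {
	id     int
	broken bool // set when a command hit a network error
	closed bool
}

func (c *conn) Close() { c.closed = true }

// toyPool models the pool as a buffered channel of idle connections.
type toyPool struct {
	idle   chan *conn
	nextID int // not goroutine-safe; fine for this single-goroutine demo
}

func newToyPool(size int) *toyPool {
	p := &toyPool{idle: make(chan *conn, size)}
	for i := 0; i < size; i++ {
		p.idle <- p.dial()
	}
	return p
}

// dial pretends to open a new connection.
func (p *toyPool) dial() *conn {
	p.nextID++
	return &conn{id: p.nextID}
}

// Get returns an idle connection, or makes one on the spot if none is idle.
func (p *toyPool) Get() *conn {
	select {
	case c := <-p.idle:
		return c
	default:
		return p.dial()
	}
}

// Put returns c to the pool, closing it instead if it is broken
// or if the channel is already full.
func (p *toyPool) Put(c *conn) {
	if c.broken {
		c.Close()
		return
	}
	select {
	case p.idle <- c:
	default:
		c.Close()
	}
}

// Empty closes every idle connection; the next Get dials a fresh one.
func (p *toyPool) Empty() {
	for {
		select {
		case c := <-p.idle:
			c.Close()
		default:
			return
		}
	}
}

func main() {
	p := newToyPool(2)
	c := p.Get()
	c.broken = true // simulate a network error after a server restart
	p.Put(c)        // discarded rather than returned to the channel
	fmt.Println("idle after bad Put:", len(p.idle))
	p.Empty()
	fmt.Println("idle after Empty:", len(p.idle))
	fmt.Println("fresh conn id:", p.Get().id)
}
```

Note that Empty only touches connections sitting idle in the channel. A connection that is checked out at that moment is handled by the usual Put rules when it comes back.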

So performance-wise it really depends on your use case. If you're dealing with hundreds of requests a second then it's probably not the best strategy, but if it's only a few requests a second I think it would probably not be noticeable.

You also run the risk of emptying the pool too often, again depending on your use case. If you're in a situation where you may see unexpected packet loss (e.g. if the redis instance is on a different server from the application) then the Ping may fail even if the redis instance hasn't been restarted. If this happens often it could cause as many problems as it solves.
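One possible way to soften that risk, sketched here as an assumption rather than anything radix provides, is to only empty the pool after several PINGs in a row have failed. `failureGate` and `errDown` are hypothetical names for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errDown is a hypothetical stand-in for a failed PING.
var errDown = errors.New("ping failed")

// failureGate reports true only after threshold consecutive failures,
// so a single dropped PING does not flush a healthy pool.
// It is meant to be used from one goroutine (the ping loop).
type failureGate struct {
	threshold int
	streak    int
}

// observe records one health-check result and reports whether the
// caller should empty the pool now.
func (g *failureGate) observe(err error) bool {
	if err == nil {
		g.streak = 0
		return false
	}
	g.streak++
	if g.streak >= g.threshold {
		g.streak = 0 // start counting again after a flush
		return true
	}
	return false
}

func main() {
	g := &failureGate{threshold: 3}
	// Simulated PING results: one blip, a recovery, then three straight failures.
	results := []error{errDown, nil, errDown, errDown, errDown}
	for i, err := range results {
		fmt.Println(i, g.observe(err))
	}
}
```

In the PING loop from the earlier comment, the check would become `if g.observe(r.Err) { redisPool.Empty() }`. The trade-off is slower recovery after a real restart, by roughly the threshold times the ping period.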

mediocregopher commented 7 years ago

Hey there, I hope my answers helped you out, I'm gonna go ahead and close this. If you have any more questions please feel free to re-open it and I can help you out more.

mediocregopher / radix.v2

how to handle restarts of redis server? #48