Reconnects broken since 1.4

dim commented 10 years ago

In our staging setup, we have a single sentinel running. On the first redis timeout/disconnect, we start to receive continuous Redis::CannotConnectError errors. The client doesn't seem to reconnect an the the persist on every redis call until we restart the app.

I think there is a problem with try_next_sentinel, as it shifts from the list of @sentinel_options, never returning them back to the array. I didn't dive into the code, but shouldn't the method look something like this:

def try_next_sentinel
  sentinel_options = @sentinels_options.shift
  if sentinel_options
    @logger.debug "Trying next sentinel: #{sentinel_options[:host]}:#{sentinel_options[:port]}" if @logger && @logger.debug?
    @current_sentinel_options = sentinel_options
    @current_sentinel = Redis.new sentinel_options
  else
    raise Redis::CannotConnectError
  end
ensure
   @sentinels_options.push(sentinel_options) if sentinel_options
end

Full backtrace:

redis-sentinel-1.4.0/lib/redis-sentinel/client.rb:61→ try_next_sentinel
redis-sentinel-1.4.0/lib/redis-sentinel/client.rb:74→ discover_master
redis-sentinel-1.4.0/lib/redis-sentinel/client.rb:28→ block in connect_with_sentinel
redis-sentinel-1.4.0/lib/redis-sentinel/client.rb:46→ call
redis-sentinel-1.4.0/lib/redis-sentinel/client.rb:46→ auto_retry_with_timeout
redis-sentinel-1.4.0/lib/redis-sentinel/client.rb:27→ connect_with_sentinel
redis-3.0.6/lib/redis/client.rb:292→ ensure_connected

dim commented 10 years ago

Or you could keep a clone of your @sentinels_options as @original_sentinels_options and do:

def try_next_sentinel
  sentinel_options = @sentinels_options.shift
  if sentinel_options
    # ...
  else
    @sentinels_options = @original_sentinel_options.clone
    raise Redis::CannotConnectError
  end
end

flyerhzm commented 10 years ago

@dim I have fixed in HEAD code, please try and let me know if it works

dim commented 10 years ago

Hmm, it works but I am a little concerned about the logic in the approach:

def try_next_sentinel
  sentinel_options = @sentinels_options.shift
  @sentinels_options.push sentinel_options
  if sentinel_options
    ...
end

Because you are pushing the options straight back into the array, sentinel_options can never be nil, so you can safely reduce the method down to:

def try_next_sentinel
  sentinel_options = @sentinels_options.shift
  @sentinels_options.push sentinel_options
  @logger.debug "Trying next sentinel: #{sentinel_options[:host]}:#{sentinel_options[:port]}" if @logger && @logger.debug?
  @current_sentinel = Redis.new sentinel_options
end

flyerhzm commented 10 years ago

@dim thanks, I have update the code and released 1.4.1 version.

dim commented 10 years ago

Hmm, sorry, we made a mistake while testing, seems like you should yank 1.4.1. Will send a patch request shortly.

flyerhzm / redis-sentinel

Reconnects broken since 1.4 #33