redis-rb / redis-cluster-client

Redis cluster-aware client for Ruby
https://rubygems.org/gems/redis-cluster-client
MIT License

Connection Errors when Cluster Topology Changes #404

Closed kasra-sheik closed 2 weeks ago

kasra-sheik commented 2 weeks ago

I have an AWS MemoryDB cluster running Redis in cluster mode. The topology is roughly 2 shards, each with 2 nodes.

AWS claims it supports online resharding, so I recently brought down one of the shards. Every client connection then raised a CannotConnectError whenever a command was executed on it, even long after the shard went offline. I would have expected the client to handle the connection termination gracefully: after retrying the connection, it would be routed to an online node to run CLUSTER NODES on.
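As a rough illustration of the retry behavior described above, here is a minimal application-side sketch. The helper `with_connection_retry` and the `FakeConnectError` class are hypothetical names made up for this example; they are not part of redis-cluster-client. With the redis gem, the error to rescue would presumably be `Redis::CannotConnectError`. This only shows retrying with backoff around a command, not how the client refreshes its view of the cluster.

```ruby
# Stand-in for the client's connection error (hypothetical).
# With the redis gem you would likely pass Redis::CannotConnectError instead.
class FakeConnectError < StandardError; end

# Hypothetical helper: run the block, retrying with exponential backoff
# when one of the given error classes is raised. Re-raises once the
# attempt budget is exhausted.
def with_connection_retry(errors:, attempts: 3, base_delay: 0.05)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *errors
    raise if tries >= attempts
    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end

# Demo: the first two attempts fail, the third succeeds.
calls = 0
result = with_connection_retry(errors: [FakeConnectError], base_delay: 0.01) do |attempt|
  calls += 1
  raise FakeConnectError, "node unreachable" if attempt < 3
  "OK"
end
puts "#{result} after #{calls} attempts"
```

In real code the block would wrap a command such as `redis.del(key)`. A wrapper like this only helps if the underlying client re-resolves the cluster nodes between attempts.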

Here is my config + stack trace. Any guidance or thoughts here?

    Redis::Cluster.new(
      nodes: <cluster_url>,
      reconnect_attempts: 3,
      timeout: 1,
      slow_command_timeout: 3,
      connect_with_original_config: true
    )

/usr/local/lib/ruby/3.2.0/socket.rb:231:in `getaddrinfo'
/usr/local/lib/ruby/3.2.0/socket.rb:231:in `foreach'
/usr/local/lib/ruby/3.2.0/socket.rb:635:in `tcp'
ruby/3.2.0/gems/redis-client-0.22.2/lib/redis_client/ruby_connection.rb:119:in `connect'
ruby/3.2.0/gems/redis-client-0.22.2/lib/redis_client/connection_mixin.rb:11:in `reconnect'
ruby/3.2.0/gems/redis-client-0.22.2/lib/redis_client.rb:742:in `block in connect'
ruby/3.2.0/gems/redis-client-0.22.2/lib/redis_client/middlewares.rb:12:in `connect'
ruby/3.2.0/gems/redis-cluster-client-0.7.11/lib/redis_client/cluster/error_identification.rb:20:in `connect'
ruby/3.2.0/gems/redis-client-0.22.2/lib/redis_client.rb:741:in `connect'
ruby/3.2.0/gems/redis-client-0.22.2/lib/redis_client.rb:732:in `raw_connection'
ruby/3.2.0/gems/redis-client-0.22.2/lib/redis_client.rb:697:in `ensure_connected'
ruby/3.2.0/gems/redis-client-0.22.2/lib/redis_client.rb:292:in `call_v'
ruby/3.2.0/gems/redis-cluster-client-0.7.11/lib/redis_client/cluster/router.rb:82:in `public_send'
ruby/3.2.0/gems/redis-cluster-client-0.7.11/lib/redis_client/cluster/router.rb:82:in `block in try_send'
ruby/3.2.0/gems/redis-cluster-client-0.7.11/lib/redis_client/cluster/router.rb:96:in `handle_redirection'
ruby/3.2.0/gems/redis-cluster-client-0.7.11/lib/redis_client/cluster/router.rb:79:in `try_send'
ruby/3.2.0/gems/redis-cluster-client-0.7.11/lib/redis_client/cluster/router.rb:60:in `send_command'
ruby/3.2.0/gems/redis-cluster-client-0.7.11/lib/redis_client/cluster.rb:35:in `call_v'
ruby/3.2.0/gems/redis-clustering-5.1.0/lib/redis/cluster/client.rb:85:in `block in call_v'
ruby/3.2.0/gems/redis-clustering-5.1.0/lib/redis/cluster/client.rb:104:in `handle_errors'
ruby/3.2.0/gems/redis-clustering-5.1.0/lib/redis/cluster/client.rb:85:in `call_v'
ruby/3.2.0/gems/redis-5.1.0/lib/redis.rb:152:in `block in send_command'
ruby/3.2.0/gems/redis-5.1.0/lib/redis.rb:151:in `synchronize'
ruby/3.2.0/gems/redis-5.1.0/lib/redis.rb:151:in `send_command'
ruby/3.2.0/gems/redis-5.1.0/lib/redis/commands/keys.rb:252:in `del'

supercaracal commented 2 weeks ago

Since the stack trace shows you are using version 0.7.11, I recommend upgrading to the latest versions of the redis-cluster-client and redis-clustering gems.

https://github.com/redis-rb/redis-cluster-client/releases

The old version does not fully handle such errors after the cluster state changes.
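For anyone who wants to check which gem versions a stack trace like the one above points to, here is a small sketch. The helper name `gem_versions` is made up for this example, and it assumes backtrace paths follow the usual `.../gems/<name>-<version>/lib/...` layout.

```ruby
# Hypothetical helper: extract "gem name => version" pairs from backtrace
# lines of the form ".../gems/<name>-<version>/lib/...".
# Lines without a /gems/ segment (e.g. stdlib files) are ignored.
def gem_versions(backtrace_lines)
  backtrace_lines.each_with_object({}) do |line, found|
    match = line.match(%r{/gems/([A-Za-z0-9_-]+?)-(\d+(?:\.\d+)+)/})
    found[match[1]] = match[2] if match
  end
end

# A few lines taken from the stack trace in this issue.
trace = [
  "ruby/3.2.0/gems/redis-client-0.22.2/lib/redis_client.rb:292:in `call_v'",
  "ruby/3.2.0/gems/redis-cluster-client-0.7.11/lib/redis_client/cluster/router.rb:82:in `block in try_send'",
  "ruby/3.2.0/gems/redis-clustering-5.1.0/lib/redis/cluster/client.rb:85:in `call_v'",
  "ruby/3.2.0/gems/redis-5.1.0/lib/redis.rb:152:in `block in send_command'",
  "/usr/local/lib/ruby/3.2.0/socket.rb:231:in `getaddrinfo'"
]

versions = gem_versions(trace)
versions.each { |name, version| puts "#{name} #{version}" }
```

The resulting versions can then be compared against the releases page linked above.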

supercaracal commented 2 weeks ago

Feel free to reopen this issue if the latest version still doesn't resolve it.

kasra-sheik commented 2 weeks ago

Thanks @supercaracal! Upgrading seemed to fix my connection errors.