Closed shivam-tripathi closed 4 years ago
@shivam-tripathi I have never seen a case where it would sit and hang inside the client code forever. Can you provide a more detailed example script and cluster setup that replicate the issue, so I can try to debug and trace the error further?
If, for example, I try to run a client against a cluster that is completely shut down, I get the following error and stack trace on the client side.
DEBUG:rediscluster.nodemanager:[
{
"host": "127.0.0.1",
"port": 7000
},
{
"host": "127.0.0.1",
"port": 7001
},
{
"host": "127.0.0.1",
"port": 7002
},
{
"host": "127.0.0.1",
"port": 7003
},
{
"host": "127.0.0.1",
"port": 7004
},
{
"host": "127.0.0.1",
"port": 7005
}
]
Traceback (most recent call last):
File "rediscluster_failover_test.py", line 25, in <module>
rc = RedisCluster(startup_nodes=startup_nodes, decode_responses=True)
File "/home/grok/github/redis-py-cluster/rediscluster/client.py", line 371, in __init__
**kwargs
File "/home/grok/github/redis-py-cluster/rediscluster/connection.py", line 160, in __init__
self.nodes.initialize()
File "/home/grok/github/redis-py-cluster/rediscluster/nodemanager.py", line 270, in initialize
raise RedisClusterException("Redis Cluster cannot be connected. Please provide at least one reachable node.")
rediscluster.exceptions.RedisClusterException: Redis Cluster cannot be connected. Please provide at least one reachable node.
Exception ignored in: <object repr() failed>
Traceback (most recent call last):
File "/home/grok/.virtualenvs/redis/lib/python3.6/site-packages/redis/client.py", line 885, in __del__
self.close()
File "/home/grok/.virtualenvs/redis/lib/python3.6/site-packages/redis/client.py", line 888, in close
conn = self.connection
AttributeError: 'RedisCluster' object has no attribute 'connection'
None of the nodes are reachable, and this error is what I expect in a situation where no node is reachable.
@shivam-tripathi Please post more detailed steps to reproduce the problem so that I can help further.
@Grokzen Hi, the issue I found occurs when the first node in the list of startup_nodes is down. More details are as follows:
The config.REDIS_CONF looks something like this:
{
"cluster": {
"hosts":[
{"host":"a.a.a.a","port":6379},
{"host":"b.b.b.b","port":6379},
{"host":"c.c.c.c","port":6379},
{"host":"d.d.d.d","port":6379},
{"host":"e.e.e.e","port":6379},
{"host":"f.f.f.f","port":6379}
]
}
}
Of these, node a.a.a.a:6379 is down. Nodes b.b.b.b:6379, c.c.c.c:6379, and d.d.d.d:6379 are available. After these, node e.e.e.e:6379 is again down.
The wrapper code utilizing rediscluster to connect with the cluster is as follows:
from utils.Config import Config
from rediscluster import RedisCluster
from redis import Redis

class QRedis:
    __config = Config()
    __redis = None

    def __init__(self):
        self.__redis = RedisCluster(startup_nodes=self.__config.REDIS_CONF['cluster']['hosts'], decode_responses=True)

    def hscan_iter(self, hset_name, count=100):
        return self.__redis.hscan_iter(hset_name, count=count)

    def run(self, cmd, *args):
        # Look the command up by name; indexing the client would issue a GET instead.
        return getattr(self.__redis, cmd)(*args)

    def client(self):
        return self.__redis
This for some reason hangs with no additional logs.
If I do:
```python
self.__redis = RedisCluster(startup_nodes=self.__config.REDIS_CONF['cluster']['hosts'][1:], decode_responses=True)
```
it connects and works as expected. Note that node b.b.b.b:6379 is available. I am slicing the list so that the first node is removed from startup_nodes.
Similarly, self.__config.REDIS_CONF['cluster']['hosts'][2:] and self.__config.REDIS_CONF['cluster']['hosts'][3:] work. However, self.__config.REDIS_CONF['cluster']['hosts'][4:] again hangs, which corresponds to the next dead node, i.e. e.e.e.e:6379.
Python 3.6.6. The command pip freeze gives redis-py-cluster==2.0.0.
This is everything I could collect that I thought could be of help. Let me know if this is enough or if anything additional is required.
Thank you so much for your amazing work. 😄
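The manual slicing above can be automated with a pre-flight check. The following is a minimal sketch using only the standard library, not part of redis-py-cluster: `reachable_nodes` is a hypothetical helper that keeps only the nodes that accept a TCP connection within a short timeout.

```python
import socket


def reachable_nodes(startup_nodes, timeout=1.0):
    """Return the subset of startup_nodes that accept a TCP connection.

    Each node is a dict with "host" and "port" keys, matching the
    startup_nodes format used in this thread. Hypothetical helper.
    """
    alive = []
    for node in startup_nodes:
        address = (node["host"], int(node["port"]))
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure.
            with socket.create_connection(address, timeout=timeout):
                alive.append(node)
        except OSError:
            continue
    return alive
```

The filtered list could then be passed as startup_nodes. An open port does not prove that the node is a healthy cluster member, and a node can still go down right after the check, so this narrows the problem rather than removing it.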
@shivam-tripathi It kinda makes no sense that it should only fail if the first one does not connect O.o I will have to dig deeper into this, but I would recommend that you try it with the RC2 release that is on PyPI now and see whether it behaves the same. I will also run a test against a local cluster, because I have not seen that problem myself so far.
@shivam-tripathi I tried to reproduce this issue locally and was unable to. To me it looks like the code works as expected, and I don't know what your issue really is or why it happens.
I ran the following client code against my local Redis cluster with the port 7000 node shut down:
from rediscluster import RedisCluster
startup_nodes = [
{"host": "127.0.0.1", "port": "7000"},
{"host": "127.0.0.1", "port": "7001"},
{"host": "127.0.0.1", "port": "7002"},
{"host": "127.0.0.1", "port": "7003"},
{"host": "127.0.0.1", "port": "7004"},
{"host": "127.0.0.1", "port": "7005"},
]
rc = RedisCluster(
startup_nodes=startup_nodes,
decode_responses=True,
)
from pprint import pprint
pprint(rc.cluster_info())
print(rc.set('foobar', 'asd'))
print(rc.get('foobar'))
The code above works as expected without any issues, and it does not block.
The only thing I can see in your code example above is hscan_iter: could that be where it hangs, rather than during the cluster initialization steps?
I would also recommend that you pull down the master branch of this repo, or install the 2.0.99RC2 release from PyPI, and use that code in your example to see if it helps.
If that does not help and your problem persists, you will have to provide much more detailed debugging of which part of the library code fails, not just the top layer of the client and your own code. Please open a new issue with that if the problem remains in the new client version.
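One way to collect that kind of detail is sketched below, using only the standard library. It turns on DEBUG logging (the log output earlier in this thread comes from the rediscluster.nodemanager logger) and schedules a stack dump of every thread if the process is still running after a delay. The function names and the 30-second default are assumptions for illustration.

```python
import faulthandler
import logging
import sys


def enable_hang_diagnostics(dump_after=30.0, stream=sys.stderr):
    """Turn on debug logging and schedule a stack dump (hypothetical helper)."""
    # Make sure a log handler exists, then raise verbosity for the library.
    logging.basicConfig(level=logging.DEBUG)
    logging.getLogger("rediscluster").setLevel(logging.DEBUG)
    # If the process is still alive after dump_after seconds, write the
    # traceback of every thread to `stream` without exiting.
    faulthandler.dump_traceback_later(dump_after, exit=False, file=stream)


def disable_hang_diagnostics():
    """Cancel the pending stack dump once the suspect call has returned."""
    faulthandler.cancel_dump_traceback_later()
```

Calling enable_hang_diagnostics() just before constructing the client, and disable_hang_diagnostics() right after, should show which frame the call is stuck in if it really hangs.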
The list of startup_nodes is fetched from a configuration server and then fed to RedisCluster. During runtime, any of the Redis nodes can become unreachable for a number of reasons, and this is not reflected in the config.
If I try to connect to a cluster with startup_nodes, say
[{'host': <a.host>, 'port': <a.port>}, ....]
and for some reason a.host is down, the command conn = RedisCluster(startup_nodes=startup_nodes, decode_responses=True) hangs. It doesn't move on to the next active node in the list. I think this might not be ideal.
Version: redis-py-cluster==2.0.0
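Until the cause of the hang is known, a caller can at least bound how long it waits. The sketch below is a generic standard-library workaround, not a redis-py-cluster feature: `call_with_timeout` and `CallTimedOut` are hypothetical names, and the wrapped call runs in a daemon thread so a stuck call cannot block interpreter exit.

```python
import threading


class CallTimedOut(Exception):
    """Raised when the wrapped call does not finish in time (hypothetical)."""


def call_with_timeout(func, timeout, *args, **kwargs):
    """Run func(*args, **kwargs) in a daemon thread and wait up to timeout seconds."""
    outcome = {}

    def worker():
        try:
            outcome["value"] = func(*args, **kwargs)
        except Exception as exc:
            # Carry the exception back to the calling thread.
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        # The call cannot be interrupted; it keeps running in the background.
        raise CallTimedOut("call did not finish within %s seconds" % timeout)
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

In the scenario above this might be used as call_with_timeout(RedisCluster, 5, startup_nodes=startup_nodes, decode_responses=True), which has not been tested against redis-py-cluster here. It only stops the caller from waiting; the stuck thread and any socket it holds remain until the process exits.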