Open gordo-veda opened 2 months ago
This seems related: https://github.com/redis/redis-py/pull/2575
this is a disgusting hack to avoid forking redis-py, but it definitely improves stability while under load during scaling events when using elasticache.
basically we are just runtime-patching NodesManager
to try the first node if an IndexError
happens, as demonstrated in the related MR: https://github.com/redis/redis-py/pull/2575
def wrap_get_node_from_slot(cls):
"""Work around redis-py cluster incompatibility with elasticache."""
if hasattr(cls, "original_get_node_from_slot"):
return cls
original_get_node_from_slot = cls.get_node_from_slot
@wraps(original_get_node_from_slot)
def get_node_from_slot_with_fallback(self, slot, read_from_replicas=False, server_type=None):
"""Fall-back to primary when elasticache scales and breaks the redis-py cluster client."""
try:
return original_get_node_from_slot(self, slot, read_from_replicas, server_type)
except IndexError:
# https://github.com/redis/redis-py/pull/2575
return self.slots_cache[slot][0]
cls.original_get_node_from_slot = original_get_node_from_slot
cls.get_node_from_slot = get_node_from_slot_with_fallback
wrap_get_node_from_slot(NodesManager)
i understand there are some politics around AWS/Redis and forks and such, but regardless of anyone's opinions, i hope this is helpful for anyone else encountering this problem.
It seems that the slots cache throws an
IndexError
sometimes during an online reshard.redis-py 5.0.4
please let me know if i can provide any additional details, or help reproduce this.