StackExchange / StackExchange.Redis

General purpose redis client
https://stackexchange.github.io/StackExchange.Redis/

IDatabase.KeyExists returns false in 2.2.88 but times out and throws exception in 2.6.122 #2546

Closed Mike-Zarlenga-IGT closed 11 months ago

Mike-Zarlenga-IGT commented 1 year ago

I have a Redis cluster with some nodes offline.

Using StackExchange.Redis 2.2.88 and calling KeyExists for a key that does not exist, the call returns false quickly.

After upgrading to 2.6.122, calling KeyExists for a key that does not exist hangs for the timeout period, then throws a "message timed out in the backlog" exception.

Setting the BacklogPolicy to FastFail removes the timeout delay, but I still get an exception, slightly different than the prior one, indicating that the problem is one of the offline nodes in the Redis cluster.

Are there any other configuration options I can set to make 2.6.122 behave more like 2.2.88?

Or am I stuck with this behavior and dealing with parsing exceptions when connected to a cluster with some nodes offline?

mgravell commented 1 year ago

I am not entirely convinced that KeyExists would work (returning false) on any version against a down node. To the best of my knowledge, the only difference here can possibly be how quickly we get an exception. You have made me think that cluster-down scenarios probably shouldn't go through the retry mechanisms, because of how serious that situation is. Perhaps with a little grace if the thing that is down is a replica.

NickCraver commented 12 months ago

@Mike-Zarlenga-IGT I'm a bit confused: what do you expect the behavior to be? The default is to backlog and best-effort try if the node comes back up (rather than failing instantly, as the vast majority of applications would rather a delay than an error), but as you've found you can configure this already.

What exactly should behave differently here? Or is the complaint that the error message changed at all? (to be more informative in this case)

Mike-Zarlenga-IGT commented 12 months ago

Thank you both for the quick replies. Maybe my understanding of KeyExists or Redis clustering is the issue here.

  1. In the case of a down node as long as there's at least 1 Master up, why should I care about that down node?
  2. And is throwing an exception for a down node unique to KeyExists?

NickCraver commented 12 months ago

@Mike-Zarlenga-IGT We'll always prefer a primary if available, but if no primary with that hash slot is available, we don't have anywhere to ask, because the owner of that key isn't around. If you have, say, 3 nodes, the keys are sharded among them based on hash slot; they're not all replicated to all nodes. A replica of a node serves the purpose of being a backup in cluster.

For 2) nope, this behavior applies to every command.
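
To make the sharding point concrete, here is a minimal Python sketch of how Redis Cluster maps a key to one of its 16384 hash slots (CRC16/XMODEM of the key, modulo 16384, using only the `{hash tag}` portion when a non-empty one is present). This is an illustration of the protocol-level rule, not StackExchange.Redis code. The three-primary topology, the node names, and the `owner_of` helper are hypothetical, and the toy model ignores replicas entirely.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM variant): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Hash slot for a key, honoring a non-empty {hash tag} if present."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end > start + 1:  # "{}" (an empty tag) does not count
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


# Hypothetical topology: three primaries, each owning a contiguous slot range.
SLOT_RANGES = {
    "primary-a": range(0, 5461),
    "primary-b": range(5461, 10923),
    "primary-c": range(10923, 16384),
}


def owner_of(key: str, online: set) -> "str | None":
    """Return the primary owning the key's slot, or None if that primary is offline."""
    slot = key_slot(key)
    for node, slots in SLOT_RANGES.items():
        if slot in slots:
            return node if node in online else None
    return None


if __name__ == "__main__":
    online = {"primary-a", "primary-b"}  # primary-c is down in this scenario
    for key in ("hello", "foo"):
        print(key, key_slot(key), owner_of(key, online))
```

In this toy setup, `hello` hashes to slot 866 and `foo` to slot 12182, so with `primary-c` offline a lookup for `hello` still has an owner while a lookup for `foo` has nowhere to go. That holds for any command on that key, not just an existence check.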

Mike-Zarlenga-IGT commented 11 months ago

I did a little more digging and, looking at the muxStatus for the connection, I can see that the cluster is a total of 9 nodes and 3 primaries are listed as "DidNotRespond." That, plus the fact that the keys are sharded across primaries, explains it. Thank you for the assistance with this; I'll close the issue.