StackExchange / StackExchange.Redis

General purpose redis client
https://stackexchange.github.io/StackExchange.Redis/

Unexpected response exception #2577

Closed WeihanLi closed 9 months ago

WeihanLi commented 9 months ago

When reading from a Redis slave, we get unexpected response exceptions. The slave is read-only, and there are no errors when writing to the Redis master. How can we handle or avoid this kind of error? Thanks very much.

Unexpected response to XREAD: SimpleString: none

Unexpected response to TYPE: MultiBulk: 1 items

Unexpected response to XREVRANGE: BulkString: 570 bytes

mgravell commented 9 months ago

That seems odd; how are you targeting the replica here - via the optional enum? Or...? And is there anything unusual in your usage?

Does this happen reliably / always? Do you have code that shows such? And: are you talking to an unusual server in any way? (For example, "not actually redis, but redis compatible")

The short version here is "this should just work", so: I'm trying to see how we can reproduce this, as a precursor to identifying and fixing it.

WeihanLi commented 9 months ago

@mgravell Thanks very much for the quick reply.

how are you targeting the replica here - via the optional enum? Or...? And is there anything unusual in your usage?

Sorry, I don't quite follow: what does "targeting the replica" mean?

Our Redis runs in master-slave mode, and the slave is read-only. We use Redis directly.

The API calls we use look like the following:

await redis.StreamReadAsync(_typedCacheConfig.Subscription, latestMsgId, count: _typedCacheConfig.PrefetchCount)

await redis.StreamRangeAsync(_typedCacheConfig.Subscription, maxId: msgId, count: 1,
                messageOrder: Order.Descending)

await redis.KeyTypeAsync(name)

This exception started happening recently; we had not seen it before. I think it may be related to the Redis server, but other services using the same Redis server report no errors.

mgravell commented 9 months ago

When read from redis slave

replica and slave are synonyms in this context (with replica being the terminology redis uses more recently); my question is: "what does your setup/code look like, for targeting the replica?". Based on the comment above, it sounds like your config string only mentions the replica endpoint, is that correct? (the library can connect to both, and use different nodes for different requests).
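For illustration, here is a minimal sketch of what "targeting the replica via the optional enum" can look like when the configuration lists both nodes. The host names `redis-primary` and `redis-replica` and the key `some-stream` are hypothetical placeholders, not values from this issue; `CommandFlags.PreferReplica` is the library's flag for asking that a read go to a replica when one is available.

```csharp
using System;
using System.Threading.Tasks;
using StackExchange.Redis;

public static class ReplicaReadSketch
{
    public static async Task Main()
    {
        // Hypothetical host names: substitute your own primary and replica endpoints.
        var options = ConfigurationOptions.Parse("redis-primary:6379,redis-replica:6379");

        using var muxer = await ConnectionMultiplexer.ConnectAsync(options);
        IDatabase db = muxer.GetDatabase();

        // Reads opt in to the replica through the optional flags argument.
        // PreferReplica falls back to the primary if no replica is available.
        RedisType type = await db.KeyTypeAsync("some-stream", CommandFlags.PreferReplica);
        StreamEntry[] latest = await db.StreamRangeAsync(
            "some-stream",
            count: 1,
            messageOrder: Order.Descending,
            flags: CommandFlags.PreferReplica);

        // Writes use the default flags and are routed to the primary.
        await db.StreamAddAsync("some-stream", "field", "value");

        Console.WriteLine($"{type}: {latest.Length} entries read");
    }
}
```

With only the replica endpoint in the configuration, as described in this thread, every command goes to that one node regardless of flags.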

This sounds like a de-sync between client and server, hence me trying to find whether it happens routinely / all the time. And: what is the server here? Is this on-prem redis vanilla, or some hosted cloud offering? Or something redis-like (I think AWS offer a redis-like server that isn't redis, for example). I'm asking because the setup and server change a few things - for example, hosted offerings often use a proxy on the connection, so I'm wondering if the proxy might have de-sync'd. Since redis doesn't have a correlation identifier, this is hard to investigate.
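To make the de-sync idea concrete, here is a toy model (not the library's actual code) of a client that matches replies to commands purely by arrival order, which is all a protocol without a correlation identifier allows. The reply strings are borrowed from the report above as sample data only; the stray extra reply is a simulated fault, not a diagnosis of this issue.

```csharp
using System;
using System.Collections.Generic;

public static class DesyncSketch
{
    public static void Main()
    {
        // Commands awaiting replies, in the order they were sent.
        var pending = new Queue<string>(new[] { "TYPE", "XREAD", "XREVRANGE" });

        // Replies in arrival order. Simulated fault: one stray reply at the
        // front, so every later reply is paired with the wrong command.
        var replies = new Queue<string>(new[]
        {
            "MultiBulk: 1 items",    // stray reply that belongs to no pending command
            "SimpleString: none",    // really the reply to TYPE
            "BulkString: 570 bytes"  // some other reply, now one position late
        });

        // With no correlation id, the only possible pairing is first-in, first-out.
        while (pending.Count > 0 && replies.Count > 0)
        {
            Console.WriteLine($"{pending.Dequeue()} <- {replies.Dequeue()}");
        }
    }
}
```

Once the two queues are offset by even one entry, each command sees a reply of the wrong shape, and nothing in the replies themselves says which command they were meant for.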

WeihanLi commented 9 months ago

what is the server here

We're using a self-hosted, container-based Redis deployed in k8s.

it sounds like your config string only mentions the replica endpoint, is that correct?

Yeah, we only have the replica host address in the connection config.

From the Redis metrics, the replica appears to be up to date with the master:

image

mgravell commented 9 months ago

And again: does this happen repeatedly / reliably? Trying to think how we can investigate this.

WeihanLi commented 9 months ago

Yes, it has been happening repeatedly for about the past 10 days. It may need more research and testing.

mgravell commented 9 months ago

OK, allow me to rephrase: is there code that I can run that would show it happening reliably, so that I can investigate? Without a repro, I don't have much of a route in here. Oh, and: what server version and library version is this?

WeihanLi commented 9 months ago

The errors disappeared after I restarted the service, and I see an OOM exception in the earlier error logs.

image

So it may not be related to the Redis server (6.0.5) or the library (2.2.4), but to memory pressure. I cannot reproduce it now, but I will keep trying. Sorry for taking up your time.

DTTerastar commented 3 months ago

We are seeing something similar to this. It happens on one container every week or so. We have tried the heartbeat option, but it still takes more than 20 minutes for the problem to be corrected.

rlenoci-conga commented 3 months ago

I'm seeing a similar issue. Many occurrences of StackExchange.Redis.RedisConnectionException: Unexpected response to ZADD: MultiBulk: 26 items.

Could many of these occurrences lead to OutOfMemoryExceptions? Would low memory possibly cause these kinds of RedisConnectionExceptions?

Other occurrences of StackExchange.Redis.RedisConnectionException that happen around the same time include:

Unexpected response to HGETALL: Integer: 0

Unexpected response to HMSET: Integer: 1