StackExchange / StackExchange.Redis

General purpose redis client
https://stackexchange.github.io/StackExchange.Redis/

Timeout with larger Queue-Awaiting-Write #2615

Open rafael-santiago-take opened 7 months ago

rafael-santiago-take commented 7 months ago

Hello, we are getting Redis timeout exceptions that I believe are being caused by Queue-Awaiting-Write. Why are we getting this error?

Examples below:

StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=42KiB, inbound=124KiB, 5024ms elapsed, timeout is 5000ms), command=SET, next: SET keyexample, inst: 0, qu: 800, qs: 0, aw: True, bw: WritingMessage, rs: DequeueResult, ws: Flushed, in: 0, last-in: 112, cur-in: 382, sync-ops: 0, async-ops: 80850436, serverEndpoint: serverendpoint:25896, conn-sec: 47728.7, aoc: 0, mc: 1/1/0, mgr: 9 of 10 available, clientName: application(SE.Redis-v2.6.111.64013), IOCP: (Busy=0,Free=1000,Min=256,Max=1000), WORKER: (Busy=20,Free=32747,Min=1024,Max=32767), POOL: (Threads=143,QueuedItems=3,CompletedItems=1002836049), v: 2.6.111.64013

StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=42KiB, inbound=124KiB, 5024ms elapsed, timeout is 5000ms), command=SET, next: SET keyexample, inst: 0, qu: 800, qs: 0, aw: True, bw: WritingMessage, rs: DequeueResult, ws: Flushed, in: 0, last-in: 112, cur-in: 382, sync-ops: 0, async-ops: 80850436, serverEndpoint: serverendpoint:25896, conn-sec: 47728.7, aoc: 0, mc: 1/1/0, mgr: 9 of 10 available, clientName: application(SE.Redis-v2.6.111.64013), IOCP: (Busy=0,Free=1000,Min=256,Max=1000), WORKER: (Busy=18,Free=32749,Min=1024,Max=32767), POOL: (Threads=143,QueuedItems=2,CompletedItems=1002836042), v: 2.6.111.64013
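For anyone reading along, the counters in these messages are easier to compare once they are split out. Below is a minimal Python sketch that pulls a few of them from the first exception above. The function name `parse_timeout` and the choice of fields are illustrative assumptions, not part of StackExchange.Redis; the only thing taken from the library is the message text itself, where `qu` is the Queue-Awaiting-Write counter named in the issue title.

```python
import re

# First exception message from this thread, copied verbatim (minus the type prefix).
MESSAGE = (
    "Timeout awaiting response (outbound=42KiB, inbound=124KiB, 5024ms elapsed, "
    "timeout is 5000ms), command=SET, next: SET keyexample, inst: 0, qu: 800, "
    "qs: 0, aw: True, bw: WritingMessage, rs: DequeueResult, ws: Flushed, in: 0, "
    "last-in: 112, cur-in: 382, sync-ops: 0, async-ops: 80850436, "
    "serverEndpoint: serverendpoint:25896, conn-sec: 47728.7, aoc: 0, mc: 1/1/0, "
    "mgr: 9 of 10 available, clientName: application(SE.Redis-v2.6.111.64013), "
    "IOCP: (Busy=0,Free=1000,Min=256,Max=1000), "
    "WORKER: (Busy=20,Free=32747,Min=1024,Max=32767), "
    "POOL: (Threads=143,QueuedItems=3,CompletedItems=1002836049), v: 2.6.111.64013"
)


def parse_timeout(message: str) -> dict:
    """Extract a handful of diagnostic counters (hypothetical helper)."""
    fields = {}

    # Elapsed time and configured timeout, both in milliseconds.
    timing = re.search(r"(\d+)ms elapsed, timeout is (\d+)ms", message)
    if timing:
        fields["elapsed_ms"] = int(timing.group(1))
        fields["timeout_ms"] = int(timing.group(2))

    # Simple "name: value" counters. The lookbehind stops "in" from
    # matching inside "last-in" or "cur-in".
    for name in ("qu", "qs", "aw", "in"):
        match = re.search(rf"(?<![\w-]){name}: (\w+)", message)
        if match:
            value = match.group(1)
            fields[name] = int(value) if value.isdigit() else value

    # Grouped counters such as "WORKER: (Busy=20,Free=32747,...)".
    for group, body in re.findall(r"(IOCP|WORKER|POOL): \(([^)]*)\)", message):
        fields[group] = {
            key: int(val) for key, val in (pair.split("=") for pair in body.split(","))
        }

    return fields


if __name__ == "__main__":
    for key, value in parse_timeout(MESSAGE).items():
        print(key, value)
```

Running the same function over both messages makes it easier to see that only the `WORKER` busy count and the `POOL` queue figures differ between them.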

NickCraver commented 7 months ago

What kind of payload are we talking about for this workload? Given the thread count and sizes I'm seeing, this may be bandwidth exhaustion overall.

rafael-santiago-take commented 6 months ago

Hi @NickCraver, our payloads vary in size and can sometimes be large. How are you seeing a high "thread count and sizes"? Is 123 threads a high number? Could a pool of ConnectionMultiplexer objects improve this?

NickCraver commented 6 months ago

I'd recommend looking at a bandwidth graph here to see what you're spiking to on the physical connection. What counts as a high thread count is relative, depending on what the app is doing. Given you have 0 and 18 busy in the IOCP and worker pools yet 143 threads total, that looks odd to me and suggests a lot of threads in use elsewhere, possibly tying up CPU.
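The thread arithmetic in this comment can be reproduced from the message text. The sketch below is a rough Python illustration using the pool section of the second exception; the helper `counter` is hypothetical, and subtracting busy threads from the total is only a simplification of the reasoning here. The message alone does not say what the remaining threads are doing.

```python
import re

# Thread pool section copied from the second exception in this thread.
STATS = (
    "IOCP: (Busy=0,Free=1000,Min=256,Max=1000), "
    "WORKER: (Busy=18,Free=32749,Min=1024,Max=32767), "
    "POOL: (Threads=143,QueuedItems=2,CompletedItems=1002836042)"
)


def counter(stats: str, group: str, key: str) -> int:
    """Return one integer counter, e.g. counter(STATS, "WORKER", "Busy")."""
    match = re.search(rf"{group}: \([^)]*\b{key}=(\d+)", stats)
    if match is None:
        raise ValueError(f"{group}.{key} not found")
    return int(match.group(1))


iocp_busy = counter(STATS, "IOCP", "Busy")
worker_busy = counter(STATS, "WORKER", "Busy")
pool_threads = counter(STATS, "POOL", "Threads")

# Threads in the pool that neither counter reports as busy.
not_reported_busy = pool_threads - (iocp_busy + worker_busy)

if __name__ == "__main__":
    print(f"busy: {iocp_busy + worker_busy}, total: {pool_threads}, "
          f"not reported busy: {not_reported_busy}")
```

A large gap between the two figures is the kind of mismatch worth checking against a CPU or thread dump before deciding whether more multiplexers would help.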

Whether a pool will help depends entirely on what you're hitting. If we're exhausting the physical connection, then a pool wouldn't do much. Sometimes with microbursting we'll see this, but usually not in the 5 second range unless it's quite prolonged or payloads are pretty big.