StackExchange / StackExchange.Redis

General purpose redis client
https://stackexchange.github.io/StackExchange.Redis/
Other
5.86k stars 1.5k forks source link

Would appreciate assistance solving timeout issues #2560

Open williammck opened 10 months ago

williammck commented 10 months ago

Hey folks,

I've been battling recent issues with one of our workers regularly getting Redis timeouts and would appreciate some help.

A bit of overview:

I've tried upping the threads in the pool up a bit (set ThreadPoolMinThreads to 16), but that didn't seem to help much. Should I increase it further?

I've looked into SuperSimpleTcp and it seems to be doing everything in tasks/async, so I'd hope we're not blocking, but I can't quite tell.

Any ideas or things I should look at?

Stacktrace examples:

Unhandled exception. StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=0KiB, inbound=0KiB, 7957ms elapsed, timeout is 5000ms), command=SETEX, next: SETEX 1432024, inst: 0, qu: 0, qs: 3, aw: True, bw: SpinningDown, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 15, out-pipe: 0, last-in: 2, cur-in: 15, sync-ops: 0, async-ops: 1012309, serverEndpoint: redis-cache:6379, conn-sec: 1388.4, aoc: 1, mc: 1/1/0, mgr: 9 of 10 available, clientName: slurper-service-74c679b4cd-bqdjg(SE.Redis-v2.6.122.38350), IOCP: (Busy=0,Free=1000,Min=4,Max=1000), WORKER: (Busy=4,Free=32763,Min=16,Max=32767), POOL: (Threads=16,QueuedItems=1,CompletedItems=4070933,Timers=8), v: 2.6.122.38350 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at VATSIM.Network.Slurper.Service.Services.FsdService.AddToRedis(Object sender, ElapsedEventArgs e) in /src/VATSIM.Network.Slurper.Service/Services/FsdService.cs:line 388
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
Unhandled exception. StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=0KiB, inbound=0KiB, 7957ms elapsed, timeout is 5000ms), command=SETEX, next: SETEX 1432024, inst: 0, qu: 0, qs: 3, aw: True, bw: SpinningDown, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 15, out-pipe: 0, last-in: 2, cur-in: 15, sync-ops: 0, async-ops: 1012309, serverEndpoint: redis-cache:6379, conn-sec: 1388.42, aoc: 1, mc: 1/1/0, mgr: 9 of 10 available, clientName: slurper-service-74c679b4cd-bqdjg(SE.Redis-v2.6.122.38350), IOCP: (Busy=0,Free=1000,Min=4,Max=1000), WORKER: (Busy=4,Free=32763,Min=16,Max=32767), POOL: (Threads=16,QueuedItems=0,CompletedItems=4070936,Timers=8), v: 2.6.122.38350 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at VATSIM.Network.Slurper.Service.Services.FsdService.AddToRedis(Object sender, ElapsedEventArgs e) in /src/VATSIM.Network.Slurper.Service/Services/FsdService.cs:line 389
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
Unhandled exception. Unhandled exception. StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=0KiB, inbound=0KiB, 6213ms elapsed, timeout is 5000ms), command=SETEX, next: SETEX AXM3558, inst: 0, qu: 1, qs: 8, aw: True, bw: CheckingForTimeoutComplete, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 40, out-pipe: 0, last-in: 2, cur-in: 40, sync-ops: 0, async-ops: 344430, serverEndpoint: redis-cache:6379, conn-sec: 466.82, aoc: 1, mc: 1/1/0, mgr: 9 of 10 available, clientName: slurper-service-74c679b4cd-bqdjg(SE.Redis-v2.6.122.38350), IOCP: (Busy=0,Free=1000,Min=4,Max=1000), WORKER: (Busy=7,Free=32760,Min=16,Max=32767), POOL: (Threads=18,QueuedItems=2,CompletedItems=1400924,Timers=9), v: 2.6.122.38350 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at VATSIM.Network.Slurper.Service.Services.FsdService.AddToRedis(Object sender, ElapsedEventArgs e) in /src/VATSIM.Network.Slurper.Service/Services/FsdService.cs:line 389
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()
StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=0KiB, inbound=0KiB, 6213ms elapsed, timeout is 5000ms), command=SETEX, next: SETEX AXM3558, inst: 0, qu: 1, qs: 8, aw: True, bw: CheckingForTimeoutComplete, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 40, out-pipe: 0, last-in: 2, cur-in: 40, sync-ops: 0, async-ops: 344430, serverEndpoint: redis-cache:6379, conn-sec: 466.82, aoc: 1, mc: 1/1/0, mgr: 9 of 10 available, clientName: slurper-service-74c679b4cd-bqdjg(SE.Redis-v2.6.122.38350), IOCP: (Busy=0,Free=1000,Min=4,Max=1000), WORKER: (Busy=6,Free=32761,Min=16,Max=32767), POOL: (Threads=18,QueuedItems=1,CompletedItems=1400924,Timers=9), v: 2.6.122.38350 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at VATSIM.Network.Slurper.Service.Services.FsdService.AddToRedis(Object sender, ElapsedEventArgs e) in /src/VATSIM.Network.Slurper.Service/Services/FsdService.cs:line 389
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()
Unhandled exception. Unhandled exception. StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=1KiB, inbound=0KiB, 6213ms elapsed, timeout is 5000ms), command=SETEX, next: SETEX AXM3558, inst: 0, qu: 0, qs: 8, aw: True, bw: SpinningDown, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 40, out-pipe: 0, last-in: 2, cur-in: 40, sync-ops: 0, async-ops: 344430, serverEndpoint: redis-cache:6379, conn-sec: 466.8, aoc: 1, mc: 1/1/0, mgr: 9 of 10 available, clientName: slurper-service-74c679b4cd-bqdjg(SE.Redis-v2.6.122.38350), IOCP: (Busy=0,Free=1000,Min=4,Max=1000), WORKER: (Busy=6,Free=32761,Min=16,Max=32767), POOL: (Threads=18,QueuedItems=0,CompletedItems=1400923,Timers=8), v: 2.6.122.38350 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at VATSIM.Network.Slurper.Service.Services.FsdService.AddToRedis(Object sender, ElapsedEventArgs e) in /src/VATSIM.Network.Slurper.Service/Services/FsdService.cs:line 389
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()
StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=0KiB, inbound=0KiB, 6205ms elapsed, timeout is 5000ms), command=SETEX, next: SETEX AXM3558, inst: 0, qu: 1, qs: 8, aw: True, bw: CheckingForTimeoutComplete, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 40, out-pipe: 0, last-in: 2, cur-in: 40, sync-ops: 0, async-ops: 344430, serverEndpoint: redis-cache:6379, conn-sec: 466.82, aoc: 1, mc: 1/1/0, mgr: 9 of 10 available, clientName: slurper-service-74c679b4cd-bqdjg(SE.Redis-v2.6.122.38350), IOCP: (Busy=0,Free=1000,Min=4,Max=1000), WORKER: (Busy=8,Free=32759,Min=16,Max=32767), POOL: (Threads=18,QueuedItems=3,CompletedItems=1400924,Timers=9), v: 2.6.122.38350 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at VATSIM.Network.Slurper.Service.Services.FsdService.AddToRedis(Object sender, ElapsedEventArgs e) in /src/VATSIM.Network.Slurper.Service/Services/FsdService.cs:line 389
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()
NickCraver commented 10 months ago

What's the max size of payload that you're sending here? It may be bandwidth exhaustion from very large payloads, but that is quite the timeout happening - any idea what size you're sending?

williammck commented 10 months ago

Depends on the key - each value is up to a kilobyte max, but usually hovers around 64B - 128B. We are sending up to a few thousand of these all at once. But even pipelined, and with some headroom, I'd be shocked if we're upwards of a megabyte in one go. What's weird too is it's been behaving a lot more like normal this week, but we're seeing the same amount of load, if not more, on the service and the node it's living on. I'm a bit puzzled as to why that'd be happening.