StackExchange / StackExchange.Redis

General purpose redis client
https://stackexchange.github.io/StackExchange.Redis/
Other
5.91k stars 1.51k forks source link

Would appreciate assistance solving timeout issues #2560

Open williammck opened 1 year ago

williammck commented 1 year ago

Hey folks,

I've been battling recent issues with one of our workers regularly getting Redis timeouts and would appreciate some help.

A bit of overview:

I've tried upping the threads in the pool up a bit (set ThreadPoolMinThreads to 16), but that didn't seem to help much. Should I increase it further?

I've looked into SuperSimpleTcp and it seems to be doing everything in tasks/async, so I'd hope we're not blocking, but I can't quite tell.

Any ideas or things I should look at?

Stacktrace examples:

Unhandled exception. StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=0KiB, inbound=0KiB, 7957ms elapsed, timeout is 5000ms), command=SETEX, next: SETEX 1432024, inst: 0, qu: 0, qs: 3, aw: True, bw: SpinningDown, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 15, out-pipe: 0, last-in: 2, cur-in: 15, sync-ops: 0, async-ops: 1012309, serverEndpoint: redis-cache:6379, conn-sec: 1388.4, aoc: 1, mc: 1/1/0, mgr: 9 of 10 available, clientName: slurper-service-74c679b4cd-bqdjg(SE.Redis-v2.6.122.38350), IOCP: (Busy=0,Free=1000,Min=4,Max=1000), WORKER: (Busy=4,Free=32763,Min=16,Max=32767), POOL: (Threads=16,QueuedItems=1,CompletedItems=4070933,Timers=8), v: 2.6.122.38350 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at VATSIM.Network.Slurper.Service.Services.FsdService.AddToRedis(Object sender, ElapsedEventArgs e) in /src/VATSIM.Network.Slurper.Service/Services/FsdService.cs:line 388
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
Unhandled exception. StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=0KiB, inbound=0KiB, 7957ms elapsed, timeout is 5000ms), command=SETEX, next: SETEX 1432024, inst: 0, qu: 0, qs: 3, aw: True, bw: SpinningDown, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 15, out-pipe: 0, last-in: 2, cur-in: 15, sync-ops: 0, async-ops: 1012309, serverEndpoint: redis-cache:6379, conn-sec: 1388.42, aoc: 1, mc: 1/1/0, mgr: 9 of 10 available, clientName: slurper-service-74c679b4cd-bqdjg(SE.Redis-v2.6.122.38350), IOCP: (Busy=0,Free=1000,Min=4,Max=1000), WORKER: (Busy=4,Free=32763,Min=16,Max=32767), POOL: (Threads=16,QueuedItems=0,CompletedItems=4070936,Timers=8), v: 2.6.122.38350 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at VATSIM.Network.Slurper.Service.Services.FsdService.AddToRedis(Object sender, ElapsedEventArgs e) in /src/VATSIM.Network.Slurper.Service/Services/FsdService.cs:line 389
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
Unhandled exception. Unhandled exception. StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=0KiB, inbound=0KiB, 6213ms elapsed, timeout is 5000ms), command=SETEX, next: SETEX AXM3558, inst: 0, qu: 1, qs: 8, aw: True, bw: CheckingForTimeoutComplete, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 40, out-pipe: 0, last-in: 2, cur-in: 40, sync-ops: 0, async-ops: 344430, serverEndpoint: redis-cache:6379, conn-sec: 466.82, aoc: 1, mc: 1/1/0, mgr: 9 of 10 available, clientName: slurper-service-74c679b4cd-bqdjg(SE.Redis-v2.6.122.38350), IOCP: (Busy=0,Free=1000,Min=4,Max=1000), WORKER: (Busy=7,Free=32760,Min=16,Max=32767), POOL: (Threads=18,QueuedItems=2,CompletedItems=1400924,Timers=9), v: 2.6.122.38350 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at VATSIM.Network.Slurper.Service.Services.FsdService.AddToRedis(Object sender, ElapsedEventArgs e) in /src/VATSIM.Network.Slurper.Service/Services/FsdService.cs:line 389
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()
StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=0KiB, inbound=0KiB, 6213ms elapsed, timeout is 5000ms), command=SETEX, next: SETEX AXM3558, inst: 0, qu: 1, qs: 8, aw: True, bw: CheckingForTimeoutComplete, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 40, out-pipe: 0, last-in: 2, cur-in: 40, sync-ops: 0, async-ops: 344430, serverEndpoint: redis-cache:6379, conn-sec: 466.82, aoc: 1, mc: 1/1/0, mgr: 9 of 10 available, clientName: slurper-service-74c679b4cd-bqdjg(SE.Redis-v2.6.122.38350), IOCP: (Busy=0,Free=1000,Min=4,Max=1000), WORKER: (Busy=6,Free=32761,Min=16,Max=32767), POOL: (Threads=18,QueuedItems=1,CompletedItems=1400924,Timers=9), v: 2.6.122.38350 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at VATSIM.Network.Slurper.Service.Services.FsdService.AddToRedis(Object sender, ElapsedEventArgs e) in /src/VATSIM.Network.Slurper.Service/Services/FsdService.cs:line 389
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()
Unhandled exception. Unhandled exception. StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=1KiB, inbound=0KiB, 6213ms elapsed, timeout is 5000ms), command=SETEX, next: SETEX AXM3558, inst: 0, qu: 0, qs: 8, aw: True, bw: SpinningDown, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 40, out-pipe: 0, last-in: 2, cur-in: 40, sync-ops: 0, async-ops: 344430, serverEndpoint: redis-cache:6379, conn-sec: 466.8, aoc: 1, mc: 1/1/0, mgr: 9 of 10 available, clientName: slurper-service-74c679b4cd-bqdjg(SE.Redis-v2.6.122.38350), IOCP: (Busy=0,Free=1000,Min=4,Max=1000), WORKER: (Busy=6,Free=32761,Min=16,Max=32767), POOL: (Threads=18,QueuedItems=0,CompletedItems=1400923,Timers=8), v: 2.6.122.38350 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at VATSIM.Network.Slurper.Service.Services.FsdService.AddToRedis(Object sender, ElapsedEventArgs e) in /src/VATSIM.Network.Slurper.Service/Services/FsdService.cs:line 389
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()
StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=0KiB, inbound=0KiB, 6205ms elapsed, timeout is 5000ms), command=SETEX, next: SETEX AXM3558, inst: 0, qu: 1, qs: 8, aw: True, bw: CheckingForTimeoutComplete, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 40, out-pipe: 0, last-in: 2, cur-in: 40, sync-ops: 0, async-ops: 344430, serverEndpoint: redis-cache:6379, conn-sec: 466.82, aoc: 1, mc: 1/1/0, mgr: 9 of 10 available, clientName: slurper-service-74c679b4cd-bqdjg(SE.Redis-v2.6.122.38350), IOCP: (Busy=0,Free=1000,Min=4,Max=1000), WORKER: (Busy=8,Free=32759,Min=16,Max=32767), POOL: (Threads=18,QueuedItems=3,CompletedItems=1400924,Timers=9), v: 2.6.122.38350 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
   at VATSIM.Network.Slurper.Service.Services.FsdService.AddToRedis(Object sender, ElapsedEventArgs e) in /src/VATSIM.Network.Slurper.Service/Services/FsdService.cs:line 389
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()
NickCraver commented 1 year ago

What's the max size of payload that you're sending here? It may be bandwidth exhaustion from very large payloads, but that is quite the timeout happening - any idea what size you're sending?

williammck commented 1 year ago

Depends on the key - each value is up to a kilobyte max, but usually hovers around 64B - 128B. We are sending up to a few thousand of these all at once. But even pipelined, and with some headroom, I'd be shocked if we're upwards of a megabyte in one go. What's weird too is it's been behaving a lot more like normal this week, but we're seeing the same amount of load, if not more, on the service and the node it's living on. I'm a bit puzzled as to why that'd be happening.