StackExchange / StackExchange.Redis

General purpose redis client
https://stackexchange.github.io/StackExchange.Redis/

Why does StackExchange.Redis increase the count of busy worker threads? #2608

Open ranpariyachetan opened 7 months ago

ranpariyachetan commented 7 months ago

Hi,

I am working on a .NET 4.8 ASP.NET MVC application. The application is entirely Sync. My goal is to switch the application's cache layer from a Memcached cluster to a Redis cluster. Both clusters are hosted on AWS ElastiCache.

I am currently comparing the latencies of get/set operations between Memcached and Redis. With Redis, I started seeing a lot of timeout errors and also large latencies (more than 1ms) on get/set operations.

I followed the documentation here (https://stackexchange.github.io/StackExchange.Redis/Timeouts.html) and tuned the Min Worker Thread configuration, because the exception messages indicated that the busy worker count was higher than the min worker thread count.
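For readers following along, raising the minimum worker thread count in code can be sketched roughly as below. This is a minimal illustration, not the configuration actually used in this application: the class name `ThreadPoolTuning`, the method name `EnsureMinWorkerThreads`, and the value passed in are hypothetical. Only `ThreadPool.GetMinThreads` and `ThreadPool.SetMinThreads` are real .NET APIs.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of StackExchange.Redis or the original post).
public static class ThreadPoolTuning
{
    // Raises the minimum worker thread count to at least desiredWorkers,
    // leaving the minimum IOCP thread count as it is.
    // Returns true if the minimum is already high enough or was changed successfully.
    public static bool EnsureMinWorkerThreads(int desiredWorkers)
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minIocp);

        if (minWorker >= desiredWorkers)
        {
            return true; // nothing to do, never lower an existing setting
        }

        // SetMinThreads returns false if the request is rejected,
        // for example when it exceeds the configured maximum.
        return ThreadPool.SetMinThreads(desiredWorkers, minIocp);
    }
}
```

In an ASP.NET application a call such as `ThreadPoolTuning.EnsureMinWorkerThreads(200)` would typically be made once at startup (for example from `Application_Start`). The value 200 is only a placeholder and would need to be chosen from measurements.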

I have the following question about worker thread usage by the StackExchange.Redis client.

My application has a throughput of 3000 RPM per server, running on 50 servers (each server has 8 CPUs), with the Min Worker Thread count set to its default value. When the application is not performing any Redis operations, the busy worker count stays between 7 and 16. But when the application starts using Redis, the busy worker thread count increases to as high as 40, sometimes even higher. That eventually leads to Redis operations timing out and/or taking more than 1ms to complete, which increases the overall latency of the application and breaches SLO thresholds. The only difference is that the application is performing get/set operations on Redis.

A singleton ConnectionMultiplexer is used for Redis operations. All the operations are on the String type. Payload size varies from 10 bytes to 3.5 KB. The application servers and the Redis cluster are within the same VPC and AWS region.

From other issues and documentation I learned that the StackExchange.Redis library does not use threads from the .NET thread pool. Hence I am curious to know why, in my case, the busy worker thread count increases only when Redis is used and stays well under control otherwise.

What can I do to know which other parts of the application are eating through Worker Threads?
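One low-cost starting point is to log the thread pool counters periodically and correlate the spikes with request logs. The sketch below is an assumption-laden illustration: the class name `ThreadPoolSnapshot` is hypothetical, and "busy" is computed here as maximum minus available threads, which is assumed (not confirmed in this thread) to correspond to the Busy counter in the timeout exception messages.

```csharp
using System;
using System.Threading;

// Hypothetical diagnostic helper (not part of StackExchange.Redis).
public static class ThreadPoolSnapshot
{
    // Returns a one-line summary of the current thread pool state.
    // Busy is derived as (max - available), which is an assumption about
    // how the counters in the exception message are calculated.
    public static string Describe()
    {
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIocp);
        ThreadPool.GetAvailableThreads(out int freeWorker, out int freeIocp);
        ThreadPool.GetMinThreads(out int minWorker, out int minIocp);

        int busyWorker = maxWorker - freeWorker;
        int busyIocp = maxIocp - freeIocp;

        return $"WORKER: (Busy={busyWorker},Free={freeWorker},Min={minWorker},Max={maxWorker}) " +
               $"IOCP: (Busy={busyIocp},Free={freeIocp},Min={minIocp},Max={maxIocp})";
    }
}
```

Logging `ThreadPoolSnapshot.Describe()` from a timer only shows when the busy count rises, not which code holds the threads. To see that, a process dump or profiler capture taken during a spike, with the thread stacks inspected, is likely needed.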

I am somewhat blocked because I cannot predict the impact on the production environment once it starts using Redis.

I can provide more information if needed.

Thanks and regards, Chetan Ranpariya

NickCraver commented 6 months ago

If you have an example exception, we can see numbers for sure, but "The application is entirely Sync." is likely your core issue. When you do sync things that are long running, that's going to eat up threads. We'd advise all remote I/O be done with async, specifically so that you don't exhaust the thread pool as you're seeing here. That's the #1 use case for async/await really, because of exactly this issue.

Whether it's a disk read or Redis or SQL or anything: it's the latency of the network call, and reserving a thread for that amount of time, that's at the heart of the issue here. But again, if you have some exception messages with all the counters in them, we can use those to confirm this and point to more specifics :)
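As a rough illustration of the sync versus async difference described above, here is a minimal sketch. The class name `RedisCache`, its method names, and the `localhost:6379` endpoint are hypothetical; the StackExchange.Redis calls used (`ConnectionMultiplexer.Connect`, `GetDatabase`, `StringGet`, `StringGetAsync`, `StringSetAsync`) are from the library's public API. It assumes the StackExchange.Redis NuGet package is referenced and a Redis server is reachable.

```csharp
using System;
using System.Threading.Tasks;
using StackExchange.Redis;

// Hypothetical cache wrapper, for illustration only.
public static class RedisCache
{
    // Single shared multiplexer, created lazily on first use.
    // The endpoint is a placeholder, not the poster's configuration.
    private static readonly Lazy<ConnectionMultiplexer> Connection =
        new Lazy<ConnectionMultiplexer>(() => ConnectionMultiplexer.Connect("localhost:6379"));

    private static IDatabase Db => Connection.Value.GetDatabase();

    // Synchronous read: the calling worker thread is blocked
    // until the reply arrives from Redis.
    public static string GetSync(string key)
    {
        return Db.StringGet(key);
    }

    // Asynchronous read: the calling thread is released while the
    // request is in flight and a continuation runs when the reply arrives.
    public static async Task<string> GetAsync(string key)
    {
        // ConfigureAwait(false) avoids resuming on the captured ASP.NET context.
        RedisValue value = await Db.StringGetAsync(key).ConfigureAwait(false);
        return value; // implicit RedisValue -> string conversion; null if the key is missing
    }

    // Asynchronous write with an expiry.
    public static Task<bool> SetAsync(string key, string value, TimeSpan ttl)
    {
        return Db.StringSetAsync(key, value, ttl);
    }
}
```

For the async path to help, it has to be awaited all the way up the call chain (for example from an `async Task<ActionResult>` MVC action). Calling `.Result` or `.Wait()` on the returned task would block a worker thread again.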