Netflix / Hystrix

Hystrix is a latency and fault tolerance library designed to isolate points of access to remote systems, services and 3rd party libraries, stop cascading failure and enable resilience in complex distributed systems where failure is inevitable.

Threadpool rejection #2001

Open ek-ex opened 4 years ago

ek-ex commented 4 years ago

Hi,

I'm tuning the Hystrix configuration for a service in production. There is something that I'm not able to figure out, and I would like your help.

To investigate, I created a service A that runs on port 8081 and returns an object with a random UUID. It has an admin endpoint that lets me change the Thread.sleep delay in that service. I then created a service B that calls service A behind a Hystrix command, using RestTemplate with connection pooling.

Hystrix threadpool = 10
Requests per second = 20
RestTemplate connections = 50
All other Hystrix values are defaults.

With no artificial delays, the duration of service A measured from service B is ~5ms.

The problem is that I start getting rejections as I increase the duration of service A. By my math, at 20 rps with a Hystrix threadpool size of 10, I should be able to increase the duration to around 500ms before requests start to be rejected because the threadpool is full.
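To spell out the arithmetic behind that ~500ms estimate, here is a minimal sketch using Little's law (average concurrency = arrival rate × duration). The class and method names are hypothetical, and the calculation only describes the average number of busy threads; it assumes nothing about how evenly the requests arrive.

```java
public class PoolSizing {
    // Little's law: average number of in-flight requests = rate * duration.
    static double averageConcurrency(double requestsPerSecond, double latencyMillis) {
        return requestsPerSecond * latencyMillis / 1000.0;
    }

    // Duration at which the average concurrency equals the pool size.
    static double saturationLatencyMillis(int poolSize, double requestsPerSecond) {
        return poolSize / requestsPerSecond * 1000.0;
    }

    public static void main(String[] args) {
        // 20 rps at 250ms keeps 5 threads busy on average.
        System.out.println(averageConcurrency(20, 250));
        // A pool of 10 at 20 rps is fully busy on average at 500ms.
        System.out.println(saturationLatencyMillis(10, 20));
    }
}
```

Because this is an average, a momentary burst can still need more than 10 threads even when the average is well below 10.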

However, even at a 250ms duration, about 2% of requests are rejected.
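One possible way to reason about this number is the Erlang B formula, which gives the fraction of requests rejected by a pool with no queue when arrivals are random (Poisson) rather than evenly spaced. Whether the load generator used here produces such arrivals is an assumption, and the class below is a hypothetical sketch, not anything Hystrix provides.

```java
public class ErlangB {
    // Blocking probability for `servers` threads, no queue, and an offered
    // load of (arrival rate * mean duration), using the standard recurrence
    // B(0) = 1, B(n) = a * B(n-1) / (n + a * B(n-1)).
    static double blockingProbability(int servers, double offeredLoad) {
        double b = 1.0;
        for (int n = 1; n <= servers; n++) {
            b = offeredLoad * b / (n + offeredLoad * b);
        }
        return b;
    }

    public static void main(String[] args) {
        double offeredLoad = 20 * 0.250; // 20 rps * 250ms = 5 busy threads on average
        // Assumes Poisson arrivals; prints roughly 0.018 for a pool of 10.
        System.out.printf("%.4f%n", blockingProbability(10, offeredLoad));
    }
}
```

Under those assumptions the model predicts a rejection rate of just under 2% at 250ms, which is in the same range as the rate observed above.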

java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@11c1be31[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@4c903666[Wrapped task = null]] rejected from java.util.concurrent.ThreadPoolExecutor@2694a13b[Running, pool size = 10, active threads = 10, queued tasks = 0, completed tasks = 11493]
	at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055) ~[na:na]
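The trace shows 10 active threads and 0 queued tasks, so a request arriving at that instant has nowhere to go. The sketch below reproduces the same exception with a plain ThreadPoolExecutor. It assumes the pool is backed by a SynchronousQueue, which, as far as I understand, is what Hystrix uses when maxQueueSize is left at its default of -1; the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionDemo {
    // Fills every thread of a queue-less pool, then submits one more task.
    // Returns true if that extra task is rejected.
    static boolean extraTaskRejected(int poolSize) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>()); // no buffering; default AbortPolicy
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocked = () -> {
            try {
                release.await(); // hold the thread, like a slow remote call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            for (int i = 0; i < poolSize; i++) {
                pool.execute(blocked); // each call starts a new worker thread
            }
            try {
                pool.execute(blocked); // all threads busy and no queue
                return false;
            } catch (RejectedExecutionException e) {
                return true;
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(extraTaskRejected(10));
    }
}
```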

I would appreciate any pointers. Thank you.