Open tomMoral opened 1 year ago
This problem is the cause of #397 (slow shutdown when the call queue size is much smaller than the `max_workers` used to resize the pool).
A partial mitigation is implemented in #399 but we should still fix the root cause to avoid starving workers after a large resize.
When resizing the `reusable_executor`, the `call_queue` is not resized. It has a size `2 * n_workers + 1` originally, so if the executor is enlarged a lot, this can cause sub-optimal performance as some workers might starve.

We might want to consider if we can resize the `call_queue` (probably hard as the size is defined with a `BoundedSemaphore` and it is hard to change it in all workers) or find a heuristic that works well.
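
To give a rough idea of the gap, here is a small sketch of the arithmetic only. It does not touch the executor itself. The `2 * n_workers + 1` formula is the one quoted above; the helper names `call_queue_size` and `queue_shortfall` are made up for illustration and are not part of the library.

```python
def call_queue_size(n_workers: int) -> int:
    """Queue capacity using the formula quoted in this issue."""
    return 2 * n_workers + 1


def queue_shortfall(initial_workers: int, resized_workers: int) -> int:
    """Hypothetical helper: slots missing compared to a queue that
    would have been sized for the resized pool (0 if none missing)."""
    stale = call_queue_size(initial_workers)    # size kept after the resize
    wanted = call_queue_size(resized_workers)   # size a fresh executor would get
    return max(0, wanted - stale)


if __name__ == "__main__":
    for initial, resized in [(2, 4), (2, 64)]:
        stale = call_queue_size(initial)
        print(
            f"{initial} -> {resized} workers: queue holds {stale} items, "
            f"{queue_shortfall(initial, resized)} fewer than a fresh executor"
        )
```

With an executor created for 2 workers and then resized to 64, the queue capacity stays at 5, which is less than the number of workers, so most of them cannot have a task waiting for them at any given time.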