joblib / loky

Robust and reusable Executor for joblib
http://loky.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

BUG call_queue size is not updated on resize #396

Open tomMoral opened 1 year ago

tomMoral commented 1 year ago

When resizing the reusable_executor, the call_queue is not resized. Its size is initially set to 2 * n_workers + 1, so if the executor is enlarged a lot, performance can be suboptimal because some workers might starve.
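
To make the size mismatch concrete, here is a rough back-of-the-envelope sketch. It only reproduces the 2 * n_workers + 1 formula quoted above; the helper names are hypothetical and are not part of loky, and the "idle workers" count assumes a simplified model in which every worker tries to take one item from a full queue at the same moment.

```python
def call_queue_size(n_workers):
    # Size formula quoted in this issue: 2 * n_workers + 1.
    return 2 * n_workers + 1


def idle_workers_after_resize(initial_workers, resized_workers):
    # Hypothetical helper: the queue keeps the size computed for the
    # initial worker count, so at most that many items are available
    # at once, however many workers exist after the resize.
    stale_size = call_queue_size(initial_workers)
    return max(0, resized_workers - stale_size)


if __name__ == "__main__":
    # Executor created with 2 workers, then enlarged to 32.
    print("stale queue size:", call_queue_size(2))
    print("size a fresh executor would use:", call_queue_size(32))
    print("workers without a queued item:", idle_workers_after_resize(2, 32))
```

Under this simplified model, the larger the resize relative to the initial worker count, the more workers can find the queue empty.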

We might want to consider whether we can resize the call_queue (probably hard, as the size is defined with a BoundedSemaphore and it is hard to change it in all workers) or find a heuristic that works well.
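
As a small illustration of why a bound held in a BoundedSemaphore is awkward to enlarge, the sketch below uses `threading.BoundedSemaphore` as a stand-in for the semaphore guarding the queue. This is an assumption made for portability, not loky's actual code, and `extra_slots_accepted` is a hypothetical helper. It shows that the bound is fixed at construction: trying to add capacity by releasing beyond the initial value raises `ValueError`.

```python
import threading


def extra_slots_accepted(initial_size, requested_size):
    # Stand-in for the semaphore that bounds the call queue.
    sem = threading.BoundedSemaphore(initial_size)
    accepted = 0
    # Try to "grow" the bound by releasing extra slots.
    for _ in range(requested_size - initial_size):
        try:
            sem.release()
        except ValueError:
            # The bound fixed at construction cannot be exceeded.
            break
        accepted += 1
    return accepted


if __name__ == "__main__":
    # Bound created for 2 workers (5), target size for 32 workers (65).
    print("extra slots accepted:", extra_slots_accepted(5, 65))
```

Growing the queue would therefore presumably mean replacing the semaphore itself, and every worker process holding a reference to the old one would need to pick up the new one.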

ogrisel commented 1 year ago

This problem is the cause of #397 (slow shutdown when the call queue size is much smaller than the max_workers used to resize the pool).

A partial mitigation is implemented in #399 but we should still fix the root cause to avoid starving workers after a large resize.