joblib / loky

Robust and reusable Executor for joblib
http://loky.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

Is it safe to use ProcessPoolExecutor from loky? #374

Closed shoubhikraj closed 1 year ago

shoubhikraj commented 1 year ago

The documentation says that the main functionality of loky can be obtained from the get_reusable_executor() function, which manages a single instance of a reusable process pool.

However, what happens when I need nested parallelism (i.e. a process pool used inside another process pool)? The reusable executor cannot be used. But I see that loky has its own implementation of ProcessPoolExecutor, which is supposed to be very similar in usage to the concurrent.futures implementation.

Am I safe to use just ProcessPoolExecutor from loky, and use pool.submit() or pool.map()? Can it do nested parallelism?

ogrisel commented 1 year ago

Good question. I don't recall what we do in this case. Apparently this does not deadlock but it does not fully respect the contract of not spawning new processes when using get_reusable_executor(max_workers=2):

>>> import loky
>>> import os
>>> def inner(*args):
...     e = loky.get_reusable_executor(max_workers=2)
...     return list(set(e.map(lambda _: os.getpid(), range(10000))))
... 
>>> inner()
[52700, 52701]
>>> inner()
[52700, 52701]
>>> e = loky.get_reusable_executor(max_workers=2)
>>> list(e.map(inner, range(10)))
[[52729, 52731], [52728, 52730], [52729, 52731], [52728, 52730], [52728, 52730], [52729, 52731], [52728, 52730], [52729, 52731], [52728, 52730], [52729, 52731]]

Maybe we just switch to sequential execution when we detect nesting of a reusable executor under a reusable executor. I could not find anything specific related to nesting detection in the reusable pool class itself.

ogrisel commented 1 year ago

On the positive side, it does not deadlock apparently...

ogrisel commented 1 year ago

Maybe @tomMoral remembers more details about this.

shoubhikraj commented 1 year ago

@ogrisel Thanks very much!

I actually ended up using ProcessPoolExecutor in the outer part and then get_reusable_executor() in the inner part of the nested parallel loop. It seems to work so far.

ogrisel commented 1 year ago

In retrospect I think what we observe in https://github.com/joblib/loky/issues/374#issuecomment-1434223927 is the expected behavior:

Each first-level worker process creates its own reusable pool. So it works.