esa / pygmo2

A Python platform to perform parallel computations of optimisation tasks (global and local) via the asynchronous generalized island model.
https://esa.github.io/pygmo2/
Mozilla Public License 2.0

allowing nested parallelism in python multiprocessing #24

Open darioizzo opened 4 years ago

darioizzo commented 4 years ago

Right now, an archipelago running, for example, a bfe-based algorithm does not work when Python multiprocessing islands are combined with a multiprocessing bfe. The reason is that daemonic processes are not allowed to create child processes, which rules out nested parallelism:

AssertionError: daemonic processes are not allowed to have children

A possible solution would be to use non-daemonic processes in the Python multiprocessing module.

See related discussions:
https://stackoverflow.com/questions/28491558/launching-nested-processes-in-multiprocessing
https://stackoverflow.com/questions/6974695/python-process-pool-non-daemonic
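As a stopgap, code that may end up running inside a daemonic worker could check the daemon flag of the current process and fall back to serial evaluation instead of hitting the assertion. This is only a sketch using the standard library: `in_daemonic_process` and `evaluate_batch` are hypothetical helper names, not part of pygmo, and the serial fallback is an assumption about what would be acceptable behaviour.

```python
import multiprocessing


def in_daemonic_process():
    """Return True if the current process is daemonic (cannot have children)."""
    return bool(multiprocessing.current_process().daemon)


def evaluate_batch(func, xs, processes=2):
    """Map func over xs, in parallel when allowed, serially otherwise.

    func must be picklable (e.g. a module-level function) for the pool path.
    """
    if in_daemonic_process():
        # A daemonic process may not start children: evaluate serially.
        return [func(x) for x in xs]
    with multiprocessing.Pool(processes) as pool:
        return pool.map(func, xs)
```

This avoids the error but gives up the inner parallelism, so it does not replace a proper fix.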

darioizzo commented 4 years ago

This simple code snippet illustrates the issue:

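import pygmo as pg

class my_uda:
    def __init__(self):
        # Batch fitness evaluator backed by a Python multiprocessing pool.
        self.bfe = pg.bfe(pg.mp_bfe())
    def evolve(self, pop):
        # Use the population's own problem rather than a global variable.
        prob = pop.problem
        n = prob.get_nx()
        # Evaluate len(pop) dummy decision vectors in one batch.
        # This is the nested parallel call that fails inside an mp island.
        self.bfe(prob, [1.0] * n * len(pop))
        return pop

algo = pg.algorithm(my_uda())
prob = pg.problem(pg.ackley(10))
a = pg.archipelago(n=8, algo=algo, prob=prob, pop_size=20)
a.evolve(20)
# Re-raises any exception thrown inside the islands.
a.wait_check()

darioizzo commented 4 years ago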

For the case of a pool, one could use this: https://stackoverflow.com/questions/6974695/python-process-pool-non-daemonic

bluescarni commented 4 years ago

The problem, in my view, is that even if we manage to coerce Python into supporting this, we still end up in a situation of nested parallelism with heavy resource oversubscription.

E.g., on an eight-core machine, the example above ends up firing up 64 processes (8 islands, each with 8 processes for its algorithm).
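The arithmetic, and one naive way to keep the total within the core count, can be sketched like this. Both helper names are hypothetical, and splitting the cores evenly among islands is just one possible policy, not something pygmo does.

```python
import os


def nested_process_count(n_islands, workers_per_island):
    """Total worker processes when every island starts its own inner pool."""
    return n_islands * workers_per_island


def inner_pool_size(n_islands, total_cores=None):
    """Even split of the available cores among islands, at least 1 each."""
    if total_cores is None:
        # os.cpu_count() may return None on some platforms.
        total_cores = os.cpu_count() or 1
    return max(1, total_cores // n_islands)
```

With 8 islands on 8 cores the even split leaves a single worker per island, i.e. no inner parallelism at all, which is why a scheduler that is aware of both levels would be preferable.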

Perhaps we should investigate some Pythonic multiprocessing paradigm that does the right thing with respect to nested parallelism, e.g., ipyparallel or dask? Basically, we need something like TBB for Python to properly support this use case.

darioizzo commented 4 years ago

That would be the dask island and a bfe_dask, right?

bluescarni commented 4 years ago

> That would be the dask island and a bfe_dask, right?

I did some reading yesterday evening and, unfortunately, I don't think so any more. From what I have understood, dask does support some form of nested parallelism, but apparently it requires you to define a graph of tasks in order to do so, which is infeasible for the pygmo use case.

I am not 100% sure I understood correctly, however. Perhaps there's still a chance it would work, but for sure we would need to spend some time working with dask in order to understand it.

bluescarni commented 2 years ago

So I just discovered that Intel now also provides a version of TBB for Python, so perhaps there's some hope for nested parallelism to work:

https://anaconda.org/conda-forge/tbb4py
https://pypi.org/project/tbb/