uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

suggest pathos support initializer parameter for pathos.multiprocessing.ProcessPool #220

Closed liningbo closed 1 year ago

liningbo commented 3 years ago

Sometimes we need to do some initialization work when a new process starts, but pathos.multiprocessing.ProcessPool does not seem to support this. multiprocessing.Pool has an initializer parameter, so I suggest that pathos.multiprocessing.ProcessPool support an initializer parameter in the same way as multiprocessing.Pool. Thank you.

class Pool(object):
    '''
    Class which supports an async version of applying functions to arguments.
    '''
    _wrap_exception = True

    @staticmethod
    def Process(ctx, *args, **kwds):
        return ctx.Process(*args, **kwds)

    def __init__(self, processes=None, initializer=None, initargs=(),
                 maxtasksperchild=None, context=None):
        # Attributes initialized early to make sure that they exist in
        # __del__() if __init__() raises an exception
        self._pool = []
        self._state = INIT

        self._ctx = context or get_context()
        self._setup_queues()
        self._taskqueue = queue.SimpleQueue()
        # The _change_notifier queue exist to wake up self._handle_workers()
        # when the cache (self._cache) is empty or when there is a change in
        # the _state variable of the thread that runs _handle_workers.
        self._change_notifier = self._ctx.SimpleQueue()
        self._cache = _PoolCache(notifier=self._change_notifier)
        self._maxtasksperchild = maxtasksperchild
        self._initializer = initializer
        self._initargs = initargs

mmckerns commented 3 years ago

There is support for the initializer in pathos.multiprocessing._ProcessPool. Is that what you wanted?

liningbo commented 2 years ago

@mmckerns pathos.multiprocessing._ProcessPool is just an alias of multiprocessing.Pool. multiprocessing.Pool does not work well in some multiprocessing cases because of serialization problems, but pathos.multiprocessing._ProcessPool does, so I hope pathos.multiprocessing._ProcessPool can support the initializer parameter. Thank you for the reply.

mmckerns commented 2 years ago

pathos.multiprocessing._ProcessPool is identical to multiprocess.Pool, but not multiprocessing.Pool. The former uses dill, while the latter uses pickle. You will note that multiprocess.Pool (i.e. pathos.multiprocessing._ProcessPool) supports an initializer and has better serialization.

liningbo commented 2 years ago

@mmckerns Yes, I made a mistake in my last comment, as you said. I meant to say that pathos.multiprocessing.ProcessPool works well, but multiprocessing.Pool sometimes does not work well because of serialization problems. I hope pathos.multiprocessing.ProcessPool can support an initializer in addition to its better serialization.

mmckerns commented 2 years ago

To be clear, the initializer argument is already supported by pathos.multiprocessing._ProcessPool in pathos and by Pool in multiprocess. See #138 for a similar issue and response. That said, the interface for pathos.multiprocessing.ProcessPool may support an initializer in the near future.
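
For readers landing on this thread, here is a minimal sketch of the option described above: passing initializer and initargs to multiprocess.Pool (which, per the comments above, is the same class as pathos.multiprocessing._ProcessPool). It is not taken from the pathos documentation. The helper names (init_worker, tag, run, _state) are hypothetical. The fallback to the standard library and the explicit "fork" start method (POSIX only) are assumptions made only so the sketch runs on its own.

```python
import os

try:
    # multiprocess is the dill-based fork of the standard library module.
    import multiprocess as mp
except ImportError:
    # Assumption: fall back to the stdlib so the sketch runs without
    # multiprocess installed; the initializer/initargs signature is the same.
    import multiprocessing as mp

# Per-process storage; each worker fills in its own copy via init_worker.
_state = {}


def init_worker(prefix):
    """Runs once in every worker process right after it starts."""
    _state["prefix"] = prefix
    _state["pid"] = os.getpid()


def tag(x):
    """Task function that relies on the state prepared by init_worker."""
    return f"{_state['prefix']}-{x}"


def run(values, prefix="job"):
    # Assumption: a POSIX system where the "fork" start method is available.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=2, initializer=init_worker,
                  initargs=(prefix,)) as pool:
        # map() preserves input order, so results line up with values.
        return pool.map(tag, values)


if __name__ == "__main__":
    print(run([1, 2, 3]))
```

The initializer is called with initargs in each worker before it takes any tasks, so per-process setup (opening a connection, seeding a generator, loading a model) happens once per worker rather than once per task.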