uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

How to run non-daemon pool #169

Closed Borda closed 5 years ago

Borda commented 5 years ago

Hello, I am wondering whether there is a way to run a non-daemon processing pool within pathos. My goal is to run a processing pool inside a worker of another processing pool. I have found a couple of tricks for the standard multiprocessing package, but none of them works for pathos...

from pathos.helpers import mp
class NoDaemonProcess(mp.Process):
    # make 'daemon' attribute always return False
    def _get_daemon(self):
        return False
    def _set_daemon(self, value):
        pass
    daemon = property(_get_daemon, _set_daemon)

from pathos.multiprocessing import ProcessPool
class NoDaemonProcessPool(ProcessPool):
    Process = NoDaemonProcess

Here I am using pathos.multiprocessing.ProcessPool instead of multiprocessing.pool.Pool, and the error message is:

  File ".../experiments.py", line 390, in iterate_map_parallel
    for out in mapping(wrap_func, iterate_vals):
  File "/home/jb/.local/lib/python3.6/site-packages/multiprocess/pool.py", line 735, in next
    raise value
AssertionError: daemonic processes are not allowed to have children
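
For context, the assertion fires because pool workers are started with their daemon flag set, and a daemonic process refuses to start children. The sketch below illustrates this with the standard library's multiprocessing as a stand-in, on the assumption that multiprocess behaves the same way here. The helper names are hypothetical, and the "fork" start method (POSIX only) is chosen just to keep the sketch short.

```python
import multiprocessing

# Assumption: the "fork" start method (POSIX only) is available; it keeps
# the sketch runnable without an `if __name__ == "__main__":` guard.
ctx = multiprocessing.get_context("fork")


def worker_is_daemon(_):
    """Return the daemon flag of the process that runs this task."""
    return multiprocessing.current_process().daemon


def try_nested_pool(_):
    """Try to open a pool inside a pool worker and report what happens."""
    try:
        with ctx.Pool(1) as inner:
            return inner.map(abs, [-1])
    except AssertionError as err:
        # Raised when the daemonic worker tries to start a child process.
        return str(err)


with ctx.Pool(2) as pool:
    flags = pool.map(worker_is_daemon, range(2))
    nested = pool.map(try_nested_pool, range(2))

print(flags)
print(nested)
```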

mmckerns commented 5 years ago

Sorry for the slow response. pathos pools are wrappers around multiprocess pools. Have you tried using pathos.multiprocessing.Pool? That's the raw multiprocess.Pool object without the pathos interface wrapper.
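
With a raw pool, one commonly cited way to get non-daemonic workers is to hand the pool a context whose Process class ignores the daemon flag. The sketch below uses the standard library's multiprocessing rather than multiprocess or pathos, so whether it carries over unchanged to pathos.multiprocessing.Pool is an assumption. NestablePool, NoDaemonContext, and inner_sum are hypothetical names, and the "fork" start method (POSIX only) is assumed.

```python
import multiprocessing
import multiprocessing.pool

# Assumption: "fork" start method (POSIX only), so no __main__ guard is needed.
_fork = multiprocessing.get_context("fork")


class NoDaemonProcess(_fork.Process):
    """Process whose daemon flag always reads False and ignores assignment."""

    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        pass  # the pool asks for daemon=True on its workers; drop the request


class NoDaemonContext(type(_fork)):
    """Context that hands NoDaemonProcess to the pool."""

    Process = NoDaemonProcess


class NestablePool(multiprocessing.pool.Pool):
    """Hypothetical pool whose workers may start pools of their own."""

    def __init__(self, *args, **kwargs):
        kwargs["context"] = NoDaemonContext()
        super().__init__(*args, **kwargs)


def inner_sum(n):
    """Runs inside an outer worker and opens an ordinary inner pool."""
    with _fork.Pool(2) as inner:
        return sum(inner.map(abs, range(-n, 1)))


with NestablePool(2) as outer:
    totals = outer.map(inner_sum, range(4))

print(totals)
```

Only the outer pool needs the custom context; the inner pools can stay daemonic as long as they do not start pools themselves.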

Borda commented 5 years ago

I played with it for a while and I believe I tried this too, but it ended the same way... :(

mmckerns commented 5 years ago

I tried your code, both using ProcessPool and Pool from pathos.multiprocessing and I don't receive an error.

from pathos.helpers import mp
class NoDaemonProcess(mp.Process):
    # make 'daemon' attribute always return False
    def _get_daemon(self):
        return False
    def _set_daemon(self, value):
        pass
    daemon = property(_get_daemon, _set_daemon)

from pathos.multiprocessing import ProcessPool
class NoDaemonProcessPool(ProcessPool):
    Process = NoDaemonProcess

if __name__ == '__main__':
    p = NoDaemonProcessPool()
    x = ProcessPool().map(lambda x:x, p.map(lambda x:x, range(4)))
    print(x)

Is this not what you intended to do? It works for me in all versions of python I have, using the most recent development versions of dill, multiprocess, and pathos.

mmckerns commented 5 years ago

Also works in the interpreter (only tested 2.7, as opposed to all versions above)

Python 2.7.16 (default, Apr  1 2019, 14:50:56) 
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from pathos.helpers import mp
>>> class NoDaemonProcess(mp.Process):
...     # make 'daemon' attribute always return False
...     def _get_daemon(self):
...         return False
...     def _set_daemon(self, value):
...         pass
...     daemon = property(_get_daemon, _set_daemon)
... 
>>> from pathos.multiprocessing import ProcessPool
>>> class NoDaemonProcessPool(ProcessPool):
...     Process = NoDaemonProcess
... 
>>> p = NoDaemonProcessPool()
>>> ProcessPool().map(lambda x:x, p.map(lambda x:x, range(4)))
[0, 1, 2, 3]
>>> 

Borda commented 5 years ago

@mmckerns it seems that your examples run fine, but somehow it crashes in my code... Using your classes, I have defined my own class with some parallel processing:

class Cls(object):
    """Sample

    >>> Cls().parallel()
    """
    vals = range(5)
    def _sum(self, nb):
        return NoDaemonProcessPool(2).map(sum, [self.vals] * nb)
    def parallel(self):
        return NoDaemonProcessPool(2).map(self._sum, range(10))

which crashes with the following:

Traceback (most recent call last):
      File "/home/jb/Applications/PyCharm-2019/helpers/pycharm/docrunner.py", line 140, in __run
        compileflags, 1), test.globs)
      File "<doctest Cls[0]>", line 1, in <module>
        Cls().parallel()
      File ".../experiments.py", line 500, in parallel
        return NoDaemonProcessPool(2).map(self._sum, range(10))
      File "/home/jb/.local/lib/python3.6/site-packages/pathos/multiprocessing.py", line 137, in map
        return _pool.map(star(f), zip(*args)) # chunksize
      File "/home/jb/.local/lib/python3.6/site-packages/multiprocess/pool.py", line 266, in map
        return self._map_async(func, iterable, mapstar, chunksize).get()
      File "/home/jb/.local/lib/python3.6/site-packages/multiprocess/pool.py", line 644, in get
        raise self._value
    AssertionError: daemonic processes are not allowed to have children

I now believe that the problem lies in how I integrated this into a class whose method calls another of its own methods...
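
One difference from the earlier ProcessPool().map(lambda x:x, p.map(...)) example is worth noting: there the inner map finishes in the main process before the outer map starts, so no pool is ever created inside a worker. In Cls.parallel, _sum opens a pool while it is itself running inside a worker, which is the genuinely nested case. If the nesting is not essential, one possible way around the assertion is to flatten the work into a single pool and regroup the results afterwards. This is a sketch using the standard library's multiprocessing as a stand-in; flat_parallel and VALS are hypothetical names, and the "fork" start method (POSIX only) is assumed.

```python
import multiprocessing

# Assumption: "fork" start method (POSIX only), so no __main__ guard is needed.
_fork = multiprocessing.get_context("fork")

VALS = range(5)  # mirrors Cls.vals from the example above


def flat_parallel(counts, processes=2):
    """Compute what Cls.parallel computes, using one pool instead of two levels."""
    counts = list(counts)
    # One task per inner item: nb copies of VALS for every nb in counts.
    tasks = [VALS for nb in counts for _ in range(nb)]
    with _fork.Pool(processes) as pool:
        sums = pool.map(sum, tasks)
    # Regroup the flat result list into one sub-list per outer item.
    grouped, start = [], 0
    for nb in counts:
        grouped.append(sums[start:start + nb])
        start += nb
    return grouped


result = flat_parallel(range(10))
print(result[:4])
```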

Borda commented 5 years ago

It runs fine locally, but running your examples on CI fails for both py2 and py3: https://circleci.com/gh/Borda/BIRL/1417 https://circleci.com/gh/Borda/BIRL/1418

mmckerns commented 5 years ago

@Borda: Not sure what's going on with your tests... but the failures in the two links above are potentially due to other issues...

1417: UNEXPECTED EXCEPTION: TypeError("cannot serialize '_io.FileIO' object",)
1418: UNEXPECTED EXCEPTION: TypeError("'NoneType' object is not callable",)
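
The first message is the kind of error a pickler raises when it is handed an open raw file, for example when a file handle is stored on an object that gets shipped to a worker. Whether that is what happened in build 1417 cannot be told from the thread; the sketch below only reproduces the class of error with the standard pickle module. pickling_error is a hypothetical helper, and the exact wording of the message differs between Python versions.

```python
import os
import pickle
import tempfile


def pickling_error(obj):
    """Return the TypeError message raised when pickling obj, or None."""
    try:
        pickle.dumps(obj)
    except TypeError as err:
        return str(err)
    return None


# Create a throwaway file so there is something to open.
fd, path = tempfile.mkstemp()
os.close(fd)

# buffering=0 in binary mode yields a raw io.FileIO object.
with open(path, "rb", buffering=0) as raw:
    message = pickling_error(raw)

os.remove(path)
print(message)
```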