uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

Stalling Pool.map #181

gideonsimpson closed this issue 4 years ago

gideonsimpson commented 4 years ago

I'm trying to wrap pathos multiprocessing around an openmm computation, and I'm running into the following issue: when I call Pool.map, the call stalls and never seems to complete. I have narrowed it down to a line of code that constructs a PySWIG object. When I hit Ctrl-C to interrupt it, I get the following traceback, in case it is useful:

KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-10-fcef68c9a137> in <module>
----> 1 pool.map(mutate,itertools.repeat(x0,4), itertools.count(100), itertools.repeat(10,4))

/anaconda3/lib/python3.7/site-packages/pathos/multiprocessing.py in map(self, f, *args, **kwds)
    135         AbstractWorkerPool._AbstractWorkerPool__map(self, f, *args, **kwds)
    136         _pool = self._serve()
--> 137         return _pool.map(star(f), zip(*args)) # chunksize
    138     map.__doc__ = AbstractWorkerPool.map.__doc__
    139     def imap(self, f, *args, **kwds):

/anaconda3/lib/python3.7/site-packages/multiprocess/pool.py in map(self, func, iterable, chunksize)
    266         in a list that is returned.
    267         '''
--> 268         return self._map_async(func, iterable, mapstar, chunksize).get()
    269 
    270     def starmap(self, func, iterable, chunksize=None):

/anaconda3/lib/python3.7/site-packages/multiprocess/pool.py in get(self, timeout)
    649 
    650     def get(self, timeout=None):
--> 651         self.wait(timeout)
    652         if not self.ready():
    653             raise TimeoutError

/anaconda3/lib/python3.7/site-packages/multiprocess/pool.py in wait(self, timeout)
    646 
    647     def wait(self, timeout=None):
--> 648         self._event.wait(timeout)
    649 
    650     def get(self, timeout=None):

/anaconda3/lib/python3.7/threading.py in wait(self, timeout)
    550             signaled = self._flag
    551             if not signaled:
--> 552                 signaled = self._cond.wait(timeout)
    553             return signaled
    554 

/anaconda3/lib/python3.7/threading.py in wait(self, timeout)
    294         try:    # restore state no matter what (e.g., KeyboardInterrupt)
    295             if timeout is None:
--> 296                 waiter.acquire()
    297                 gotit = True
    298             else:

KeyboardInterrupt:

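One thing the traceback lets us rule out: the `itertools.count(100)` argument is infinite, but the pathos frame above passes `zip(*args)` to the underlying pool, and `zip` stops at the shortest iterable. A minimal sketch of that step, using only the standard library (the string `"x0"` is a placeholder for the real object, and `build_tasks` is a hypothetical helper, not part of pathos):

```python
import itertools


def build_tasks(*iterables):
    """Mimic the zip(*args) step from the traceback: one argument tuple per task."""
    return list(zip(*iterables))


tasks = build_tasks(
    itertools.repeat("x0", 4),  # placeholder standing in for the real x0
    itertools.count(100),       # infinite, but zip stops at the shortest input
    itertools.repeat(10, 4),
)

for task in tasks:
    print(task)
```

So, at least for the pathos version shown in the traceback, the call expands to four finite tasks, and the infinite counter by itself should not be what makes `map` hang.
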
gideonsimpson commented 4 years ago

This seems to have been an issue with openmm and not pathos.
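For anyone hitting a similar stall: the traceback shows `get(timeout=None)`, which waits indefinitely. A rough sketch of bounding that wait is below. It uses the standard library's `ThreadPool` as a stand-in so that it runs anywhere; `slow_task` and `map_with_timeout` are hypothetical names for illustration. I am assuming that the result object returned by pathos's asynchronous `amap` accepts the same `get(timeout=...)` call, which I have not verified against every version.

```python
import multiprocessing
import time
from multiprocessing.pool import ThreadPool


def slow_task(seconds):
    """Stand-in for the real work: sleep, then return the input."""
    time.sleep(seconds)
    return seconds


def map_with_timeout(pool, func, items, timeout):
    """Run func over items; return the result list, or None if the wait times out."""
    async_result = pool.map_async(func, items)
    try:
        return async_result.get(timeout=timeout)
    except multiprocessing.TimeoutError:
        # The workers may still be busy; we only stop waiting for them.
        return None


with ThreadPool(2) as pool:
    fast = map_with_timeout(pool, slow_task, [0.0, 0.01], timeout=5.0)
    stalled = map_with_timeout(pool, slow_task, [0.5], timeout=0.05)

print("fast:", fast)
print("stalled:", stalled)
```

A timeout does not fix whatever is blocking inside the worker, but it turns a silent hang into something you can detect, which makes it easier to tell whether the pool or the wrapped library is at fault.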