uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

pathos.pools.ProcessPool deadlock/hang on exceptions #266

Open oliver-s-lee opened 1 year ago

oliver-s-lee commented 1 year ago

Hi there, thanks for the nice code!

I'm encountering a problem where my scripts hang indefinitely on exit if an exception is raised after a ProcessPool has been created (even if the exception is handled in the body of the program, and even if the pool is not in use at the time of the exception). The script can only be stopped with Ctrl+C, which gives this monster stack trace:

Stack Trace ``` ... <-- script does work here ^C <-- deliberately interrupted here Process ForkPoolWorker-10: script exiting <-- exit message from the script Process ForkPoolWorker-7: Process ForkPoolWorker-3: Process ForkPoolWorker-13: Process ForkPoolWorker-4: Process ForkPoolWorker-8: Process ForkPoolWorker-12: Process ForkPoolWorker-5: Process ForkPoolWorker-9: Process ForkPoolWorker-1: Process ForkPoolWorker-2: Traceback (most recent call last): Traceback (most recent call last): Traceback (most recent call last): Process ForkPoolWorker-6: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) Process ForkPoolWorker-11: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() Traceback (most recent call last): Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 368, in get res = self._reader.recv_bytes() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/connection.py", line 224, in recv_bytes buf = self._recv_bytes(maxlength) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, 
in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/connection.py", line 422, in _recv_bytes buf = self._recv(4) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/connection.py", line 387, in _recv chunk = read(handle, remaining) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, 
in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 368, in get res = self._reader.recv_bytes() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/connection.py", line 224, in recv_bytes buf = self._recv_bytes(maxlength) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/connection.py", line 422, in _recv_bytes buf = self._recv(4) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/connection.py", line 387, in _recv chunk = read(handle, remaining) KeyboardInterrupt Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File 
"/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt Traceback (most recent call last): Traceback (most recent call last): Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File 
"/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt <-- Hang indefinitely ^C <-- Get tired of waiting and raise crtl-c Process ForkPoolWorker-14: Exception ignored 
in atexit callback: Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/util.py", line 334, in _exit_function _run_finalizers(0) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/util.py", line 300, in _run_finalizers finalizer() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/util.py", line 224, in __call__ res = self._callback(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 695, in _terminate_pool cls._help_stuff_finish(inqueue, task_handler, len(pool)) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 675, in _help_stuff_finish inqueue._rlock.acquire() KeyboardInterrupt: Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 368, in get res = self._reader.recv_bytes() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/connection.py", line 224, in recv_bytes buf = self._recv_bytes(maxlength) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/connection.py", line 422, in _recv_bytes buf = self._recv(4) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/connection.py", line 387, in _recv chunk = read(handle, remaining) KeyboardInterrupt ```
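The last traceback above shows the interpreter blocked inside the `atexit` finalizer (`_terminate_pool`), so one thing that may be worth trying is shutting the pool down explicitly before the script exits instead of leaving it to the finalizer. The sketch below is an assumption, not a confirmed fix for this issue: it uses the standard library's `multiprocessing` with the `fork` start method (POSIX only) as a stand-in for `multiprocess`, and the helper names `run_with_cleanup`, `_square` and `_ignore_sigint` are hypothetical. pathos's `ProcessPool` also exposes `terminate()` and `join()`, but whether calling them avoids this particular hang is untested here.

```python
import multiprocessing as mp  # stand-in for `multiprocess`, the fork pathos builds on
import signal


def _ignore_sigint():
    # Runs once in each worker: leave Ctrl+C handling to the parent process.
    signal.signal(signal.SIGINT, signal.SIG_IGN)


def _square(x):
    return x * x


def run_with_cleanup(nodes=4):
    # "fork" matches the ForkPoolWorker processes seen in the trace.
    pool = mp.get_context("fork").Pool(nodes, initializer=_ignore_sigint)
    try:
        results = pool.map(_square, range(5))
        try:
            # An error unrelated to the pool, handled in the program body.
            raise ValueError("handled error")
        except ValueError:
            pass
        return results
    finally:
        # Shut the workers down explicitly rather than relying on the
        # atexit finalizer to do it at interpreter exit.
        pool.terminate()
        pool.join()


if __name__ == "__main__":
    print(run_with_cleanup())
```

If the hang still occurs with explicit shutdown, that would suggest the problem lies in the pool's own teardown rather than in the exit-time finalizer.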

The problem is challenging to recreate exactly, but it can be at least partially recreated by doing the following at the console:

Stack Trace ``` Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import pathos.pools >>> pool = pathos.pools.ProcessPool(20) >>> Process ForkPoolWorker-17: <-- Raise ctrl-c here Process ForkPoolWorker-13: Process ForkPoolWorker-20: Process ForkPoolWorker-18: Process ForkPoolWorker-16: Process ForkPoolWorker-19: Process ForkPoolWorker-9: Process ForkPoolWorker-7: Process ForkPoolWorker-11: Process ForkPoolWorker-5: Process ForkPoolWorker-15: KeyboardInterrupt >>> Process ForkPoolWorker-10: Traceback (most recent call last): Traceback (most recent call last): Process ForkPoolWorker-6: Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: Process ForkPoolWorker-14: Process ForkPoolWorker-4: Process ForkPoolWorker-3: Traceback (most recent call last): Process ForkPoolWorker-8: Process ForkPoolWorker-2: Process ForkPoolWorker-1: Process ForkPoolWorker-12: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in 
__enter__ return self._semlock.__enter__() Traceback (most recent call last): KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt Traceback (most recent call last): Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() Traceback (most recent call last): KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap 
self.run() Traceback (most recent call last): Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() Traceback (most recent call last): Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File 
"/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: KeyboardInterrupt KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get 
with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt KeyboardInterrupt KeyboardInterrupt Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 368, in get res = self._reader.recv_bytes() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/connection.py", line 224, in recv_bytes buf = self._recv_bytes(maxlength) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/connection.py", line 422, in _recv_bytes buf = self._recv(4) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/connection.py", line 387, in _recv chunk = read(handle, remaining) KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File 
"/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt Traceback (most recent call last): Traceback (most recent call last): Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File 
"/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get 
with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt KeyboardInterrupt KeyboardInterrupt >>> <-- Hangs here <-- Ctrl-d (EOF) to to exit, hangs again <-- Ctrl-c again to exit ^CProcess ForkPoolWorker-36: Process ForkPoolWorker-35: Process ForkPoolWorker-32: Process ForkPoolWorker-33: Process ForkPoolWorker-31: Process ForkPoolWorker-34: Process ForkPoolWorker-39: Process ForkPoolWorker-37: Process ForkPoolWorker-29: Process ForkPoolWorker-27: Process ForkPoolWorker-38: Process ForkPoolWorker-26: Process ForkPoolWorker-30: Traceback (most recent call last): Traceback (most recent call last): Traceback (most recent call last): Exception ignored in atexit callback: Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/util.py", line 334, in _exit_function File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File 
"/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: Process ForkPoolWorker-23: Process ForkPoolWorker-25: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() Process ForkPoolWorker-24: KeyboardInterrupt Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) Process ForkPoolWorker-40: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt Process ForkPoolWorker-28: File 
"/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() Process ForkPoolWorker-21: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt Traceback (most recent call last): Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt 
KeyboardInterrupt Traceback (most recent call last): Traceback (most recent call last): Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt KeyboardInterrupt File 
"/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() Traceback (most recent call last): Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() Traceback (most recent call last): 
KeyboardInterrupt Traceback (most recent call last): KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: _run_finalizers(0) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/util.py", line 300, in _run_finalizers Process ForkPoolWorker-22: KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt KeyboardInterrupt finalizer() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/util.py", line 224, in __call__ res = self._callback(*self._args, **self._kwargs) File 
"/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 695, in _terminate_pool Traceback (most recent call last): cls._help_stuff_finish(inqueue, task_handler, len(pool)) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 675, in _help_stuff_finish File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() Traceback (most recent call last): KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt inqueue._rlock.acquire() KeyboardInterrupt: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() KeyboardInterrupt File 
"/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt Traceback (most recent call last): File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 314, in _bootstrap self.run() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/pool.py", line 114, in worker task = get() File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/queues.py", line 367, in get with self._rlock: File "/home/oliver/.local/lib/python3.10/site-packages/multiprocess/synchronize.py", line 101, in __enter__ return self._semlock.__enter__() KeyboardInterrupt ```

In this latter example you'll notice that the pool has not even been given any work to do yet. My machine does not have 20 CPUs available, but the problem appears to be at least partially race-dependent, so the more processes that are started, the more likely the bug is to arise.

The problem can at least partially be avoided by calling terminate() on the pool before close:

Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pathos.pools
>>> pool = pathos.pools.ProcessPool(20)
>>> pool.terminate()
>>> 
KeyboardInterrupt
>>> 
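
The terminate() workaround above can be made to run even when an exception escapes, for example with a small wrapper. This is a minimal sketch, not part of pathos: `terminating` and `run_with_cleanup` are hypothetical helper names, and the only assumption about the pool is that it has terminate() and join() methods, which pathos.pools.ProcessPool provides. Whether this avoids the hang in every case is not established here, since the workaround is only reported to help at least partially.

```python
from contextlib import contextmanager


@contextmanager
def terminating(pool):
    """Yield `pool`, then always terminate and join it.

    Hypothetical helper, not part of pathos. It only assumes the pool
    exposes terminate() and join().
    """
    try:
        yield pool
    finally:
        pool.terminate()  # stop the workers instead of waiting on them
        pool.join()       # wait for the worker processes to exit


def run_with_cleanup(make_pool, work):
    """Create a pool, run work(pool), and clean up even if work raises."""
    with terminating(make_pool()) as pool:
        return work(pool)
```

With pathos installed, a call might look like `run_with_cleanup(lambda: pathos.pools.ProcessPool(20), lambda pool: pool.map(abs, [-1, -2]))`.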

But using a context manager has no effect:

Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pathos.pools
>>> with pathos.pools.ProcessPool(20) as pool:
...     pass
... 
>>> Process ForkPoolWorker-13:
(stack trace continues as above)

shakewingo commented 1 year ago

Thank god. After wasting tons of time here, this thread helped me pinpoint the issue. I've found it's really hard to follow the traceback once multiprocessing libraries are involved in Python. Good luck to all users.

mmckerns commented 1 year ago

I'm not experiencing the same thing, but I wonder if this has anything to do with a version mismatch for multiprocess and Python. multiprocessing has gone through a bunch of recent development on race and lock conditions, and I believe it may have made some of the minor Python releases somewhat incompatible with different versions of multiprocess. What I mean is when Python releases a new minor version, any changes to multiprocessing only need to work for the minor version of Python that it's released with. multiprocess, on the other hand, provides a fork of the latest multiprocessing... but one could very easily use it with an older minor version of Python.

In the past, I tested older minor versions of Python with multiprocess, and never found a version mismatch issue -- for years. I'm guessing that the recent semlock development in multiprocessing has made it so that I should really have variant paths in the multiprocess code for the different minor versions of Python. Until that happens, the workaround would be to upgrade your Python minor version to the version that was used in the multiprocess release. Saying that definitely tells me that if the workaround works, then code to distinguish between different minor versions is needed. I believe the latest release of multiprocess (from ~6 mo ago) assumes you are using Python 3.10.8. The pending release (days away) assumes you are using Python 3.10.11.

Let me know if the above applies.

oliver-s-lee commented 1 year ago

Hi @mmckerns, thanks for the reply. I'm afraid I don't completely follow, but if it helps, here are the versions I am using:

>>> multiprocess.__version__
'0.70.14'
>>> pathos.__version__
'0.3.0'
>>> sys.version
'3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0]'

I also tried a couple of fresh conda envs with both 3.10.8 and 3.10.11; sadly, both experience the same problem.

mmckerns commented 1 year ago

@oliver-s-lee: I'm not experiencing this. Are you saying that you experience the same issue with Python 3.10.8 and 3.10.11? I don't. I'd like to be able to reproduce your environment and the error you are experiencing. If I do a fresh check-out of the versions you specify on a Mac VM, I don't see anything like what you are reporting.

However, it's best if you update your Python to at least 3.10.8, which is the Python version that the latest release of multiprocess was built with.

oliver-s-lee commented 1 year ago

Hi Mike @mmckerns, yes, I experience the same problem with Python versions 3.10.8, 3.10.11 and 3.10.6.

I am on a Linux machine (Linux Mint 21.1 Cinnamon), however. I wonder if that is where the difference is coming from? If you need any other system information, I can happily supply it.
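
One way to gather the system information offered above in a single step is a short report script. This is a sketch under stated assumptions: `package_version` and `environment_report` are hypothetical names, and the default package list (pathos plus multiprocess, dill and ppft) is a guess at what is relevant. The start method is included because, on Python 3.10, the standard library's multiprocessing defaults to fork on Linux and spawn on macOS; the value reported here comes from the standard-library module, not from multiprocess, and whether that difference explains the hang is not established in this thread.

```python
import multiprocessing
import platform
import sys
from importlib import metadata


def package_version(name):
    """Return the installed version of `name`, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("pathos", "multiprocess", "dill", "ppft")):
    """Collect interpreter, OS, and package details for a bug report."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        # Default start method of the standard-library multiprocessing
        # module (an assumption: multiprocess may or may not match it).
        "start_method": multiprocessing.get_start_method(),
    }
    for name in packages:
        report[name] = package_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```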