python / cpython

Deadlock when shutting down ThreadPoolExecutor from inside OS Signal handler #121649

Open bostrt opened 2 months ago

bostrt commented 2 months ago

Bug report

Bug description:

Running the code below and sending SIGTERM to the process results in what looks like a deadlock on ThreadPoolExecutor._shutdown_lock. I guess the issue is that the SIGTERM handler runs in the main thread while that thread is inside submit(), where _shutdown_lock is already held. When the handler then calls shutdown(), it tries to take the same lock again and waits forever.

TBH I'm not sure if this is a bug or expected behavior, but I wanted to file this anyway. Thanks!

from concurrent.futures import ThreadPoolExecutor
import signal

def main():
    with ThreadPoolExecutor(max_workers=2) as executor:

        def __exit(signum, frame):
            # Blocks forever if the signal arrives while submit() holds _shutdown_lock.
            executor.shutdown(wait=True, cancel_futures=True)

        signal.signal(signal.SIGTERM, __exit)

        while True:
            executor.submit(lambda: 1 + 1)

if __name__ == "__main__":
    main()

After sending SIGTERM to the process, this is what pystack shows:

(v) rbost@fedora:~/code/CheckDuplicate$ pystack remote 3431749
Traceback for thread 3431751 (python) [] (most recent call last):
    (Python) File "/usr/lib64/python3.12/threading.py", line 1030, in _bootstrap
        self._bootstrap_inner()
    (Python) File "/usr/lib64/python3.12/threading.py", line 1073, in _bootstrap_inner
        self.run()
    (Python) File "/usr/lib64/python3.12/threading.py", line 1010, in run
        self._target(*self._args, **self._kwargs)
    (Python) File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 89, in _worker
        work_item = work_queue.get(block=True)

Traceback for thread 3431750 (python) [] (most recent call last):
    (Python) File "/usr/lib64/python3.12/threading.py", line 1030, in _bootstrap
        self._bootstrap_inner()
    (Python) File "/usr/lib64/python3.12/threading.py", line 1073, in _bootstrap_inner
        self.run()
    (Python) File "/usr/lib64/python3.12/threading.py", line 1010, in run
        self._target(*self._args, **self._kwargs)
    (Python) File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 89, in _worker
        work_item = work_queue.get(block=True)

Traceback for thread 3431749 (python) [] (most recent call last):
    (Python) File "/home/rbost/code/CheckDuplicate/test.py", line 18, in <module>
        main()
    (Python) File "/home/rbost/code/CheckDuplicate/test.py", line 14, in main
        executor.submit(lambda: 1 + 1)
    (Python) File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 175, in submit
        f = _base.Future()
    (Python) File "/usr/lib64/python3.12/concurrent/futures/_base.py", line 328, in __init__
        def __init__(self):
    (Python) File "/home/rbost/code/CheckDuplicate/test.py", line 9, in __exit
        executor.shutdown(wait=True, cancel_futures=True)
    (Python) File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 220, in shutdown
        with self._shutdown_lock:
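
The main-thread traceback above has submit() and shutdown() on the same stack, which is consistent with one thread trying to take a non-reentrant lock twice. Here is a minimal sketch of that behavior in isolation, assuming _shutdown_lock is a plain threading.Lock as the traceback suggests. The helper name reacquire_in_same_thread and the timeout value are made up for illustration, and the timeout is only there so the demo returns instead of hanging:

```python
import threading

def reacquire_in_same_thread(lock, timeout=0.1):
    """Hold `lock`, then try to take it again from the same thread."""
    with lock:
        # With a plain Lock this second acquire can never succeed, because
        # the only thread that could release the lock is the one waiting.
        got_it = lock.acquire(timeout=timeout)
        if got_it:
            lock.release()
        return got_it

print(reacquire_in_same_thread(threading.Lock()))   # False
print(reacquire_in_same_thread(threading.RLock()))  # True
```

Without the timeout, the Lock case would block indefinitely, which looks like what the signal handler runs into here.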

CPython versions tested on:

3.12

Operating systems tested on:

Linux

bostrt commented 2 months ago

For what it's worth, I work around this by using a threading.Event to track whether the signal has been received:

from concurrent.futures import ThreadPoolExecutor
import signal
import threading

def main():
    with ThreadPoolExecutor(max_workers=2) as executor:

        exit_signal = threading.Event()

        def __exit(signum, frame):
            # Only record the request; the main loop performs the shutdown.
            exit_signal.set()

        signal.signal(signal.SIGTERM, __exit)

        while True:
            if exit_signal.is_set():
                executor.shutdown(wait=True, cancel_futures=True)
                break
            executor.submit(lambda: 1 + 1)

if __name__ == "__main__":
    main()
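
A self-contained variant of the same workaround that can be run without sending a signal from another terminal might look like the sketch below. The names run_until_sigterm, on_sigterm, and max_iterations are hypothetical, and signal.raise_signal is used only so the demo can deliver SIGTERM to itself. It assumes it is called from the main thread, since signal.signal requires that.

```python
import signal
import threading
from concurrent.futures import ThreadPoolExecutor

def run_until_sigterm(max_iterations=10_000):
    """Submit work until SIGTERM arrives, then shut down outside the handler."""
    stop = threading.Event()

    def on_sigterm(signum, frame):
        # Do not touch the executor here; just record that we should stop.
        stop.set()

    previous = signal.signal(signal.SIGTERM, on_sigterm)
    submitted = 0
    try:
        with ThreadPoolExecutor(max_workers=2) as executor:
            for i in range(max_iterations):
                if stop.is_set():
                    break
                executor.submit(lambda: 1 + 1)
                submitted += 1
                if i == 100:
                    # Demo only: deliver SIGTERM to this very process.
                    signal.raise_signal(signal.SIGTERM)
            # Safe here: the main thread is not inside submit() at this point.
            executor.shutdown(wait=True, cancel_futures=True)
    finally:
        # Restore whatever handler was installed before the demo.
        if previous is not None:
            signal.signal(signal.SIGTERM, previous)
    return stop.is_set(), submitted

if __name__ == "__main__":
    print(run_until_sigterm())
```

The main loop only calls stop.is_set(), which does not take the Event's internal lock, so the handler's stop.set() should not be able to block against the interrupted main thread the way shutdown() does.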