Ability to join() threads in concurrent.futures.ThreadPoolExecutor

fff18ee7-fc0b-4e9d-870b-c91cf7e646e8 commented 10 years ago

BPO	22361
Nosy	@brianquinlan, @MojoVampire

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

```python
assignee = None
closed_at = None
created_at =
labels = ['type-feature', 'library']
title = 'Ability to join() threads in concurrent.futures.ThreadPoolExecutor'
updated_at =
user = 'https://bugs.python.org/dktrkranz'
```

bugs.python.org fields:

```python
activity =
actor = 'bquinlan'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'dktrkranz'
dependencies = []
files = []
hgrepos = []
issue_num = 22361
keywords = []
message_count = 4.0
messages = ['226569', '226615', '226629', '341629']
nosy_count = 3.0
nosy_names = ['bquinlan', 'dktrkranz', 'josh.r']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue22361'
versions = ['Python 3.4']
```

fff18ee7-fc0b-4e9d-870b-c91cf7e646e8 commented 10 years ago

I have a program that waits for external events (mostly pyinotify events); when an event occurs, a new worker is created using concurrent.futures.ThreadPoolExecutor. The following snippet briefly shows what my program does:

from time import sleep
from concurrent.futures import ThreadPoolExecutor

def func():
    print("start")
    sleep(10)
    print("stop")

ex = ThreadPoolExecutor(1)

# New workers will be scheduled when an event
# is triggered (i.e. pyinotify events)
ex.submit(func)

# Dummy sleep
sleep(60)

When func() is complete, I'd like the underlying thread to be terminated. I realize I could call ex.shutdown() to achieve this, but that would prevent me from adding new workers when new events occur. Not calling ex.shutdown() leaves idle threads behind, which pile up considerably:

(gdb) run test.py
Starting program: /usr/bin/python3.4-dbg test.py
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff688e700 (LWP 17502)]
start
stop
^C
Program received signal SIGINT, Interrupt.
0x00007ffff6e41963 in select () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) info threads
Id Target Id Frame
2 Thread 0x7ffff688e700 (LWP 17502) "python3.4-dbg" 0x00007ffff7bce420 in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
1 Thread 0x7ffff7ff1700 (LWP 17501) "python3.4-dbg" 0x00007ffff6e41963 in select () from /lib/x86_64-linux-gnu/libc.so.6
(gdb)
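The idle worker visible in the gdb output above can also be observed from Python itself. The following is a minimal sketch, not part of the original report: the helper name current_worker is made up for illustration. It shows that the worker thread is still alive after its task has returned, and that it is only joined once shutdown() is called.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def current_worker():
    # Hypothetical helper: return the Thread object running this task.
    return threading.current_thread()

ex = ThreadPoolExecutor(max_workers=1)

# result() returns only once the task has finished.
worker = ex.submit(current_worker).result()

# The worker is still alive: it is blocked waiting for the next work item.
alive_after_task = worker.is_alive()

# shutdown(wait=True) tells the workers to exit and joins them.
ex.shutdown(wait=True)
alive_after_shutdown = worker.is_alive()

print(alive_after_task, alive_after_shutdown)  # True False
```

After shutdown() the executor rejects further submit() calls, which is exactly the limitation described in the report.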

Would it be possible to add a new method (or a ThreadPoolExecutor option) that joins the underlying thread when the worker function returns?

99ffcaa5-b43b-4e8e-a35e-9c890007b9cd commented 10 years ago

Can you explain what benefit this would provide? Forcing the thread to exit gets you relatively little benefit. If it's an infrequently used executor, I suppose you avoid the cost of leaving worker threads blocked waiting for work, but that cost is tiny, and you pay for it with increased overhead to dispatch new tasks since they have to create new threads instead of using existing worker threads.
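The trade-off described here, an idle worker kept around so that later tasks can reuse it, can be illustrated with a small sketch. It is only an illustration under stated assumptions: worker_name is a made-up helper, and the example measures no actual overhead.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def worker_name():
    # Hypothetical helper: report which thread ran this task.
    return threading.current_thread().name

# With max_workers=1 the pool creates at most one worker thread,
# so every task submitted one after another runs on that same thread.
with ThreadPoolExecutor(max_workers=1) as ex:
    names = [ex.submit(worker_name).result() for _ in range(5)]

distinct_workers = len(set(names))
print(distinct_workers)  # 1
```

If the worker exited after every task, each of the five submissions would have to start a fresh thread instead.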

fff18ee7-fc0b-4e9d-870b-c91cf7e646e8 commented 10 years ago

There is indeed little benefit in freeing the resources held by an unused thread, but closing it could be worthwhile for specific needs (e.g. when the thread processes sensitive information) or on embedded systems with very limited resources.

brianquinlan commented 5 years ago

So do you actually use the result of ex.submit, i.e. use the resulting future?

If you don't then it might be easier to just create your own thread.
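A minimal sketch of that suggestion, assuming the future is never needed: each event gets its own threading.Thread, which exits as soon as the function returns and can be joined explicitly. The names handle_event and results are hypothetical, and func is shortened so the example runs instantly.

```python
import threading

def func(results):
    # Stand-in for the real work done per event.
    results.append("done")

def handle_event(results):
    # Hypothetical event handler: start one dedicated thread per event.
    t = threading.Thread(target=func, args=(results,))
    t.start()
    return t

results = []
t = handle_event(results)

# join() returns once func() has returned and the thread has exited.
t.join()
alive_after_join = t.is_alive()

print(results, alive_after_join)  # ['done'] False
```

Unlike a pool, this places no bound on the number of concurrent threads, so a burst of events starts a thread for each one.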

Ability to join() threads in concurrent.futures.ThreadPoolExecutor #66557