python / cpython

The Python programming language
https://www.python.org

Multiprocessing imap hangs when generator input errors #70521

Closed cde5bb35-5b36-426c-9d47-400d85f9262f closed 8 years ago

cde5bb35-5b36-426c-9d47-400d85f9262f commented 8 years ago
BPO 26333
Nosy @terryjreedy, @applio

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

```python
assignee = None
closed_at =
created_at =
labels = ['invalid', 'type-bug', 'library']
title = 'Multiprocessing imap hangs when generator input errors'
updated_at =
user = 'https://bugs.python.org/AaronHalfaker'
```

bugs.python.org fields:

```python
activity =
actor = 'terry.reedy'
assignee = 'none'
closed = True
closed_date =
closer = 'terry.reedy'
components = ['Library (Lib)']
creation =
creator = 'Aaron Halfaker'
dependencies = []
files = []
hgrepos = []
issue_num = 26333
keywords = []
message_count = 2.0
messages = ['260032', '260210']
nosy_count = 5.0
nosy_names = ['terry.reedy', 'jnoller', 'sbt', 'davin', 'Aaron Halfaker']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue26333'
versions = ['Python 2.7', 'Python 3.5', 'Python 3.6']
```

cde5bb35-5b36-426c-9d47-400d85f9262f commented 8 years ago

multiprocessing.Pool.imap hangs instead of raising an error if an error occurs in the generator that is being mapped over. I'd expect the error to be raised in the calling code and/or the process to fail.

For example, run the following code in Python 2.7 or 3.4:

    from multiprocessing import Pool

    def add_one(v):
        return v+1

    pool = Pool(processes=2)

    values = ["1", "2", "3", "4", "foo", "5", "6", "7", "8"]
    value_iter = (int(v) for v in values)

    for new_val in pool.imap(add_one, value_iter):
        print(new_val)

The output should look something like this:

    $ python demo_hanging.py 
    2
    3
    4
    5
    Exception in thread Thread-2:
    Traceback (most recent call last):
      File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
        self.run()
      File "/usr/lib/python3.4/threading.py", line 868, in run
        self._target(*self._args, **self._kwargs)
      File "/usr/lib/python3.4/multiprocessing/pool.py", line 378, in _handle_tasks
        for i, task in enumerate(taskseq):
      File "/usr/lib/python3.4/multiprocessing/pool.py", line 286, in <genexpr>
        self._taskqueue.put((((result._job, i, func, (x,), {})
      File "demo_hanging.py", line 9, in <genexpr>
        value_iter = (int(v) for v in values)
    ValueError: invalid literal for int() with base 10: 'foo'

The script will then hang indefinitely.
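
The traceback shows the ValueError surfacing inside the pool's `_handle_tasks` thread rather than in the main script. That is consistent with the generator expression being lazy: `int(v)` only runs when something iterates over `value_iter`, and here the consumer is the pool's internal thread. A minimal sketch of that laziness, with no pool involved (the `converted` and `error` names are only for illustration):

```python
values = ["1", "2", "3", "4", "foo", "5", "6", "7", "8"]

# Building the generator expression converts nothing yet, so no error here.
value_iter = (int(v) for v in values)

converted = []
error = None
try:
    # The ValueError is raised in whichever code iterates the generator.
    for v in value_iter:
        converted.append(v)
except ValueError as exc:
    error = exc

print(converted)
print(error)
```

When the consumer is the main thread, as in this sketch, the error can be caught with an ordinary try/except. In the report above the consumer is a background thread, so the main loop never sees it.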

terryjreedy commented 8 years ago

If you add the "if __name__ == '__main__':" guard after defining the target function, as specified in the multiprocessing doc, you will get a traceback much as you expect:

    Traceback (most recent call last):
      File "F:\Python\mypy\tem.py", line 12, in <module>
        for new_val in pool.imap(add_one, value_iter):
      File "C:\Programs\Python35\lib\multiprocessing\pool.py", line 695, in next
        raise value
      File "C:\Programs\Python35\lib\multiprocessing\pool.py", line 380, in _handle_tasks
        for i, task in enumerate(taskseq):
      File "C:\Programs\Python35\lib\multiprocessing\pool.py", line 286, in <genexpr>
        self._taskqueue.put((((result._job, i, func, (x,), {})
      File "F:\Python\mypy\tem.py", line 10, in <genexpr>
        value_iter = (int(v) for v in values)
    ValueError: invalid literal for int() with base 10: 'foo'

I have seen this bug of omission multiple times on Stack Overflow.
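
For reference, here is a sketch of the reporter's script with the guard in place. It assumes a Python version in which `Pool.imap` passes the generator's exception on to the consuming loop, as the 3.5 traceback above suggests. On 2.7 or 3.4 it may still hang as originally reported. The `run_demo` helper and its return shape are hypothetical, added only so the error can be caught and inspected instead of ending the script.

```python
from multiprocessing import Pool


def add_one(v):
    return v + 1


def run_demo(raw_values):
    """Map add_one over a lazily converted input; return (results, error).

    Hypothetical helper, not part of the original report.
    """
    results = []
    error = None
    value_iter = (int(v) for v in raw_values)
    with Pool(processes=2) as pool:
        try:
            for new_val in pool.imap(add_one, value_iter):
                results.append(new_val)
        except ValueError as exc:
            # Assumption: this Python version re-raises the generator's
            # error in the loop that consumes imap's results.
            error = exc
    return results, error


# The guard keeps child processes that re-import this module
# from creating pools of their own.
if __name__ == "__main__":
    results, error = run_demo(["1", "2", "3", "4", "foo", "5", "6", "7", "8"])
    print(results)
    print(repr(error))
```

Everything that creates the pool lives under the guard or in a function called from it. Only the definitions of `add_one` and `run_demo` run at import time.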