python / cpython

Cannot use ProcessPoolExecutor if in a decorator? #85749

Open f76f6f4a-3fc7-40d7-9a30-4bd2b7cb0be2 opened 4 years ago

f76f6f4a-3fc7-40d7-9a30-4bd2b7cb0be2 commented 4 years ago
BPO 41577
Nosy @pitrou, @avassalotti, @aeros

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

```python
assignee = None
closed_at = None
created_at =
labels = ['3.7', '3.8', 'type-bug', 'library', '3.9']
title = 'Cannot use ProcessPoolExecutor if in a decorator?'
updated_at =
user = 'https://bugs.python.org/bobfanglondon'
```

bugs.python.org fields:

```python
activity =
actor = 'aeros'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'bob.fang.london'
dependencies = []
files = []
hgrepos = []
issue_num = 41577
keywords = []
message_count = 2.0
messages = ['375614', '375632']
nosy_count = 4.0
nosy_names = ['pitrou', 'alexandre.vassalotti', 'aeros', 'bob.fang.london']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue41577'
versions = ['Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9']
```

f76f6f4a-3fc7-40d7-9a30-4bd2b7cb0be2 commented 4 years ago

I have this minimal example:

from functools import wraps
from concurrent import futures
import random

def decorator(func):
    num_process = 4

    def impl(*args, **kwargs):
        with futures.ProcessPoolExecutor() as executor:
            fs = []
            for i in range(num_process):
                fut = executor.submit(func, *args, **kwargs)
                fs.append(fut)
            result = []
            for f in futures.as_completed(fs):
                result.append(f.result())
        return result
    return impl

@decorator
def get_random_int():
    return random.randint(0, 100)

if __name__ == "__main__":
    result = get_random_int()
    print(result)

Running this script raises the following error:

_pickle.PicklingError: Can't pickle <function get_random_int at 0x7f06cee666a8>: it's not the same object as __main__.get_random_int

I think the main issue here is that the "wraps" decorator itself alters the func object and thus makes it impossible to pickle. I find this rather strange. Is there any way to work around this behavior? I would like to use wraps if possible. Thanks!

aeros commented 4 years ago

Due to the way pickle works, it's not presently possible to serialize wrapped functions directly, at least not in a way that allows you to pass them to executor.submit() within the inner function (AFAICT). I'm also not certain what it would take to support that, or whether it could be done in a backwards-compatible manner.
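
To illustrate the pickle behavior: functions are pickled by reference (module plus qualified name), and after decoration that name refers to the wrapper rather than the original function. Below is a minimal sketch that reproduces the failure without ProcessPoolExecutor. The names `get_value` and `pickle_error` are made up for this example, and `functools.wraps` is used only because it exposes the original function as `__wrapped__`; the name is rebound by any decorator that returns a new object.

```python
import pickle
from functools import wraps


def decorator(func):
    @wraps(func)
    def impl(*args, **kwargs):
        return func(*args, **kwargs)
    return impl


@decorator
def get_value():
    return 42


def pickle_error(obj):
    """Return the pickling error message for obj, or None if it pickles.

    Hypothetical helper, only for this demonstration.
    """
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError) as exc:
        return str(exc)
    return None


# functools.wraps stores the undecorated function on the wrapper.
original = get_value.__wrapped__

# The module-level name "get_value" now points at the wrapper, so looking
# the original up by name finds a different object and pickling it fails.
print(original is get_value)  # False
print(pickle_error(original))
```

This is the same lookup that fails when the original function is handed to executor.submit() from inside the wrapper.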

In the meantime, this is a decent workaround:

from concurrent import futures
import random

class PPEWrapper:
    def __init__(self, func, num_proc=4):
        self.fn = func
        self.num_proc = num_proc

    def __call__(self, *args, **kwargs):
        with futures.ProcessPoolExecutor() as executor:
            fs = []
            for i in range(self.num_proc):
                fut = executor.submit(self.fn, *args, **kwargs)
                fs.append(fut)
            result = []
            for f in futures.as_completed(fs):
                result.append(f.result())
        return result

def _get_random_int():
    return random.randint(0, 100)

# it's often quite useful anyway to have a parallel and non-parallel version
# (for testing and platforms that don't support multiprocessing)
get_random_int = PPEWrapper(_get_random_int, num_proc=4)

if __name__ == "__main__":
    result = get_random_int()
    print(result)

This doesn't allow you to use the decorator syntax, but it largely provides the same functionality. That being said, I'm not deeply familiar with the pickle protocol, so the above is mostly based on my own recent investigation. There could very well be a way to accomplish what you're looking for that I was unable to find.
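
One possible way to get the decorator syntax back, offered as a sketch and not a vetted recipe: have the decorator rename the original function and register it in its own module globals under that new name, so pickle can still find it by reference, then return a callable wrapper object. The names `parallel`, `_ParallelCall`, and the `_serial_` prefix are assumptions made up for this example and are not part of any library.

```python
from concurrent import futures
import random


class _ParallelCall:
    """Callable that runs fn num_proc times in a process pool (hypothetical)."""

    def __init__(self, func, num_proc):
        self.fn = func
        self.num_proc = num_proc

    def __call__(self, *args, **kwargs):
        with futures.ProcessPoolExecutor(max_workers=self.num_proc) as executor:
            fs = [executor.submit(self.fn, *args, **kwargs)
                  for _ in range(self.num_proc)]
            return [f.result() for f in futures.as_completed(fs)]


def parallel(num_proc=4):
    """Decorator factory (hypothetical name) built on the trick described above."""
    def decorate(func):
        # Assumption: renaming the function and storing it in its module's
        # globals is acceptable here. pickle looks functions up by
        # module + qualified name, so the original must stay reachable
        # under the name it reports.
        hidden = "_serial_" + func.__name__
        func.__name__ = func.__qualname__ = hidden
        func.__globals__[hidden] = func
        return _ParallelCall(func, num_proc)
    return decorate


@parallel(num_proc=4)
def get_random_int():
    return random.randint(0, 100)
```

With this in place, calling `get_random_int()` under an `if __name__ == "__main__":` guard should return a list of `num_proc` integers, and the serial version stays available as `get_random_int.fn`. It relies on mutating the function's metadata and module globals, so treat it as a workaround sketch, not a recommended pattern.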