Open joshdk opened 7 years ago
Thanks for your gist, it's a nice collections of tests. If you think it's a pickling issue, then the thing to do is to check if dill
can serialize the objects -- see dill.check
and dill.copy
.
In python2.7, I see this:
>>> import dill
>>> dill.dill._trace(True) # turn on pickle tracing
>>>
>>> class CustomError3(Exception):
... def __init__(self):
... msg = 'oops'
... Exception.__init__(self, msg)
...
>>> exception = CustomError3()
>>> dill.check(exception)
T2: <class '__main__.CustomError3'>
F2: <function _create_type at 0x10ec206e0>
# F2
T1: <type 'type'>
F2: <function _load_type at 0x10ec20668>
# F2
# T1
T4: <type 'exceptions.Exception'>
# T4
D2: <dict object at 0x10ed01398>
F1: <function __init__ at 0x10ed46aa0>
F2: <function _create_function at 0x10ec20758>
# F2
Co: <code object __init__ at 0x10e835db0, file "<stdin>", line 2>
T1: <type 'code'>
# T1
# Co
D1: <dict object at 0x10e75a168>
# D1
D2: <dict object at 0x10ed03168>
# D2
# F1
# D2
# T2
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/mmckerns/lib/python2.7/site-packages/dill-0.2.7.dev0-py2.7.egg/dill/dill.py", line 278, in loads
return load(file)
File "/Users/mmckerns/lib/python2.7/site-packages/dill-0.2.7.dev0-py2.7.egg/dill/dill.py", line 267, in load
obj = pik.load()
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 864, in load
dispatch[key](self)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1139, in load_reduce
value = func(*args)
TypeError: __init__() takes exactly 1 argument (2 given)
...which is exactly what is seen in the traceback from the Pool.map
before it hangs.
From the pickle trace above, you can see that it appears to complete the dump
(starts at T2
and ends at #T2
), but fails on load
due to the bad number of arguments in calling the parent.
I don't think it has anything to do with the class being a custom exception... and more that it's throwing an error due to the bad number of arguments when trying to create an instance.
Should throwing an exception cause a Pool.map
to hang? Probably not. So there seems that there is some mishandling in there somewhere. I'll take some investigation. I'd imagine this has been seen in multiprocessing
before as well.
Thank you for looking into this! I completely agree with what you observed.
I'd imagine this has been seen in
multiprocessing
before as well.
I had to modify the test slightly, but I can confirm that this behavior is identical in multiprocessing
(both the TypeError
and the Python-2/3-specific hanging behavior).
from nose.plugins.skip import SkipTest
from nose.tools import assert_raises
from multiprocessing import Pool
class CustomError3(Exception):
def __init__(self):
msg = 'Something went wrong.'
Exception.__init__(self, msg)
def callback(_):
return CustomError3()
def test_5():
"""Test returning an extension that passes an additional argument to standard Exception."""
error = CustomError3()
assert isinstance(error, Exception)
results = pool.map(callback, ['A', 'B', 'C'])
assert len(results) == 3
pool = Pool(1)
I suspect that the behavior in multiprocess
may be inherited from the original multiprocessing
fork, but I would still be very interested in seeing the results of your investigation.
Observed this behavior when
Pool.map
ing a callback that either returned a value or a custom exception type. Returning certain types of exceptions will cause thePool.map
call to never return (hang), and in the case of Python 2.7, you are unable toSIGINT
the process (possibly because something handlesKeyboardInterrupt
).I believe that the mishandling of these custom exceptions is caused by
pickle
internally, but there is still the issue thatPool.map
does not return when this occurs.I have written some Nose tests in a Gist to reproduce this issue, and to determine exactly which cases trigger this issue. The tests that are decorated with
@skip
will cause the tests to hang if they are run.I am using Python version
2.7.10
/3.5.2
, andmultiprocess
version0.70.5
.TLDR; If, from a
Pool.map
callback, you return a custom exception whose constructor is of arityn
, but you passn + 1
arguments to that exception's parent constructor,Pool.map
will never return, and you may need tokill
Python to regain control.