Open jakirkham opened 5 years ago
Hot take: it looks like an upstream "issue":
The Exception
class overrides the classic object
__reduce__
method. But the reducer is fairly simple, and forgets attributes that are not in the object's __dict__
(at the exception of args
, the arguments given at the exception creation time). In particular, the __cause__
attribute, that contains the direct cause of the exception (when raised), gets lost in the process.
Why is this bad: the problem here happens at Exception
instance pickling, and not class pickling. However, pickling instances is usually taken care of by pickle
and not cloudpickle
itself, that does not override or extends that part of the code. There was a similar issue happening with property
instances, that we ended up closing.
It's true this can be reproduced using builtin pickle
alone.
FWICT pickling support for Exception
's has grown organically as demonstrated by this CPython issue, which led to the introduction of args
for pickling. So not pickling __cause__
, __context__
, and __suppress_context__
is likely an oversight as opposed to intended behavior.
Sorry for the quick follow-up post. Just wanted to add that upstream is aware of the loss of these attributes during pickling as well as other things lost (like __slots__
).
Now that we have reducer_override
, we can have our own custom reducer for any exceptions, so we can actually fix this for python>=3.8
That's certainly a way to fix it and a good option have. Given it is an acknowledged bug in CPython, should we just fix it in CPython?
We can. But if we want to backport it to earlier python versions, we may have to move forward with a custom reducer in cloudpickle
. Opinions welcome, aslo pinging @ogrisel
That makes sense.
+1 for fixing upstream first (and then backport if easy to backport).
it seems like traceback info is also lost during pickle and unpickle
import cloudpickle
def main():
outer()
def outer():
boom()
def boom():
# pass
raise RuntimeError("boom")
try:
main()
except Exception as e:
pe = e
raise pe
run the script, get
$ python /private/home/jieru/workspace/hydra/repro/cp.py
Traceback (most recent call last):
File "/private/home/jieru/workspace/hydra/repro/cp.py", line 22, in <module>
raise pe
File "/private/home/jieru/workspace/hydra/repro/cp.py", line 17, in <module>
main()
File "/private/home/jieru/workspace/hydra/repro/cp.py", line 4, in main
outer()
File "/private/home/jieru/workspace/hydra/repro/cp.py", line 8, in outer
boom()
File "/private/home/jieru/workspace/hydra/repro/cp.py", line 13, in boom
raise RuntimeError("boom")
RuntimeError: boom
try:
main()
except Exception as e:
pe = cloudpickle.dumps(e)
raise cloudpickle.loads(pe)
run the script
$ python /private/home/jieru/workspace/hydra/repro/cp.py
Traceback (most recent call last):
File "/private/home/jieru/workspace/hydra/repro/cp.py", line 21, in <module>
raise cloudpickle.loads(pe)
RuntimeError: boom
Python 3.8.11
cloudpickle 1.6.0
@jieru-hu, yep this was noted above (and in the subsequent comment)
Hello @jakirkham, @pierreglaser,
Would something like
def _exception_reduce(obj):
tpl = obj.__reduce__()
return (_make_exception, tpl[:2] + (obj.__cause__, obj.__traceback__)) + tpl[2:]
def _make_exception(fun, args, cause, traceback):
obj = fun(*args)
obj.__cause__ = cause
obj.__traceback__ = traceback
return obj
work? I don't know where to put it in cloudpickle
's code to test.
I'm using the following monkey-patch to cloudpickle:
def reduce_exception(e):
reduced = e.__reduce__()
if len(reduced) < 3:
reduced += ({},)
for attribute in (
"__cause__",
"__context__",
"__traceback__",
"__suppress_context__",
):
if getattr(e, attribute) is not None:
reduced[2][attribute] = getattr(e, attribute)
return reduced
def reducer_override(self, obj):
if isinstance(obj, Exception):
return reduce_exception(obj)
return original_reducer_override(self, obj)
original_reducer_override = cloudpickle.CloudPickler.reducer_override
cloudpickle.CloudPickler.reducer_override = reducer_override
Hello @ender-wieczorek, thanks for the snippet, but this does not work for me as Traceback
objects are not pickable.
You're right, I posted too soon. Here's a solution for Traceback
s:
It seems that a round trip through cloudpickle results in chained exceptions being dropped. Here's an example below.
xref: https://github.com/dask/dask/issues/4384
cc @stuarteberg