Closed: JSKenyon closed this issue 3 years ago.
I have found a solution but will document this here. This is the traceback:
Traceback (most recent call last):
  File "/home/jonathan/venvs/qcenv/lib/python3.8/site-packages/distributed/batched.py", line 93, in _background_send
    nbytes = yield self.comm.write(
  File "/home/jonathan/venvs/qcenv/lib/python3.8/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/home/jonathan/venvs/qcenv/lib/python3.8/site-packages/distributed/comm/tcp.py", line 243, in write
    frames = await to_frames(
  File "/home/jonathan/venvs/qcenv/lib/python3.8/site-packages/distributed/comm/utils.py", line 50, in to_frames
    return _to_frames()
  File "/home/jonathan/venvs/qcenv/lib/python3.8/site-packages/distributed/comm/utils.py", line 33, in _to_frames
    return list(protocol.dumps(msg, **kwargs))
  File "/home/jonathan/venvs/qcenv/lib/python3.8/site-packages/distributed/protocol/core.py", line 76, in dumps
    frames[0] = msgpack.dumps(msg, default=_encode_default, use_bin_type=True)
  File "/home/jonathan/venvs/qcenv/lib/python3.8/site-packages/msgpack/__init__.py", line 35, in packb
    return Packer(**kwargs).pack(o)
  File "msgpack/_packer.pyx", line 292, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 298, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 295, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 264, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 264, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 285, in msgpack._cmsgpack.Packer._pack
  File "/home/jonathan/venvs/qcenv/lib/python3.8/site-packages/distributed/protocol/core.py", line 57, in _encode_default
    sub_header, sub_frames = serialize_and_split(
  File "/home/jonathan/venvs/qcenv/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 425, in serialize_and_split
    header, frames = serialize(x, serializers, on_error, context)
  File "/home/jonathan/venvs/qcenv/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 251, in serialize
    return serialize(
  File "/home/jonathan/venvs/qcenv/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 297, in serialize
    headers_frames = [
  File "/home/jonathan/venvs/qcenv/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 298, in <listcomp>
    serialize(
  File "/home/jonathan/venvs/qcenv/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 349, in serialize
    raise TypeError(msg, str(x)[:10000])
TypeError: ('Could not serialize object of type tuple.', '(subgraph_callable-464cf43f-bd73-4c4f-b3c2-636d62260b93, (<function concatenate_axes at 0x7f92de232310>, [["(\'stack-1b1f98c2f3e26614aeffec434f095767\', 0, 0, 0)"]], [0, 2]), (<function concatenate_axes at 0x7f92de232310>, [["(\'stack-eeda88356e5cd160e293990558bb7e20\', 0, 0, 0)"]], [0, 2]), array([[0]], dtype=int32), (<class \'tuple\'>, [106, 64, 28, 1, 4]), 4, "(\'G-gain-c261af76dbb760d439f93f9840240dca\', 0, 0, 0, 0, 0)")')
The actual error (I went and found it) is:
*** _pickle.PicklingError: Could not pickle object as excessively deep recursion required.
This stems from the following blockwise call:
It seems that combine_gains (a Numba function created with generated_jit) somehow confuses the pickling. The problem goes away when the Numba function is wrapped in a plain Python function.
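For reference, a minimal sketch of the wrapping workaround, using a made-up signature and wrapper name rather than the real combine_gains from QuartiCal:

```python
import numba


@numba.generated_jit(nopython=True)
def combine_gains(gain_a, gain_b):
    # generated_jit resolves an implementation per type signature; in this
    # issue, handing the resulting object to blockwise directly is what
    # triggered the pickling failure.
    def impl(gain_a, gain_b):
        return gain_a * gain_b
    return impl


def combine_gains_wrapper(gain_a, gain_b):
    # Plain Python wrapper (hypothetical name). Passing this function to
    # blockwise instead of combine_gains itself made the problem go away.
    return combine_gains(gain_a, gain_b)
```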
combine_gains can return a lambda, which is not pickleable.
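As a minimal illustration (not the actual QuartiCal code), a decorator that returns a lambda produces an object the standard pickle module cannot serialise by reference:

```python
import pickle


def returns_a_lambda(func):
    # Hypothetical stand-in for a decorator like coerce_literals that
    # returns a lambda instead of a module-level function.
    return lambda *args, **kwargs: func(*args, **kwargs)


@returns_a_lambda
def combine_gains(a, b):
    return a * b


try:
    pickle.dumps(combine_gains)
except pickle.PicklingError as exc:
    print(f"cannot pickle combine_gains: {exc}")
```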
Wrong branch. However, removing the call to coerce_literals makes combine_gains pickleable.
Closing - I am wrapping the function for now. If that begins failing, I can return to the old behaviour, which didn't use coerce_literals.
As it says in the title, output.net_gain=True causes problems (specifically with respect to pickling) when using multiple Dask workers. This shouldn't be difficult to fix; this issue just serves as a warning/reminder.
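For anyone hitting this, a quick way to vet a callable before it goes into the graph (an assumed diagnostic, not something QuartiCal does) is to push it through cloudpickle, which is roughly the serialisation step that fails once tasks have to be shipped between worker processes:

```python
import cloudpickle


def check_graph_callable(func):
    # Attempt the same kind of serialisation distributed performs when
    # sending tasks to separate workers; failures show up here without
    # having to spin up a cluster.
    try:
        cloudpickle.dumps(func)
        print(f"{func!r} serialises fine")
    except Exception as exc:
        print(f"{func!r} failed: {type(exc).__name__}: {exc}")
```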