Closed dPys closed 5 years ago
Have you tried with other versions of Python?
Interesting question, @effigies. In Python 2.7 the workflow actually appears to just hang, and at an earlier node of the workflow.
Two further observations: 1) as many of these errors emerge as there are iterables used (see the two-process case below, where the traceback repeats); 2) the issue did not go away when I restructured the nodes to pass only file paths (i.e. .trk streamline files) instead of passing nibabel streamlines objects themselves.
exception calling callback for <Future at 0x104858400 state=finished raised error>
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/anaconda3/lib/python3.7/concurrent/futures/process.py", line 198, in _sendback_result
    exception=exception))
  File "/usr/local/anaconda3/lib/python3.7/multiprocessing/queues.py", line 364, in put
    self._writer.send_bytes(obj)
  File "/usr/local/anaconda3/lib/python3.7/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/usr/local/anaconda3/lib/python3.7/multiprocessing/connection.py", line 393, in _send_bytes
    header = struct.pack("!i", n)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/anaconda3/lib/python3.7/concurrent/futures/_base.py", line 324, in _invoke_callbacks
    callback(self)
  File "/usr/local/anaconda3/lib/python3.7/site-packages/nipype-2.0.0.dev0+gd9976942c-py3.7.egg/nipype/pipeline/plugins/multiproc.py", line 149, in _async_callback
    result = args.result()
  File "/usr/local/anaconda3/lib/python3.7/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/usr/local/anaconda3/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
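For context, the failing frame is multiprocessing.connection prefixing every outgoing message with a signed 32-bit big-endian length header (`struct.pack("!i", n)`), so the header itself overflows for any pickled payload of 2 GiB or more. A minimal sketch reproducing just the overflow, without multiprocessing:

```python
import struct

# multiprocessing.connection (Python 3.7) frames each pickled message
# with a signed 32-bit big-endian length header: struct.pack("!i", n).
MAX_MSG = 2**31 - 1  # 2147483647 bytes, the largest length that fits

struct.pack("!i", MAX_MSG)  # the largest payload size that still packs

try:
    # One byte larger and the header overflows, before any data is sent.
    struct.pack("!i", MAX_MSG + 1)
except struct.error as err:
    print(err)  # 'i' format requires -2147483648 <= number <= 2147483647
```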
Right now I'm trying to eliminate the passing of any and all dipy objects (e.g. a gradient table) to see if that makes a difference.
Let me know if you have further ideas!
Thanks, @dPys
Fix failed :/
Here's the DAG if it helps at all to conceptualize what's going on:
Solved. Because DWI data matrices are 4D, some can exceed 4 GB and can't be serialized across the process boundary. Closing this now, but thanks for the help @effigies!
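For anyone who lands here with the same problem: a workaround sketch (not nipype's own machinery — `spill_if_oversized` is a hypothetical helper) is to check the pickled size of whatever a node would return, and spill anything oversized to disk, passing only the file path between processes:

```python
import os
import pickle
import tempfile

# multiprocessing's framing header is a signed 32-bit int, so anything
# whose pickle is 2 GiB or larger cannot cross the process boundary.
SEND_LIMIT = 2**31 - 1

def spill_if_oversized(obj):
    """Hypothetical helper: return obj unchanged if it pickles small
    enough to send between processes; otherwise write the pickle to a
    temp file and return the file path instead."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    if len(payload) <= SEND_LIMIT:
        return obj
    fd, path = tempfile.mkstemp(suffix=".pkl")
    with os.fdopen(fd, "wb") as fh:
        fh.write(payload)
    return path  # the downstream node re-loads it with pickle.load
```

A path string pickles to a few dozen bytes, which is why restructuring nodes to exchange filenames rather than in-memory arrays sidesteps the header limit entirely.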
Has anyone ever come across the following type of struct error with nipype's MultiProc when running a workflow (in particular, one that runs with a forkserver)?
This seems to occur both with MultiProc and LegacyMultiProc on nipype 1.1.9 and dev/2.0. Linear execution works fine.
The problem appears to occur when any iterable is used on this workflow. My hunch is that it is related to a serialization issue (in particular, connecting/pickling high-memory dipy objects across nipype nodes), but I've been testing a number of possibilities. If no obvious solution comes to mind, please let me know and I can provide a minimal example with an accompanying docker container!
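In case it helps anyone debugging similar failures: since MultiProc moves node results between processes with pickle, one quick way to find an offending connection is to measure the pickled size of each object a node returns. A hypothetical diagnostic sketch (the `check_sendable` helper and the example names are assumptions, not nipype API):

```python
import pickle

HEADER_LIMIT = 2**31 - 1  # multiprocessing's signed 32-bit size header

def pickled_size(obj):
    """Size in bytes of obj as it would cross the process boundary
    (multiprocessing serializes results with pickle)."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))

def check_sendable(name, obj):
    """Report whether obj would fit under the 32-bit length header."""
    n = pickled_size(obj)
    status = "ok" if n <= HEADER_LIMIT else "TOO BIG"
    print(f"{name}: {n} bytes ({status})")

# Hypothetical usage on whatever a node returns, e.g.:
# check_sendable("gradient_table", gtab)
# check_sendable("streamlines", streamlines)
```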
Cheers, @dPys