nipy / nipype

Workflows and interfaces for neuroimaging packages
https://nipype.readthedocs.org/en/latest/

Concurrent futures MultiProc backend failing to constrain resources #2700

Open effigies opened 5 years ago

effigies commented 5 years ago

Summary

We've recently had three separate reports of fMRIPrep running into memory allocation errors since moving to the new MultiProc backend, which is based on concurrent.futures. In at least two of the cases, setting the plugin to LegacyMultiProc resolved the issue.

This suggests that the measures put into place to reduce the memory footprint of subprocesses no longer work for concurrent.futures.

Related: poldracklab/fmriprep#1259
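For anyone affected in the meantime, the workaround is to select the old plugin explicitly when running a workflow. A minimal sketch (the workflow itself is a hypothetical placeholder; `n_procs` and `memory_gb` are the standard MultiProc resource caps that the new backend appears not to enforce):

```python
from nipype.pipeline.engine import Workflow

wf = Workflow(name="example_wf")  # hypothetical placeholder workflow
# ... add nodes and connections here ...

# New backend (concurrent.futures), affected by this issue:
# wf.run(plugin="MultiProc", plugin_args={"n_procs": 8, "memory_gb": 16})

# Reported workaround: fall back to the old multiprocessing-based plugin
wf.run(plugin="LegacyMultiProc", plugin_args={"n_procs": 8, "memory_gb": 16})
```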

dPys commented 5 years ago

Hi @effigies, I can confirm that these errors are not specific to fMRIPrep either:

When running pynets on an HPC system using a forkserver start method, restricted to the resources available on a single compute node:

```
exception calling callback for <Future at 0x2adf42c4d7f0 state=finished raised BrokenProcessPool>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/concurrent/futures/_base.py", line 324, in _invoke_callbacks
    callback(self)
  File "/opt/conda/lib/python3.6/site-packages/nipype/pipeline/plugins/multiproc.py", line 143, in _async_callback
    result = args.result()
  File "/opt/conda/lib/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/opt/conda/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
```
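For context, concurrent.futures raises BrokenProcessPool whenever a worker process dies without returning a result, which is exactly what happens when the kernel's OOM killer (or an HPC scheduler enforcing a memory limit) kills a worker. A minimal, nipype-independent sketch that reproduces the same error class:

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool

def doomed_task():
    # Stand-in for a worker killed externally (e.g. by the OOM killer):
    # the process exits abruptly, without raising a Python exception.
    os._exit(1)

if __name__ == "__main__":
    # Mirrors the forkserver setup from the report above (Unix-only).
    multiprocessing.set_start_method("forkserver")
    with ProcessPoolExecutor(max_workers=2) as pool:
        future = pool.submit(doomed_task)
        try:
            future.result()
        except BrokenProcessPool as exc:
            # Same exception as in the traceback above.
            print("caught:", exc)
```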

The BrokenProcessPool error tends to occur at varying points in the workflow, depending on the resources allocated to MultiProc at runtime.

As you noted, using LegacyMultiProc appears to resolve the issue.

-Derek