Open tsalo opened 2 years ago
Is this still a tedana issue or has it been resolved on the fMRIPrep side?
I think they recommended workarounds, but tedana still causes issues for some users. We could automatically switch from nonlinear fitting to log-linear when the low-memory option is enabled, so that fMRIPrep could pass its own low-memory settings to tedana?
I don't love the idea of a low-mem
option changing the underlying algorithm without clearly saying that. I also looked at the code and the nonlinear fitting should be slower, but I'm not sure why it requires non-trivially more memory. There's one thing that is an unnecessary memory waste.
https://github.com/ME-ICA/tedana/blob/e94b03a604283c608a518e8390713ddd29d52ea9/tedana/decay.py#L127-L128
is where the log linear fit is calculated as part of the nonlinear fit. t2s_limited
and s0_limited
are stored, but never used. They are just over-written at the end of the function. I wonder if we can get rid of this issue, simply by never saving these two unnecessary variables.
Beyond that, the function is processing data one voxel at a time, which is slow, but shouldn't use too much memory. There are a few arrays that have data for each voxel, including voxels outside the mask. Perhaps we can shrink the larger arrays to just voxels within the mask to also save memory.
It would be hard for me to test this since I'm not the one getting low-memory warnings, but I could try to reduce memory usage in the function, if you know someone else who is willing to test whether my solutions worked.
Just to drop some notes in an open issue, now that #995 has been merged:
It might be useful to sit down and see if we can take advantage of data proxying to keep data on-disk until the last possible second, and independent units of work that can make sure that a function that pulls data into memory pulls only what it needs and releases it.
One tool we use in nibabel is the ArrayProxy (exposed as img.dataobj
see - https://nipy.org/nibabel/images_and_memory.html). It's not a fully-featured drop-in replacement for an array, but it could get us most of the way there. I don't think I'll have time before Thanksgiving, but maybe a couple of us could find a few hours to hack on it?
cc @bpinsard
Another thought is to consider using a more fully-featured array-like provider such as Dask arrays. Xarray has written a wrapper xarray.Dataset.curvefit for scipy.optimize.curvefit
and can apply it to array-likes.
This could tie into @matthew-brett's nascent work on an xarray-based API for neuroimages, so I'll ping him here.
Summary
At least one user has reported issues with using fMRIPrep with low-memory processing on multi-echo data, as the T2* estimation step ends up being killed.
This stems from https://github.com/nipreps/fmriprep/issues/2728.
Additional Detail
Here's the traceback:
Next Steps
I'm thinking that we could extend the scope of the
low_memory
option to also trigger some kind of reduced memory approach to nonlinear T2* estimation.