[ENH] avoid loading whole nifti-like image data when extracting time-series within a mask

MengxingLiu commented 2 days ago

Is there an existing issue for this?

[x] I have searched the existing issues

Describe your proposed enhancement in detail.

I am using nilearn to mask a large nifti using the following command:

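import nibabel as nib
from nilearn.image import get_data, new_img_like
from nilearn.maskers import NiftiMasker

# Build a binary mask for a single ROI from a label image
mask_nii = nib.load(mask)
self.mask_img = new_img_like(mask_nii, get_data(mask_nii) == ROI_id)
self.masker = NiftiMasker(mask_img=self.mask_img)
residual_img = nib.load(errts_file)
self.residual = self.masker.fit_transform(residual_img)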

It looks like nilearn loads the entire image into memory when masking, which can cause memory issues when the residual image is big. I tracked the memory usage, and the residual image is only loaded into memory at this step:

self.residual = self.masker.fit_transform(residual_img)

Would it be possible to add an option that controls whether the whole image is loaded, for cases where only the time series within a small mask are needed?

Benefits to the change

It would decrease memory usage and potentially allow more parallel jobs to run at the same time.

Pseudocode for the new behavior, if applicable

No response

bthirion commented 1 day ago

I don't see how to do that with the masker we have. If your image is 4D you can mask the 3D images sequentially to reduce memory load.
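A minimal sketch of the sequential approach suggested above, not a nilearn feature: it bypasses NiftiMasker and reads one 3D volume at a time through an array-like object such as nibabel's `img.dataobj`. The function names, the `float32` output dtype, and the file-based wrapper are assumptions for illustration, and none of NiftiMasker's other processing (standardization, detrending, smoothing) is applied.

```python
import numpy as np


def extract_masked_timeseries(dataobj, mask):
    """Return a (n_scans, n_voxels) array, reading one 3D volume at a time.

    dataobj: 4D array-like that supports slicing, e.g. nibabel's ``img.dataobj``
    mask:    3D boolean array with the same spatial shape
    """
    mask = np.asarray(mask, dtype=bool)
    n_scans = dataobj.shape[3]
    out = np.empty((n_scans, int(mask.sum())), dtype=np.float32)
    for t in range(n_scans):
        # Only this single volume is materialised in memory.
        volume = np.asarray(dataobj[..., t])
        out[t] = volume[mask]
    return out


def extract_from_files(img_path, mask_path, roi_id):
    """Hypothetical wrapper: the paths and roi_id are placeholders."""
    import nibabel as nib  # imported lazily; only needed for file access

    mask = np.asarray(nib.load(mask_path).dataobj) == roi_id
    return extract_masked_timeseries(nib.load(img_path).dataobj, mask)
```

Peak memory here is roughly one volume plus the output array. With compressed files, per-volume reads may be slow, so this trades speed for memory.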

man-shu commented 1 day ago

Some more context: We had this discussion with @MengxingLiu during the drop-in hour yesterday. We think it might be possible to use NiftiLabelsMasker (where labels are basically several masks). However, while NiftiLabelsMasker always aggregates values from voxels in a labeled region (via the strategy parameter), @MengxingLiu would like the output to be available without any aggregation, meaning all voxel values should still be available after the transform.

One solution could be implementing a new strategy where no aggregation happens. NiftiLabelsMasker uses a subset of the aggregation strategies from scipy ndimage: https://docs.scipy.org/doc/scipy/reference/ndimage.html.

Possible solution: value_indices is one strategy I think could be used for this.

Possible caveat: The output signal would be 3D instead of the usual 2D -- handling that might be an issue.
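To make the idea and the caveat concrete, here is a rough sketch using `scipy.ndimage.value_indices` (available in recent SciPy releases) directly on arrays, outside of any masker. The helper name and the toy arrays are made up for illustration. Because each label covers a different number of voxels, the result is a dict of arrays with different widths rather than one regular 2D array.

```python
import numpy as np
from scipy import ndimage


def voxelwise_signals_per_label(data_4d, labels_3d, background=0):
    """Map each label to a (n_scans, n_voxels_in_label) array, no aggregation."""
    # value_indices returns {label: tuple of index arrays, one per dimension}.
    indices = ndimage.value_indices(labels_3d, ignore_value=background)
    # Indexing the three spatial axes gives (n_voxels, n_scans); transpose it.
    return {label: data_4d[idx].T for label, idx in indices.items()}


# Toy example (hypothetical data): two labels of different sizes.
labels = np.zeros((3, 3, 3), dtype=int)
labels[0, 0, 0] = 1   # label 1 covers one voxel
labels[1, 1, :] = 2   # label 2 covers three voxels
data = np.arange(3 * 3 * 3 * 5, dtype=float).reshape(3, 3, 3, 5)

signals = voxelwise_signals_per_label(data, labels)
shapes = {int(label): sig.shape for label, sig in signals.items()}
print(shapes)
```

In this toy case label 1 yields a 5 x 1 array and label 2 a 5 x 3 array, which illustrates why the output does not fit the usual 2D signal layout.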

emdupre commented 1 day ago

I'm personally against having no aggregation in NiftiLabelsMasker; this has come up before and been rejected (I think for good reasons). It also seems (to me) like it doesn't get to the core of the issue re: performance concerns.

I'd propose instead to subsample the mask (e.g., 4D to 3D as BT suggested). I'm also a little confused about this line:

self.mask_img = new_img_like(mask_nii, get_data(mask_nii)==ROI_id)

as it looks like this larger mask is already being subset by ROI, theoretically as NiftiLabelsMasker would do? I do think we want to scale to larger data, as possible, so I'd be curious to know more about this exact use case!

MengxingLiu commented 18 hours ago

Hi, thanks for checking this out.

NiftiLabelsMasker averages across voxels within the mask. What I am trying to achieve here is to extract the time series within a mask without averaging; the resulting matrix is expected to have shape (number of scans, number of voxels).
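For a small ROI, one possible way to get that (number of scans, number of voxels) matrix without reading the full 4D array is to slice only the bounding box of the mask. This is a sketch under assumptions, not existing nilearn behavior: the function name is hypothetical, `dataobj` stands for any sliceable 4D array-like such as nibabel's `img.dataobj`, and how much data is actually read from disk depends on nibabel's slicing and on whether the file is compressed.

```python
import numpy as np


def extract_roi_bounding_box(dataobj, mask):
    """Return (n_scans, n_voxels) for a non-empty 3D boolean mask,
    slicing only the mask's bounding box out of the 4D data."""
    mask = np.asarray(mask, dtype=bool)
    coords = np.argwhere(mask)           # (n_voxels, 3) voxel coordinates
    lo = coords.min(axis=0)
    hi = coords.max(axis=0) + 1          # exclusive upper corner
    box = tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))
    # Sub-block for all time points: shape (dx, dy, dz, n_scans).
    block = np.asarray(dataobj[box + (slice(None),)])
    # Keep only the masked voxels inside the box, then put time first.
    return block[mask[box]].T
```

The voxel order matches what `data[mask].T` would give on the full array, since the bounding box only shifts the coordinates by a constant offset.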

bthirion commented 14 hours ago

IIUC you can achieve that with the good old NiftiMasker.

Remi-Gau commented 8 hours ago

related 'old' issue: https://github.com/nilearn/nilearn/issues/719

Remi-Gau commented 8 hours ago

see also https://github.com/nilearn/nilearn/issues/1883

Remi-Gau commented 8 hours ago

and https://github.com/nilearn/nilearn/issues/1754
