Masked images - as axis coordinates or separate mask array?

matthew-brett commented 3 months ago

I have been playing with the idea of masked images.

We frequently have a brain mask, and we want to store only those voxels within the mask.

We could have an alternative image arrangement, where there are one or two axes.

For one axis, this will be "voxels". For two, it would be "voxels", and e.g. "time".

The axis coordinates for voxels would be voxel coordinates - details below.

Call the not-masked (original) number of voxels n_orig. Call the number of masked voxels n_masked.

Usually a masked image has on the order of 40% of voxels selected: n_masked ~ n_orig * 0.4.

The axis coordinates for "voxels" would then be size (n_masked, 3).
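As a minimal sketch of this layout, assuming plain NumPy and a made-up 4D array (the names `img`, `mask`, `voxel_coords` and `masked` are illustrative only, not part of xibabel):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 4D image: three spatial axes plus "time".
img = rng.normal(size=(4, 5, 6, 10))
# Hypothetical brain mask selecting roughly 40% of the voxels.
mask = rng.random(size=img.shape[:3]) < 0.4

# Coordinates for the "voxels" axis: one (i, j, k) row per selected voxel.
voxel_coords = np.argwhere(mask)  # shape (n_masked, 3)
# Boolean indexing over the three spatial axes gives a ("voxels", "time") array.
masked = img[mask]  # shape (n_masked, 10)
```

`np.argwhere` and Boolean indexing both walk the mask in C order, so row `n` of `voxel_coords` is the position of row `n` of `masked`.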

Say sizeof(int) is 8 bytes.

The overhead for storing a masked image is therefore going to be around n_orig * 0.4 * 3 * 8 bytes. Meanwhile, for 1D data, we will be saving n_orig * 0.6 * sizeof(float), with sizeof(float) being e.g. 8. That gives n_orig * 9.6 cost vs n_orig * 4.8 benefit, so a net cost of n_orig * 4.8 bytes.

However, the cost remains the same with more elements on the time axis, whereas the savings increase in proportion to the length of the time axis. So we break even at time axis length 2, and get a net saving for every subsequent element on the time axis.
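The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `net_saving_bytes` and the example sizes are made up for illustration, and it assumes 8-byte coordinates and 8-byte data values as in the estimate above.

```python
def net_saving_bytes(n_orig, n_masked, n_time, coord_bytes=8, data_bytes=8):
    """Bytes saved by storing only masked voxels, minus coordinate overhead."""
    # One (i, j, k) triple per masked voxel, independent of n_time.
    coord_cost = n_masked * 3 * coord_bytes
    # Each time point drops the values of all voxels outside the mask.
    data_saving = (n_orig - n_masked) * data_bytes * n_time
    return data_saving - coord_cost


n_orig, n_masked = 1000, 400  # 40% of voxels inside the mask
print(net_saving_bytes(n_orig, n_masked, n_time=1))  # -4800: net cost
print(net_saving_bytes(n_orig, n_masked, n_time=2))  # 0: break-even
print(net_saving_bytes(n_orig, n_masked, n_time=3))  # 4800: net saving
```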

I guess we could compare this to storing the Boolean mask image alongside the standard ximg array, maybe as a xib-mask attribute. This would add something of size n_orig * sizeof(bool), so smaller, but less standardized.
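For a rough feel of the two overheads, here is a small NumPy comparison. It is a sketch on a made-up mask; it assumes the coordinates are kept as 8-byte integers and the mask as one byte per voxel, which is how NumPy stores `bool`.

```python
import numpy as np

# Made-up mask: 10 x 10 x 10 volume with exactly 40% of voxels selected.
mask = np.zeros((10, 10, 10), dtype=bool)
mask.reshape(-1)[:400] = True

# Option 1: (n_masked, 3) voxel coordinates, stored as 8-byte integers.
coords = np.argwhere(mask).astype(np.int64)
# Option 2: keep the Boolean mask itself, one byte per original voxel.
coord_overhead = coords.nbytes  # n_masked * 3 * 8
mask_overhead = mask.nbytes     # n_orig * 1

print(coord_overhead, mask_overhead)  # 9600 1000
```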

Thoughts?

matthew-brett commented 3 months ago

It then occurred to me that the mask image (as opposed to the voxel coordinates) has the original shape. We could attach the shape attribute to that ijk (voxel) axis, I guess.
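To show why the original shape needs to travel with the voxel coordinates, here is a sketch of going back from the masked form to a full array. The helper `unmask` is hypothetical and uses plain NumPy; it is not an existing xibabel or Xarray function.

```python
import numpy as np


def unmask(masked, voxel_coords, orig_shape, fill=np.nan):
    """Rebuild a full array from masked values and their (i, j, k) coordinates.

    `masked` has shape (n_masked,) or (n_masked, n_time); `orig_shape` is the
    3D shape of the image before masking.
    """
    out = np.full(tuple(orig_shape) + masked.shape[1:], fill, dtype=float)
    i, j, k = voxel_coords.T
    out[i, j, k] = masked
    return out


# Made-up example: two voxels selected from a (2, 3, 4) volume.
img = np.arange(24, dtype=float).reshape(2, 3, 4)
mask = np.zeros(img.shape, dtype=bool)
mask[0, 1, 2] = mask[1, 2, 3] = True

voxel_coords = np.argwhere(mask)
full = unmask(img[mask], voxel_coords, orig_shape=mask.shape)
```

Without `orig_shape`, the coordinates alone only give a lower bound on the size of each spatial axis.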

matthew-brett commented 3 months ago

Noting that there does not seem to be a way to do 3D Boolean indexing into an Xarray array:

In [10]: ximg[mask_ximg]
...
IndexError: 3-dimensional boolean indexing is not supported.

We can use ximg.where(mask_ximg), but this keeps the original shape and just fills the values outside the mask with np.nan. Xarray does support sparse arrays, as another option. Or we can implement our own masking with ximg.xi.mask(mask_array).
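One possible shape for such a masking routine, as a sketch only: stack the three spatial dimensions into a single "voxels" dimension, then select along it. The function `mask_voxels`, the dimension names `i`, `j`, `k`, `time` and the random data are all assumptions for illustration; this is not the actual `ximg.xi.mask` implementation.

```python
import numpy as np
import xarray as xr


def mask_voxels(ximg, mask_ximg, spatial_dims=("i", "j", "k")):
    """Hypothetical helper: keep only voxels where `mask_ximg` is True."""
    # Collapse the spatial dimensions into one "voxels" dimension.
    stacked = ximg.stack(voxels=spatial_dims)
    # Stack the mask the same way so positions line up, then take indices.
    keep = np.flatnonzero(mask_ximg.stack(voxels=spatial_dims).values)
    return stacked.isel(voxels=keep)


rng = np.random.default_rng(0)
ximg = xr.DataArray(rng.normal(size=(4, 5, 6, 10)), dims=("i", "j", "k", "time"))
mask_ximg = xr.DataArray(rng.random(size=(4, 5, 6)) < 0.4, dims=("i", "j", "k"))

filled = ximg.where(mask_ximg)          # same shape, NaN outside the mask
compact = mask_voxels(ximg, mask_ximg)  # one "voxels" dimension plus "time"
```

Here `filled` is as large as the original image, whereas `compact` holds only n_masked voxels per time point, with the stacked "voxels" index carrying the `i`, `j`, `k` positions.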

matthew-brett / xibabel

Masked images - as axis coordinates or separate mask array? #22