cosanlab / nltools

Python toolbox for analyzing imaging data
https://nltools.org
MIT License

Stop ignoring mask argument when loading h5 brain data file #401

Closed ejolly closed 2 years ago

ejolly commented 2 years ago

Because h5 files are essentially Brain_Data objects, they save a mask to disk along with the data. This differs from nifti files, which contain only the imaging data and no mask.

Previously, when loading an h5 file, the mask= argument to Brain_Data was ignored entirely, whether or not the user supplied it. We assumed that the h5 file contained a mask and always loaded that one.

This produced situations, like the one @TiankangXie encountered, where the data were in the correct space (e.g. 3mm) but the .mask attribute pointed to the default 2mm mask. As a result, .to_nifti() threw an error because it tried to apply the wrong transformation to the data when converting.
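To see why a mismatched mask breaks the conversion, here is a minimal sketch in plain Python rather than nltools code. It assumes that a Brain_Data-like object stores only the in-mask voxels as a flat vector and rebuilds the volume by scattering that vector back into the mask. The tiny masks and the `unmask` helper are hypothetical stand-ins for the 3mm and 2mm masks, not the real implementation.

```python
def unmask(values, mask):
    """Scatter a flat vector of in-mask values back into a flat volume.

    `mask` is a flat list of 0/1 flags, one per voxel in the volume.
    Hypothetical helper for illustration only.
    """
    n_in_mask = sum(mask)
    if len(values) != n_in_mask:
        raise ValueError(
            f"data has {len(values)} voxels but mask selects {n_in_mask}"
        )
    it = iter(values)
    return [next(it) if flag else 0.0 for flag in mask]


# Hypothetical coarse mask (stand-in for 3mm): 4 in-mask voxels out of 6.
coarse_mask = [1, 1, 0, 1, 1, 0]
# Hypothetical fine mask (stand-in for the default 2mm): 7 in-mask voxels out of 10.
fine_mask = [1, 1, 1, 0, 1, 1, 0, 1, 1, 0]

data = [0.5, 1.5, 2.5, 3.5]  # values extracted with the coarse mask

volume = unmask(data, coarse_mask)  # matching mask: conversion works

try:
    unmask(data, fine_mask)  # wrong mask: voxel counts disagree
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)
```

The data themselves are fine in this sketch; only the mask they are paired with is wrong, which is why being able to override the mask at load time resolves the problem.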

Because we ignored the mask argument when loading h5 files, there was no easy way out of this situation. This PR therefore makes Brain_Data respect the mask argument when loading h5 files. By default it uses the mask stored in the h5 file (if there isn't one, we let deepdish raise the error). When the user passes a mask argument, that mask is used instead to learn the transformation, matching how mask behaves when loading a nifti file. If the h5 file already contains a mask (which should almost always be true) and the user also passes one, we issue a warning on load telling them that the mask in the h5 file is being ignored.

Scenarios and behavior:

# User loads h5 that contains a mask, so that mask is used instead of the default MNI mask

Brain_Data('brain.h5')

# User loads h5 that contains mask but also sets mask argument.
# Now mask value takes precedence over whatever mask is in h5 
# so we issue a warning to the user letting them know on load

Brain_Data('brain.h5', mask='path/to/nifti/mask.nii.gz')

>>> UserWarning(...)

# User loads h5 that does NOT contain a mask and doesn't set the mask
# argument, so the default MNI mask is used, similar to nifti files
# This is an implicit fallback, just like with niftis

Brain_Data('brain_nomask.h5')

# User loads h5 that does NOT contain mask but also sets mask argument
# Mask value is used to learn transformation like niftis
# No need to warn them about anything

Brain_Data('brain_nomask.h5', mask='path/to/nifti/mask.nii.gz')
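The four scenarios above reduce to a small precedence rule, sketched below in plain Python. This is an illustration of the described behavior, not the code in this PR: `resolve_mask`, `DEFAULT_MASK`, the string placeholders, and the warning text are all hypothetical.

```python
import warnings

# Placeholder for the toolbox's default MNI mask (hypothetical value).
DEFAULT_MASK = "default MNI mask"


def resolve_mask(h5_mask=None, user_mask=None):
    """Decide which mask to use when loading an h5 file.

    Hypothetical helper; not part of nltools.
    """
    if user_mask is not None:
        # The user's mask always wins; warn only if it overrides a stored one.
        if h5_mask is not None:
            warnings.warn(
                "mask argument provided; ignoring the mask stored in the h5 file",
                UserWarning,
            )
        return user_mask
    if h5_mask is not None:
        # No user mask: use the one saved in the h5 file.
        return h5_mask
    # Neither is available: implicit fallback, as with nifti files.
    return DEFAULT_MASK


# Run all four scenarios and record any warnings they raise.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    only_h5 = resolve_mask(h5_mask="h5 mask")
    both = resolve_mask(h5_mask="h5 mask", user_mask="user mask")
    neither = resolve_mask()
    only_user = resolve_mask(user_mask="user mask")
```

Only the second call warns, because it is the only case in which information stored on disk is silently overridden.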

@ljchang Can you take a quick look to see whether this makes sense to you and whether I'm missing any other cases or expected behavior?

Also removed the file_name argument and attribute on Brain_Data. We weren't using it anywhere except as a default fallback argument to .write() if the user didn't pass anything in (which they always should). It was causing problems with h5 files if that attribute didn't exist.
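The kind of failure a fallback attribute can cause is sketched below with two toy classes. These are hypothetical and are not the nltools implementation; the exact error seen with h5 files is not stated here, so the `AttributeError` is an assumption about how a missing attribute would surface. The `write` methods only return the target path instead of touching the disk.

```python
class WithFallback:
    """Old pattern: write() falls back to a file_name attribute."""

    def write(self, file_name=None):
        # Fails with AttributeError when file_name was never set on the
        # object, e.g. because it was restored without that attribute.
        return file_name if file_name is not None else self.file_name


class ExplicitOnly:
    """New pattern: the caller must always say where to write."""

    def write(self, file_name):
        return file_name


try:
    WithFallback().write()
    fallback_error = None
except AttributeError as err:
    fallback_error = type(err).__name__

explicit_path = ExplicitOnly().write("brain.h5")
```

Making the argument required moves the failure to the call site, where a missing path is an obvious mistake rather than a hidden dependency on object state.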

ejolly commented 2 years ago

Now folded into #406