Because h5 files are essentially Brain_Data objects, they save data along with a mask to disk. This differs from nifti files, which contain only the imaging data and no mask.
Previously, when loading an h5 file, the mask= argument to Brain_Data was ignored entirely, whether or not the user set it. We assumed that the h5 file contained a mask and loaded that instead.
This led to situations, like the one @TiankangXie encountered, where the data were in the correct space (e.g. 3mm) but the .mask attribute pointed to the default 2mm mask. As a result, .to_nifti() throws an error because it tries to apply the wrong transformation to the data when converting.
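To illustrate the failure mode, here is a toy sketch, not nltools code. It assumes the data are stored as a flat vector of in-mask voxel values and that converting back to a volume means scattering those values into the mask's nonzero positions. `unmask` and the toy masks are hypothetical names used only for illustration. If the mask comes from a different space, the voxel counts disagree and the conversion cannot proceed:

```python
def unmask(values, mask):
    """Scatter a flat list of voxel values back into a flat 0/1 mask (toy example)."""
    n_in_mask = sum(mask)
    if len(values) != n_in_mask:
        raise ValueError(
            f"data has {len(values)} voxels but mask has {n_in_mask} in-mask voxels"
        )
    it = iter(values)
    # Fill in-mask positions in order; everything outside the mask stays 0
    return [next(it) if m else 0 for m in mask]


mask_3mm = [0, 1, 1, 0]        # toy stand-in for the mask the data were created with
mask_2mm = [0, 1, 1, 1, 1, 0]  # toy stand-in for a mismatched default mask
data = [0.5, 1.5]              # two in-mask values, consistent with mask_3mm

volume = unmask(data, mask_3mm)  # works: voxel counts agree
```

Calling `unmask(data, mask_2mm)` raises a `ValueError` in this toy version, which is roughly the kind of mismatch described above.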
Because we ignored the mask argument when loading h5 files, there was no easy way out of this situation. This PR makes Brain_Data respect the mask argument when loading h5 files. Specifically, it defaults to any mask contained in the h5 file (if there isn't one, we let deepdish raise the error). When the user passes a mask argument, that mask is used instead to learn the transformation, similar to how mask behaves when loading a nifti file. If the user passes a mask and the h5 file already contains one (which should almost always be true), we issue a warning telling them that the mask in the h5 file is being ignored on load.
Scenarios and behavior:
# User loads h5 that contains mask so that mask is used instead of the default MNI mask
Brain_Data('brain.h5')
# User loads h5 that contains mask but also sets mask argument.
# Now mask value takes precedence over whatever mask is in h5
# so we issue a warning to the user letting them know on load
Brain_Data('brain.h5', mask='path/to/nifti/mask.nii.gz')
>>> UserWarning(...)
# User loads h5 that does NOT contain a mask and doesn't set the mask
# argument so the default MNI mask is used, similar to nifti files
# This is an implicit fallback just like with niftis
Brain_Data('brain_nomask.h5')
# User loads h5 that does NOT contain mask but also sets mask argument
# Mask value is used to learn transformation like niftis
# No need to warn them about anything
Brain_Data('brain_nomask.h5', mask='path/to/nifti/mask.nii.gz')
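The four scenarios above boil down to a small decision table. Below is a minimal sketch of that logic, assuming the h5 contents have already been read into a dict. `resolve_mask`, the `"mask"` key, and `DEFAULT_MASK` are hypothetical names for illustration, not the actual nltools internals:

```python
import warnings

DEFAULT_MASK = "default_MNI_mask"  # hypothetical placeholder for the default MNI mask


def resolve_mask(h5_contents, mask=None):
    """Pick the mask to use when loading an h5 file (illustrative only).

    h5_contents: dict of whatever was stored in the h5 file (hypothetical layout).
    mask: the value the user passed as mask=, or None if it was not set.
    """
    stored = h5_contents.get("mask")
    if mask is not None:
        if stored is not None:
            # The user-supplied mask wins; warn that the stored one is ignored
            warnings.warn(
                "Ignoring the mask stored in the h5 file; using the mask argument.",
                UserWarning,
            )
        return mask
    if stored is not None:
        # No mask argument: use the mask saved in the file
        return stored
    # Neither the file nor the user provided a mask: fall back to the default
    return DEFAULT_MASK
```

The warning only fires in the one case where user intent and file contents conflict, so the other three paths stay silent.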
@ljchang Can you take a quick look to check that this makes sense to you and that I'm not missing other cases or expected behavior?
Also removed the file_name argument and attribute on Brain_Data. We weren't using it anywhere except as a default fallback argument to .write() if the user didn't pass anything in (which they always should). It was causing problems with h5 files if that attribute didn't exist.