peterhollender opened this issue 4 months ago
I'm actually not too familiar with the xarray library, but from a cursory look it seems that `xarray.DataArray` is a thin wrapper around a numpy array, which is a very friendly and standard format for Python users to work with. I like the `xarray.DataArray` option. I agree that it doesn't make sense for the underlying data structure that openlifu operates on to be a vtk structure; that is more for visualization. And since OpenLIFU + visualization is mainly going to be achieved via the SlicerOpenLIFU extension, the visualizable representation of volumes is going to be a Slicer volume MRML node, which handles the vtk details internally. OpenLIFU-python can focus on making data structures friendly for algorithm development rather than for visualization.

Loading to an `xarray.DataArray` rather than keeping the `nibabel.Nifti1Image` can also make it easier to extend the I/O side of openlifu later, so that multiple volume file formats could eventually be handled. When developing the file loading code, it would be good to keep that I/O code cordoned off in a separate module as much as possible.
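For concreteness, here is a minimal sketch of that conversion, assuming a 3D volume; the function name and the choice of `x`/`y`/`z` dims are made up for illustration and are not part of the existing openlifu API:

```python
import nibabel as nib
import numpy as np
import xarray as xr

def nifti_to_dataarray(path: str) -> xr.DataArray:
    """Load a 3D NIfTI volume into an xarray.DataArray (illustrative only)."""
    img = nib.load(path)                           # nibabel.Nifti1Image
    data = np.asanyarray(img.dataobj)              # voxel intensities as a numpy array
    zooms = img.header.get_zooms()[:3]             # voxel spacing for the spatial axes
    dims = ("x", "y", "z")
    coords = {d: np.arange(n) * dz for d, n, dz in zip(dims, data.shape, zooms)}
    return xr.DataArray(data, dims=dims, coords=coords,
                        attrs={"affine": img.affine, "units": "mm"})
```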
I added the method, but I'm not sure I like the way I did it. There is currently a static method in `openlifu.db.Database` called `_load_nifti` that reads a .nii file, accepts an optional metadata dict, and creates the `xarray.DataArray`. When called from `openlifu.db.Database.load_volume`, it handles retrieving that metadata from the associated .json file and creating the volume correctly, but it seems like `openlifu.db.Database._load_nifti` is not the correct home for that method; it should just be `load_nifti` and be located somewhere else in the hierarchy.
Yes, maybe it belongs in `openlifu.io`?
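If it does move, a rough sketch of what a free-standing loader could look like follows; the module location (e.g. `openlifu/io/nifti.py`), the sidecar-path logic, and the attribute names are all assumptions for illustration, not the existing code:

```python
from __future__ import annotations

from pathlib import Path

import nibabel as nib
import numpy as np
import xarray as xr

def load_nifti(nifti_path: str | Path, metadata: dict | None = None) -> xr.DataArray:
    """Read a .nii / .nii.gz file and return an xarray.DataArray (sketch)."""
    img = nib.load(str(nifti_path))
    arr = xr.DataArray(np.asanyarray(img.dataobj),
                       attrs={"affine": img.affine})  # dims/coords as in the sketch above
    arr.attrs.update(metadata or {})                  # merge caller-supplied metadata
    return arr

# Database.load_volume would then stay in charge of finding the sidecar .json, e.g.:
#     meta_path = Path(nifti_path).with_suffix("").with_suffix(".json")
#     metadata = json.loads(meta_path.read_text()) if meta_path.exists() else None
#     volume = load_nifti(nifti_path, metadata)
```

That keeps the file-format handling cordoned off from the database logic, as suggested above.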
Adding my two cents here. I see two advantages to keeping a volume as a native NIfTI (or even `np.ndarray`) type compared to `xarray`:

1. The main place the volume is used and processed is in `setup_sim_scene` from https://github.com/OpenwaterHealth/OpenLIFU-python/issues/126. Looking at the pre-processing steps, we could re-use existing codebases from the neuroscience community for pre-processing (denoising, segmentation, ...). For example, the newer neuroscience data standard BIDS uses NIfTI with associated metadata in JSON.
2. The main benefit of `xarray` is when you have non-uniform grids and multiple data arrays, which is not the case here for a simple volume (uniform grid and only voxel intensities). The grid can be defined implicitly using the affine and spacings (see the sketch below). Also, I/O from NIfTI should be faster and compression is more efficient (.nii.gz).
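As a small illustration of the "implicit grid" point, using only standard nibabel calls (the file name is hypothetical): with a uniform grid, the affine and the header zooms are enough to recover world-space coordinates and spacing on demand, so nothing beyond the NIfTI itself needs to be stored:

```python
import nibabel as nib
import numpy as np
from nibabel.affines import apply_affine

img = nib.load("subject_T1w.nii.gz")           # hypothetical file name
spacing = img.header.get_zooms()[:3]           # voxel spacing straight from the header
i, j, k = np.meshgrid(*[np.arange(n) for n in img.shape[:3]], indexing="ij")
ijk = np.stack([i, j, k], axis=-1)             # (nx, ny, nz, 3) voxel indices
xyz = apply_affine(img.affine, ijk)            # world-space coordinates in mm
```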
We decided to store volumetric data in the NIfTI-1 format, because the volumes can store spatial orientation data (as a transform) and be loaded natively into Slicer. I'm not quite sure how to implement this into the `Database` class, though. In MATLAB, I created a `Volume` class that was essentially an `xarray.DataArray`: it has voxel data with units, named coordinate axes with values and units, a transform matrix, and a flexible attribute dictionary. I think we can probably use `xarray.DataArray` here as well, although `nibabel.Nifti1Image`, which is what `nibabel.nifti1.load` returns, may be suitable on its own (although I'm not as comfortable operating on the underlying data). I know vtk has a number of volumetric data formats which I'm guessing will be used in visualization as well, although again, ease of use for processing is also a factor. I'm probably going to try to load from .nii to an `xarray.DataArray`, but I welcome other suggestions.
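For reference, a hedged sketch of how that MATLAB `Volume` concept could map onto `xarray.DataArray`; all names, shapes, and attribute keys below are illustrative, not a proposed API:

```python
import numpy as np
import xarray as xr

nx, ny, nz = 64, 64, 32
spacing = (0.5, 0.5, 1.0)                      # mm, illustrative values
volume = xr.DataArray(
    np.zeros((nx, ny, nz)),                    # voxel data (e.g. intensities)
    dims=("x", "y", "z"),                      # named coordinate axes
    coords={
        "x": ("x", np.arange(nx) * spacing[0], {"units": "mm"}),
        "y": ("y", np.arange(ny) * spacing[1], {"units": "mm"}),
        "z": ("z", np.arange(nz) * spacing[2], {"units": "mm"}),
    },
    attrs={
        "units": "a.u.",                       # units of the voxel data itself
        "transform": np.eye(4),                # e.g. the NIfTI affine
        "name": "example volume",              # flexible metadata lives in attrs
    },
)
```

Per-axis units go on the coordinate attrs and everything else (transform, name, arbitrary metadata) fits in the top-level `attrs` dictionary, which is roughly the same shape as the MATLAB class described above.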