Open alihaydaroglu opened 4 months ago
This sounds like a good idea. I'm already working on this so I'll structure it as you suggested.
Proposal: an IO class that is responsible for all data-loading and preprocessing (hiding job-specific implementation details so the "user" only needs to think about how to load data when they set the parameters of their job object).
I created a skeleton (super barebones, just a sketch of what I envision) for this class in suite3d/io.py. Here's a copy for easy reading.
Let me know what you think, I'll get started implementing tomorrow.
class IO:
    """
    A class that handles all operations related to data loading and preprocessing for suite3d.

    This class makes data-loading and preprocessing easy. You just have to set the parameters
    of the job class that it takes as input, then this class will handle all the rest. The goal
    is to make the interface of data-loading as straightforward as possible, with minimal need
    to consider implementation details related to the particular requirements of the job.

    An instance of this class can be passed as an argument to other modules to make sure data
    loading is handled in a consistent and easy-to-use way wherever relevant.
    """

    def __init__(self, job):
        # By containing a reference to the job object, we can easily access all relevant
        # parameters related to data-loading without having to remember all the kwargs.
        self.job = job

    def _update_prms(self, **parameters):
        """
        A helper function that updates the job parameters with the provided parameters.
        """
        use_params = self.job.params.copy()  # get default parameters
        use_params.update(parameters)  # update with any provided in kwargs
        return use_params  # return the parameters intended for use right now

    def load_tifs(self, paths, planes, **parameters):
        """
        A central mechanism for loading tif files. This function is meant to be called
        every single time tifs are ever loaded, and it will handle all the necessary
        steps to ensure that the tifs are loaded correctly depending on the job's
        parameters (lbm or not), any "local" parameters (like the ones typically passed
        as kwargs to ``lbmio.load_and_stitch_tifs()``), and anything else that is needed.
        """
        # example use of _update_prms to get the parameters to use for this call
        params = self._update_prms(**parameters)

    def load_accessory_data(self, *args, **kwargs):
        """
        Sometimes information related to IO needs to be loaded that is pulled from extra
        outputs of the ``lbmio.load_and_stitch_tifs()`` function. This function (and maybe
        a few others) is/are meant to be called in place of those for efficient processing
        and clear responsibility.

        For example, ``init_pass.run_init_pass()`` calls the following line:
            __, xs = lbmio.load_and_stitch_full_tif_mp(..., get_roi_start_pix=True, ...)
        Which is used to retrieve just "xs"=the starting offsets for each imaging ROI.

        This can be replaced with a new "load accessory data" function that is called
        for this purpose -- depending on preference we can either have it run processing
        again or potentially cache certain results that are known to be needed during an
        initial call to ``load_tifs()``.
        """

    def all_supporting_functions(self, *args, **kwargs):
        """
        A collection of functions that are used to support the main functions of this class.
        These functions are meant to be called by the main functions, and should not be called
        directly by the user.

        Examples:
            load_and_stitch_full_tif_mp
            stitch_rois_fast
            etc...
        """
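To make the intended behavior of _update_prms concrete, here is a minimal sketch of how per-call overrides could be merged with the job defaults. DummyJob and the parameter names ("lbm", "n_proc") are hypothetical stand-ins for illustration, not the real suite3d job or its parameters; the only assumption about the job is that it exposes a dict called params.

```python
class DummyJob:
    """Hypothetical stand-in for the real job object; only `params` is assumed."""

    def __init__(self, params):
        self.params = params


class IO:
    def __init__(self, job):
        self.job = job

    def _update_prms(self, **parameters):
        use_params = self.job.params.copy()  # shallow copy of the job defaults
        use_params.update(parameters)        # per-call overrides take precedence
        return use_params


# Hypothetical parameter names, chosen only for this example.
job = DummyJob({"lbm": True, "n_proc": 8})
job_io = IO(job)

# Override one parameter for a single call; the job's own params stay untouched.
call_params = job_io._update_prms(n_proc=2)
```

One thing to keep in mind: dict.copy() is a shallow copy, so a nested mutable value (a list or dict stored inside params) would still be shared between the job and the per-call parameters.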
I really like this, I think this should satisfy what we need. The accessory data is a good point! There are also a few places where SI parameters like frame rate are accessed; those could be handled here too.
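For the caching option mentioned in the load_accessory_data docstring, one possible shape is sketched below. It assumes a loader callable that returns a (movie, xs) pair, which stands in for the real lbmio function; CachingIO, fake_loader, the file names, and the offset values are all made up for illustration. The cache is keyed on the requested paths and planes so offsets from one set of files are never returned for another.

```python
class CachingIO:
    """Hypothetical variant of the proposed IO class that caches ROI offsets."""

    def __init__(self, job, loader):
        self.job = job
        # loader(paths, planes) -> (movie, xs); a stand-in for the lbmio loader
        self._loader = loader
        self._roi_start_cache = {}

    @staticmethod
    def _key(paths, planes):
        # Tuples are hashable, so they can serve as dictionary keys.
        return (tuple(paths), tuple(planes))

    def load_tifs(self, paths, planes):
        movie, xs = self._loader(paths, planes)
        self._roi_start_cache[self._key(paths, planes)] = xs  # keep for later
        return movie

    def load_accessory_data(self, paths, planes):
        key = self._key(paths, planes)
        if key not in self._roi_start_cache:
            # Nothing cached for these inputs yet: run the loader once.
            self.load_tifs(paths, planes)
        return self._roi_start_cache[key]


calls = []


def fake_loader(paths, planes):
    # Records each call so we can check how often the loader actually runs.
    calls.append((tuple(paths), tuple(planes)))
    return "movie", [0, 144, 288]  # made-up offsets


cio = CachingIO(job=None, loader=fake_loader)
movie = cio.load_tifs(["a.tif", "b.tif"], [0, 1])
xs = cio.load_accessory_data(["a.tif", "b.tif"], [0, 1])  # served from the cache
```

The trade-off is memory versus recomputation: the offsets are small, so caching them is cheap, but anything movie-sized should probably not be held this way.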
No part of the code (that I can find) sets concat=False in the call to load_and_stitch_tifs.
Actually, there's only one place, here in init_pass, but it immediately concatenates them, and mov_lens, the one variable that depends on the non-concatenated output, is not used anywhere.
Let me know if I can just get rid of the concat kwarg, assuming it isn't important for something that I'm missing.
Yes, removing should be fine.
Instead of being tied to lbmio, we should have a standard function (e.g. something at the level of load_and_stitch_tifs) that can call arbitrary I/O functions based on a parameter. One possibility is to have a generic io.py file that has a function like load_and_stitch_tifs, which in turn has the option to call lbmio.py functions or, for example, standardio.py functions, which contain I/O for typical 2p microscopy.
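A rough sketch of what that parameter-based dispatch could look like is below. The "io_backend" parameter name, the registry dict, and the two placeholder loaders are assumptions made for the example; in practice the registry entries would point at the real functions in lbmio.py and a future standardio.py.

```python
def _load_lbm(paths, planes, **kwargs):
    # Placeholder for the LBM-specific loader (would live in lbmio.py).
    return ("lbm", list(paths), list(planes))


def _load_standard(paths, planes, **kwargs):
    # Placeholder for a standard 2p loader (would live in standardio.py).
    return ("standard", list(paths), list(planes))


# Registry mapping a parameter value to the function that handles it.
_BACKENDS = {
    "lbm": _load_lbm,
    "standard": _load_standard,
}


def load_and_stitch_tifs(paths, planes, params, **kwargs):
    """Generic entry point that picks a loader based on a job parameter."""
    backend = params.get("io_backend", "lbm")  # hypothetical parameter name
    try:
        loader = _BACKENDS[backend]
    except KeyError:
        raise ValueError(
            f"Unknown io_backend {backend!r}; expected one of {sorted(_BACKENDS)}"
        ) from None
    return loader(paths, planes, **kwargs)


result = load_and_stitch_tifs(["a.tif"], [0, 1], {"io_backend": "standard"})
```

With this layout, adding support for another microscope type means writing one loader and adding one registry entry, without touching the callers.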