AllenInstitute / ophys_etl_pipelines

Pipelines and modules for processing optical physiology data

Speed up mesoscope TIFF splitting #473

Open danielsf opened 2 years ago

danielsf commented 2 years ago

The new mesoscope TIFF splitting code merged in #464 (ticket #460) appears to be roughly a factor of 2 slower than the legacy code. This is probably not a show-stopper, but it would be nice to speed it up.

The likeliest way to speed it up would be to refactor the TimeSeriesSplitter class defined here

https://github.com/AllenInstitute/ophys_etl_pipelines/blob/main/src/ophys_etl/modules/mesoscope_splitting/tiff_splitter.py#L372

so that, instead of generating the HDF5 file for one (ROI, z) pair at a time, it loops through all of the (ROI, z) pairs associated with the ophys_session and writes all of the expected videos at once. The advantage is that, rather than reading through the time series TIFF file once per video (i.e. once per call to write_output_file), it would read through the TIFF file a single time, writing each frame to the appropriate video file.
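
A rough sketch of the single-pass idea, not the actual TimeSeriesSplitter API. It assumes the pages of the time series TIFF cycle through the (ROI, z) pairs in a fixed, known order (an assumption; the real page layout would have to come from the TIFF metadata). The names `split_in_one_pass`, `plane_order`, and `sinks` are hypothetical, and each "sink" is just a callable standing in for whatever appends a frame to that plane's HDF5 video.

```python
from typing import Callable, Dict, Iterable, Sequence, Tuple

# Hypothetical identifier for one output video: (roi_index, z).
PlaneKey = Tuple[int, float]


def split_in_one_pass(
    frames: Iterable[object],
    plane_order: Sequence[PlaneKey],
    sinks: Dict[PlaneKey, Callable[[object], None]],
) -> Dict[PlaneKey, int]:
    """Read `frames` exactly once, routing each frame to its plane's sink.

    Assumes (hypothetically) that frame i belongs to
    plane_order[i % len(plane_order)].
    Returns the number of frames written per (ROI, z) pair.
    """
    counts = {key: 0 for key in plane_order}
    n_planes = len(plane_order)
    for i, frame in enumerate(frames):
        key = plane_order[i % n_planes]
        sinks[key](frame)  # in real code: append the frame to this plane's video
        counts[key] += 1
    return counts


# Demo with in-memory lists standing in for the output video files.
plane_order = [(0, 10.0), (0, 20.0), (1, 10.0)]
videos = {key: [] for key in plane_order}
sinks = {key: videos[key].append for key in plane_order}

# A one-shot iterator stands in for a sequential read of the TIFF pages;
# it can only be consumed once, so the split must happen in a single pass.
tiff_pages = iter(f"frame{i}" for i in range(6))
frame_counts = split_in_one_pass(tiff_pages, plane_order, sinks)
```

With this structure the cost of reading the TIFF is paid once for the whole session rather than once per output video; the trade-off is that all output files have to be open for writing at the same time.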


danielsf commented 2 years ago

This might also be a good opportunity to experiment with the HDF5 compression options mentioned in #389.