Multi-Plane and Multi-Channel Support for ScanImage

catalystneuro / roiextractors

Python-based module for extracting from, converting between, and handling optical imaging data from several file formats. Inspired by SpikeInterface.

https://roiextractors.readthedocs.io/en/latest/index.html

BSD 3-Clause "New" or "Revised" License

12 stars 7 forks source link

Multi-Plane and Multi-Channel Support for ScanImage #241

Closed pauladkisson closed 1 year ago

pauladkisson commented 1 year ago

Fixes #240.

h-mayorquin commented 1 year ago

So for Bruker we have an extractor for the single channel - single plane, then another for single plane and then one that gets everything. But here, you are changing the single extractor that we have and adding al the functionality, right?

Is this because the tiff files sometimes come with only (time, rows columns) but other times come with everything included (time, row, columns, depth, channel)? Looking at your code it seems that this still only reads one file so probably yes?

pauladkisson commented 1 year ago

So for Bruker we have an extractor for the single channel - single plane, then another for single plane and then one that gets everything. But here, you are changing the single extractor that we have and adding al the functionality, right?

Yes, that is how it's currently implemented. But, I am still playing around with the structure to see how it can best represent the data (still in draft). I am still planning on incorporating the concept of streams, but also toying around with the idea of a base MultiPlaneImagingExtractor to unify the API and define some volumetric-specific fields.

Is this because the tiff files sometimes come with only (time, rows columns) but other times come with everything included (time, row, columns, depth, channel)? Looking at your code it seems that this still only reads one file so probably yes?

The .tiff files are always structured (time, rows, columns) BUT if they contain multi-plane and/or multi-channel data, they cycle round-robin in the 'time' dimension ex.

[channel_1_plane_1_frame_1, channel_2_plane_1_frame_1, ..., channel_C_plane_1_frame_1,
channel_1_plane_2_frame_1, ..., channel_C_plane_2_frame_1, ..., channel_C_plane_P_frame_1,
channel_1_plane_1_frame_2, ..., channel_C_plane_P_frame_F]

Whether or not they contain multiple planes or multiple channels is fully specified by the file metadata, so I figured I could read everything with one class.

h-mayorquin commented 1 year ago

Thanks for your explanation, @pauladkisson.

pauladkisson commented 1 year ago

@CodyCBakerPhD, This is ready for some feedback.

I plan on moving the MultiPlaneImagingExtractor to a different file and a different PR, but I included it to give a sense of how I see the ScanImageExtractors working with the inheritance structure.

I also played around with incorporating support for the old ScanImageTiffImagingExtractor, but the metadata for the file we have to test is completely different from the current metadata (the testing file is pretty old). So, I think the cleanest way to deal with it is just keep it around as a legacy version and develop new tests for the new ScanImage stuff.

Last but not least, I still need to set up a testing structure, but I figured it would be good to get some feedback first.

CodyCBakerPhD commented 1 year ago

First pass looks good!

I'll dig in deeper over next few days while you add tests

I also played around with incorporating support for the old ScanImageTiffImagingExtractor, but the metadata for the file we have to test is completely different from the current metadata (the testing file is pretty old). So, I think the cleanest way to deal with it is just keep it around as a legacy version and develop new tests for the new ScanImage stuff.

Sounds good for retiring the class itself - can you add a DeprecationWarning then with a comment #TODO: remove on or after <three months from now> for a smooth cycle on that?

That said, supporting multiple versions of metadata is actually something we should generally strive to support for any example files we have on the testing repo - you never know what version people are using on older (and still valuable) data or rigs that just haven't been updated. Perhaps a routing function to multiple metadata extraction methods could lessen the complexity of that step? Granted, ROIExtractors doesn't concern itself too much with metadata, more of a problem for NeuroConv side

pauladkisson commented 1 year ago

Sounds good for retiring the class itself - can you add a DeprecationWarning then with a comment #TODO: remove on or after <three months from now> for a smooth cycle on that?

Done!

That said, supporting multiple versions of metadata is actually something we should generally strive to support for any example files we have on the testing repo - you never know what version people are using on older (and still valuable) data or rigs that just haven't been updated. Perhaps a routing function to multiple metadata extraction methods could lessen the complexity of that step?

I also added a legacy metadata parsing function for the old metadata but it is less comprehensive and isn't actually used by the legacy version of ScanImageTiffImagingExtractor

pauladkisson commented 1 year ago

Tests are failing until the new scanimage test file gets added to the gin repo -- See Issue # 16

CodyCBakerPhD commented 1 year ago

@pauladkisson There are some other volumetric/multi-channel example data available on the GIN - would it be possible to use any of those?

https://gin.g-node.org/CatalystNeuro/ophys_testing_data/src/main/imaging_datasets/ScanImage

CodyCBakerPhD commented 1 year ago

I also added a legacy metadata parsing function for the old metadata but it is less comprehensive and isn't actually used by the legacy version of ScanImageTiffImagingExtractor

Right, I suppose that would be more of a thing for the NeuroConv side

pauladkisson commented 1 year ago

There are some other volumetric/multi-channel example data available on the GIN - would it be possible to use any of those?

Yes and no. I am using those for some of the tests. But they don't have enough frames (only 1 or 2/plane) to really check get_frames for example. Also, they are all 1-channel. So, a lot of the tests that I wrote only use the adesnik data bc it's more suitable for checking multiple conditions.

pauladkisson commented 1 year ago

@CodyCBakerPhD See gin PR #18

codecov[bot] commented 1 year ago

Codecov Report

Merging #241 (1289b70) into volumetric (2dbf0f8) will increase coverage by 0.85%. Report is 18 commits behind head on volumetric. The diff coverage is 87.80%.

@@              Coverage Diff               @@
##           volumetric     #241      +/-   ##
==============================================
+ Coverage       76.98%   77.83%   +0.85%     
==============================================
  Files              37       38       +1     
  Lines            2724     2964     +240     
==============================================
+ Hits             2097     2307     +210     
- Misses            627      657      +30

Flag	Coverage Δ
unittests	`77.83% <87.80%> (+0.85%)`	:arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Files	Coverage Δ
src/roiextractors/extractorlist.py	`100.00% <ø> (ø)`
...ctors/extractors/tiffimagingextractors/__init__.py	`100.00% <100.00%> (ø)`
...ctors/tiffimagingextractors/scanimagetiff_utils.py	`98.21% <98.21%> (ø)`
...imagingextractors/scanimagetiffimagingextractor.py	`87.60% <84.65%> (-10.68%)`	:arrow_down:

pauladkisson commented 1 year ago

Continued on #253.