catalystneuro / roiextractors

Python-based module for extracting from, converting between, and handling optical imaging data from several file formats. Inspired by SpikeInterface.
https://roiextractors.readthedocs.io/en/latest/index.html
BSD 3-Clause "New" or "Revised" License
12 stars 7 forks source link

Unify Volumetric/MultiPlane Imaging API #245

Closed pauladkisson closed 1 year ago

pauladkisson commented 1 year ago

Current Behavior

Right now, the only imaging extractor that supports 3D imaging is BrukerTiff. It does so by taking a list of SinglePlaneImagingExtractors and combines them a la MultiImagingExtractor. Note that each SinglePlaneImagingExtractor is itself a MultiImagingExtractor that combines multiple 'chunked' files corresponding to a single depth plane. Each SinglePlaneImagingExtractor instance only supports 1 channel and 1 plane, which are passed via stream_name during the __init__. Each MultiPlaneImagingExtractor instance supports 1 channel (but all planes) which is passed via stream_name during the __init__.

Issues

Proposed Solutions

Do you have any interest in helping implement the feature?

Yes.

Code of Conduct

pauladkisson commented 1 year ago

@CodyCBakerPhD @h-mayorquin @weiglszonja @alessandratrapani Please let me know what you all think!

CodyCBakerPhD commented 1 year ago

@pauladkisson Thanks for writing this all up, I'll look deeper and leave more comments over next couple days

BrukerTiff specifies channel during the init but all other IE's specify channel during the call to get_video. Not sure which is better, but we should be consistent.

Strongly against having it in the get_video call because it leads to the 4D ambiguity; much clearer to have separate objects for each channel if you want to do anything with them side-by-side, which makes it a field specified at __init__

So totally on board with that solution

This will require refactoring of all the existing IE's

Yes and no - their code will have to be changed, but it was also code that was never tested or used perhaps outside of the .sbx case? Which we've also not seen used in quite a while

It also simplifies get_video to only take start_frame and end_frame, which I think is a good thing.

👍

'streams' are a low-cohesion concept to represent optical channels and depth planes.

Yeah, been talking with @h-mayorquin about this lately. It's purposely ambiguous to also allow any other type of future formatting distinction we might need to add while remainig universal across the API

It's also flatter and simpler, though what it really does is shift the complexity of structure to the string itself, but as long as we have a convenient fetcher get_stream_names that is less complicated than breaking the approach into separate calls per plane/channel option.

Multiple optic channels found? It shows up as a series of structured strings. Multiple planes found? They show up as another part of the string structure. Want to initialize an extractor? Just specify a single string that matches one of the selections.

Specify channel and plane separately by index (or ID?)

Strongly against the ID/index route. Neo supports both at the same time and the ID is the annoying one, the bane of many a stream index issue. Strongly recommend using a human-readable identifier. Less strongly against splitting stream_name into channel_name + plane_name if others think that is worth it compared to my 'single argument is simpler' point above

h-mayorquin commented 1 year ago

I agree with @pauladkisson that we should break streams. As a note, Streams are not about different types of preprocessing. It is basically stuff that can be processed together in the context of spikeitnerface and raw.io in neo (same sampling_rate, start and duration). The concept is not documented but you can find some details here.

How to break depth, channel and whatever else might come?

As for this:

Combining multiple SinglePlaneImagingExtractors with MultiImagingExtractor is a bit hacky and confusing. MultiImagingExtractor was designed to concatenate ImagingExtractors along the frames axis, but here we want to combine multiple IEs along a separate depth axis.

This is not the original intention: https://github.com/catalystneuro/roiextractors/issues/11 But I think it was changed here: https://github.com/catalystneuro/roiextractors/pull/127

I agree that we are conflating a tool to concatenate data extractors in time with some internal machinery to assemble extractors from channels, planes or sub-extractors in time. We came to need the latter because some formats provide multiple files and we judged (corrrectly?) that composing in this way is easier than ad-hoc parsing. I think it makes sense separting assembling in time, channels and depth.

The analgous in spikeinterface are concatenation and channel agregation.

Using MultiImagingExtractor for this purpose leads to wonky stuff like start_frames = [0, 0, ..., 0].

Can you point out to where does this show up and also add other wonky stuff / difficulties? I feel this point should be more descriptive of the existing problems.

I will add something about the third point later.

weiglszonja commented 1 year ago

@h-mayorquin

Using MultiImagingExtractor for this purpose leads to wonky stuff like start_frames = [0, 0, ..., 0].

Can you point out to where does this show up and also add other wonky stuff / difficulties? I feel this point should be more descriptive of the existing problems.

This is from the BrukerTiffMultiPlaneImagingExtractor which uses MultiImagingExtractor to concat multiple BrukerTiffSinglePlaneImagingExtractor over depth

https://github.com/catalystneuro/roiextractors/blob/31bb80ae4eee9be0f37f1f6a14b166582afdbac7/src/roiextractors/extractors/tiffimagingextractors/brukertiffimagingextractor.py#L218-L219

h-mayorquin commented 1 year ago

Thanks for the pointer @weiglszonja. I still would like to see more confusions that arise from this use. This is the only case that we have for Paul's second point.

pauladkisson commented 1 year ago

Notes from Barcelona 2023: