AllenCellModeling / aicsimageio

Image Reading, Metadata Conversion, and Image Writing for Microscopy Images in Python
https://allencellmodeling.github.io/aicsimageio

File Format Questions + General 4.1 Planning #175

Closed evamaxfield closed 1 year ago

evamaxfield commented 3 years ago

See #174 for more info.

This should be like scene 1 has shape (1, 2, 3, 4, 5) and scene 2 has shape (2, 3, 4, 5, 6)

evamaxfield commented 3 years ago

Hey @sebi06 hoping I can pick your thoughts / knowledge of CZI files.

Is it possible to have a multi-scene pyramid CZI file? And if so what dimensions names are attached?

My mental model of a pyramid file was that each pyramid level basically acted as a different scene but that may be incorrect.

evamaxfield commented 3 years ago

Can bfconvert the existing RGB CZI from aicspylibczi test resources

zeissmicroscopy commented 3 years ago

Hi @JacksonMaxfield

I know that the scene concept is not really "super-easy" .... :-)

The pyramid concept is something else for us. Let's focus on the 1st scene (S=0) from the example above:

Sidenote: When opening a CZI via BioFormats, BF (because of the limitations of OME-TIFF) somewhat "interprets" the pyramid levels of an individual scene as "BF Series"

Regarding the bfconvert question, I am not sure. But BF can read RGB CZIs with scenes for sure. But usually it gives you a 3CH image afterwards. So I would assume, yes this works.

What type of images with which dimensions do you need? I can get you basically everything with my "simulated microscope" or dig a bit deeper into our image storage. I am happy to help here.

Feel free to contact me or even ask for a call, if you need info about the CZI (sebastian.rhode@zeiss.com)

sbesson commented 3 years ago

My mental model of a pyramid file was that each pyramid level basically acted as a different scene but that may be incorrect.

Seconding @zeissmicroscopy's https://github.com/AllenCellModeling/aicsimageio/issues/175#issuecomment-765418806, I think the concept of scenes/positions should be kept orthogonal to the concept of resolutions/pyramidal levels.

Using another public example (and another file format) from the Whole Slide Imaging domain, this public Leica SCN file from the OpenSlide project contains multiple whole slide images acquired at separate positions where each position is itself a pyramid.

Sidenote: When opening a CZI via BioFormats, BF (because of the limitations of OME-TIFF) somewhat "interprets" the pyramid levels of an individual scene as "BF Series"

I think this will happen when the Bio-Formats API is set to flatten the resolutions. This was introduced during the early work on the multi-resolution and is still the default API mostly because changing it requires a breaking version increment. Calling setFlattenedResolutions(false) should allow the reader to do the right thing i.e. map each scene as a Bio-Formats series and each pyramidal level as a resolution.
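To make the difference concrete, here is a toy Python illustration (not Bio-Formats code) of how a flattened series index relates to a scene and a pyramid level. The helper name `flattened_series` and the example level counts are made up for illustration, and the sketch assumes that the levels of one scene are listed consecutively when flattening is on.

```python
from typing import List, Tuple


def flattened_series(levels_per_scene: List[int]) -> List[Tuple[int, int]]:
    """Return the (scene, level) pair behind each flattened series index.

    Assumption: all levels of scene 0 come first, then all levels of scene 1, etc.
    """
    return [
        (scene, level)
        for scene, n_levels in enumerate(levels_per_scene)
        for level in range(n_levels)
    ]


# Hypothetical file: scene 0 has 3 pyramid levels, scene 1 has 2.
levels_per_scene = [3, 2]
flat = flattened_series(levels_per_scene)

print(len(flat))              # series count when resolutions are flattened
print(flat[3])                # (scene, level) behind flattened series 3
print(len(levels_per_scene))  # series count when resolutions are not flattened
```

With flattening off, the series index is simply the scene index and the level is selected separately, which is the orthogonal layout discussed above.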

Is it possible to have a multi-scene pyramid CZI file? And if so what dimensions names are attached?

I have been quickly browsing through the public data available in IDR as several lightsheet studies have been submitted as Zeiss CZI files e.g. idr0038 or idr0077. However it looks like none of these are both multi-position and pyramidal CZI files.

@zeissmicroscopy can I use the occasion to clarify a scenario in the use case of lightsheet data? In the example of https://idr.openmicroscopy.org/webclient/?show=image-9836847, the CZI file contains acquisitions at two different orientations of the sample. So these images are related to each other via a 3D transformation which includes a rotation. Would these also be included in the scenes in the CZI vocabulary or is this another concept?

evamaxfield commented 3 years ago

Thanks for the replies!

I think the concept of scenes/positions should be kept orthogonal to the concept of resolutions/pyramidal levels.

After this discussion, I totally agree. Now it will just be about how to expose these to the readers and writers of aicsimageio.... We have already planned to talk about it in our next dev meeting :joy: My immediate thought would be that we mirror the planned 4.0 API and make level stateful to the image container, i.e.:

img.levels  # returns tuple of level ids specific to active scene
img.set_level("2x")  # update current level for active scene
img.current_level  # returns current level for active scene

Defaulting to level 0?

img.set_scene("Image:0")
img.levels  # levels for scene 0
img.set_scene("Image:1")
img.levels  # levels for scene 1

Fortunately, considering a proposal like this isn't a breaking change we could simply introduce this function in 4.1 instead of holding off on 4.0 even longer.
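A minimal runnable sketch of how such a stateful level could behave, for discussion only. `PyramidImage` is a hypothetical stand-in, not the aicsimageio class, the level ids ("1x", "2x", "4x") are invented, and resetting to the first level on a scene change is an assumption rather than a decided behavior.

```python
from typing import Dict, Tuple


class PyramidImage:
    """Toy container with stateful scene and level (hypothetical, not aicsimageio)."""

    def __init__(self, levels_by_scene: Dict[str, Tuple[str, ...]]) -> None:
        if not levels_by_scene:
            raise ValueError("at least one scene is required")
        self._levels_by_scene = levels_by_scene
        self._scene = next(iter(levels_by_scene))  # first scene is active by default
        self._level = self.levels[0]               # first level is active by default

    @property
    def scenes(self) -> Tuple[str, ...]:
        return tuple(self._levels_by_scene)

    @property
    def current_scene(self) -> str:
        return self._scene

    @property
    def levels(self) -> Tuple[str, ...]:
        """Level ids of the active scene only."""
        return self._levels_by_scene[self._scene]

    @property
    def current_level(self) -> str:
        return self._level

    def set_scene(self, scene_id: str) -> None:
        if scene_id not in self._levels_by_scene:
            raise IndexError(f"unknown scene: {scene_id!r}")
        self._scene = scene_id
        # Assumption: a scene change falls back to that scene's first level.
        self._level = self.levels[0]

    def set_level(self, level_id: str) -> None:
        if level_id not in self.levels:
            raise IndexError(f"scene {self._scene!r} has no level {level_id!r}")
        self._level = level_id


img = PyramidImage({"Image:0": ("1x", "2x", "4x"), "Image:1": ("1x", "2x")})
img.set_level("2x")       # valid for Image:0
img.set_scene("Image:1")  # levels now describe Image:1
print(img.levels, img.current_level)
```

Keeping the levels scoped to the active scene means two scenes can have a different number of pyramid levels without the API changing shape.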

Calling setFlattenedResolutions(false) should allow the reader to do the right thing i.e. map each scene as a Bio-Formats series and each pyramidal level as a resolution.

Do you happen to know if this command is available in the CLI bfconvert tool? My Java is a bit rusty...

the CZI file contains acquisitions at two different orientations of the sample. So these images are related to each other via a 3D transformation which includes a rotation.

I will let @zeissmicroscopy answer but my first thought would be that that is the "Rotation" dimension that we have encountered in aicspylibczi

sbesson commented 3 years ago

Hi @JacksonMaxfield

Fortunately, considering a proposal like this isn't a breaking change we could simply introduce this function in 4.1 instead of holding off on 4.0 even longer.

The resolution API as you drafted it above largely mirrors the way it is implemented in Bio-Formats - see IPyramidHandler so it makes sense to me. Do you have a rough idea for the ideal 4.0.0 timeline?

Do you happen to know if this command is available in the CLI bfconvert tool? My Java is a bit rusty...

Yes bfconvert -noflat should ensure the resolution levels are not flattened during the conversion.

evamaxfield commented 3 years ago

The resolution API as you drafted it above largely mirrors the way it is implemented in Bio-Formats - see IPyramidHandler so it makes sense to me.

Well that is wonderful to see. Will work with @AllenCellModeling/aicsimageio-core-devs to see if we need to do anything more than that (with Reader changes accordingly).

Yes bfconvert -noflat should ensure the resolution levels are not flattened during the conversion.

Thank you so much! Will give this a quick go over the weekend just to get a feel for the data.

Do you have a rough idea for the ideal 4.0.0 timeline?

Oof. Well, I was originally planning and still aiming for the end of this month, ~Feb 2 would be ideal imo but likely looking like mid Feb. I really don't know as all of this is in my free time since I no longer work at AICS :sweat_smile:

A long and probably too-much-info rundown of what's left on the todo list:

Cutting dev / beta releases:

This is also just a call to anyone reading this issue that if you want to see aicsimageio==4.0.0 faster, any help on the above list would be greatly appreciated. The active branch is feature/update-to-4.0-api :pray: .

nimne commented 3 years ago

I'll push a 0.4.0 of readlif this weekend (next day or two) - finishing up a few big projects today!

sebi06 commented 3 years ago

Hi all,

to my knowledge we use the following inside our CZI format:

Information CZI Dimension Characters:

- '0': 'Sample'  # e.g. RGBA
- 'X': 'Width'
- 'Y': 'Height'
- 'C': 'Channel'
- 'Z': 'Slice'
- 'T': 'Time'
- 'R': 'Rotation'
- 'S': 'Scene' # contiguous regions of interest in a mosaic image (TileRegion)
- 'I': 'Illumination' # direction
- 'B': 'Block' # acquisition block from Multiblock experiments
- 'M': 'Mosaic' # index of single tiles from a TileRegion
- 'H': 'Phase' # e.g. Airy detector fibers
- 'V': 'View' # e.g. for SPIM

And this is orthogonal to the actual pyramid levels in case of larger TileRegions. I am just collecting some test CZI images. Would it be OK to share them with you via Dropbox?
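For anyone who wants the dimension characters listed above in code form, here is a small sketch. The mapping is copied from the list; the helper `describe_dims` and the example dimension string are made up for illustration.

```python
from typing import Dict, List

# Dimension characters as listed in the comment above.
CZI_DIMENSIONS: Dict[str, str] = {
    "0": "Sample",        # e.g. RGBA
    "X": "Width",
    "Y": "Height",
    "C": "Channel",
    "Z": "Slice",
    "T": "Time",
    "R": "Rotation",
    "S": "Scene",         # contiguous regions of interest in a mosaic image
    "I": "Illumination",  # direction
    "B": "Block",         # acquisition block from Multiblock experiments
    "M": "Mosaic",        # index of single tiles from a TileRegion
    "H": "Phase",         # e.g. Airy detector fibers
    "V": "View",          # e.g. for SPIM
}


def describe_dims(dims: str) -> List[str]:
    """Translate a dimension string such as 'STCZYX' into readable names."""
    unknown = [char for char in dims if char not in CZI_DIMENSIONS]
    if unknown:
        raise KeyError(f"unknown dimension characters: {unknown}")
    return [CZI_DIMENSIONS[char] for char in dims]


print(describe_dims("STCZYX"))
```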

evamaxfield commented 3 years ago

I am just collecting some test CZI images. Would it be OK to share them with you via Dropbox?

Totally fine and would be much appreciated!

nimne commented 3 years ago

@JacksonMaxfield readlif 0.4.0 with mosaic support is up on pypi! As far as I can tell, it works as expected. There are no image stitching utilities, but data access works.

evamaxfield commented 3 years ago

Thanks @nimne! I will get started on LifReader as soon as possible. I think because we at this point have (or plan to support) multiple readers that can return mosaic images, I may just write a utility function to stitch tiles together in this repo / pull it from aicspylibczi so no worries there!

Thanks again. Really appreciate the work you put in :slightly_smiling_face:

sebi06 commented 3 years ago

Hi @JacksonMaxfield

here comes the link to some CZI files:

https://www.dropbox.com/sh/1sbhwrnwsh6umf6/AADvad9sxC1luU70aC3t0a6na?dl=0

Please let me know once you have downloaded what you need.

evamaxfield commented 3 years ago

Thanks @sebi06 I got the files all downloaded!

sebi06 commented 3 years ago

Cool. I also recommend getting the latest ZEN 3.3 (free download) to be able to see the CZI in their natural environment... 😉

evamaxfield commented 3 years ago

A couple of updates and a couple of questions for everyone:

Updates

We have tentatively set a release schedule for 4.0! :tada: Full details can be found in the meeting minutes of #188, but in short, we aim to have 4.0 released ~ Feb 16.

We have agreed that handling pyramid levels makes sense and we want to do it with the API described above. Unfortunately, we will probably push it back to 4.1. So in 4.0 you would likely be able to read pyramid files but the data will just be level=0.

Questions

Hey @sebi06 we got some more questions (I'm so sorry)...

For our 4.0 API, one of the goals is to stitch the mosaic image back together prior to returning the data (See #163, #174). For large files, this is probably really bad, but this is where dask could come in handy with each tile of the mosaic being a single chunk of the array. Now the question: For mosaic images, we know that it is possible to have Scenes and mosaic tiles in the same image, i.e.: S: 3, M: 40, Y: 200, X: 200 -> "Three Scenes, each composed of a mosaic image of 40, 200x200 YX Tiles". But is it common to split mosaic images across Scenes? Or does our planned 4.0 API of scene being stateful still work for this behavior? i.e.:

img = AICSImage("my-mosaic.czi")
img.set_scene("Scene:0")  # this the default but being explicit for the example
img.dask_data  # returns a dask array of the fully reconstructed / stitched mosaic image from scene 0
img.set_scene("Scene:1")  # change scene
img.dask_data  # returns a dask array of the fully reconstructed / stitched mosaic image from scene 1

Expanding, would anyone ever want to create a mosaic using data from both / all Scenes at once?
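To illustrate what stitching one scene means, here is a dependency-free sketch that pastes tiles onto a single canvas. It uses plain nested lists instead of dask arrays to stay runnable anywhere; the function name `stitch_tiles`, the `(y, x)` origin convention, and the 2x2 layout are all assumptions for the example and not how aicsimageio or aicspylibczi do it.

```python
from typing import List, Sequence, Tuple

Tile = List[List[int]]


def stitch_tiles(
    tiles: Sequence[Tile],
    origins: Sequence[Tuple[int, int]],
    fill: int = 0,
) -> List[List[int]]:
    """Paste non-empty YX tiles onto one canvas.

    origins holds the (y, x) position of each tile's top-left pixel.
    Where tiles overlap, the later tile wins.
    """
    if len(tiles) != len(origins):
        raise ValueError("need exactly one origin per tile")
    height = max(y0 + len(tile) for tile, (y0, _) in zip(tiles, origins))
    width = max(x0 + len(tile[0]) for tile, (_, x0) in zip(tiles, origins))
    canvas = [[fill] * width for _ in range(height)]
    for tile, (y0, x0) in zip(tiles, origins):
        for dy, row in enumerate(tile):
            canvas[y0 + dy][x0:x0 + len(row)] = row
    return canvas


def constant_tile(value: int, size: int = 2) -> Tile:
    """Build a size x size tile filled with one value."""
    return [[value] * size for _ in range(size)]


# Hypothetical scene: four 2x2 tiles arranged in a 2x2 grid.
tiles = [constant_tile(v) for v in (1, 2, 3, 4)]
origins = [(0, 0), (0, 2), (2, 0), (2, 2)]

for row in stitch_tiles(tiles, origins):
    print(row)
```

With dask, each pasted tile would instead stay a lazily loaded chunk, so only the tiles that are actually viewed get read.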

sebi06 commented 3 years ago

Hi @JacksonMaxfield ,

no reason to be sorry if you have questions. It is part of my job to answer those :-). I still think I did not explain the difference between scenes and tiles well enough (but maybe I am wrong).

"... we know that it is possible to have Scenes and mosaic tiles in the same image ..."

If one creates an image acquisition with a single position (1x1 tile) or any TileRegion, there will automatically be one scene. There is no way not to acquire a scene as soon as you have tiles etc. And yes, one can have multiple scenes of arbitrary size inside one CZI. This is actually quite a common use case, especially for slides, where people put several tissue sections on one slide (TileRegions of arbitrary size) and image all of them in one run. This will yield a CZI with several scenes of different sizes.

I like the idea of using Dask for the tiles a lot. But maybe it is better to use

img.set_scene("0") or img.set_scene(0) because it might be easier to loop over those scenes without the need for a "complicated" string as an identifier?

"Expanding, would anyone ever want to create a mosaic using data from both / all Scenes at once?"

I would argue that this is a very unlikely use case and therefore I would drop this for now because the added value for a user is not big enough and it makes things complicated. In our ZEN software we have a slider for scenes, which allows you to display individual scenes and cycle through them. This slider can be deactivated and then ZEN shows all scenes in one image (one CZI) with the correct scale. But this view is just useful to get an overview and, depending on the spatial dimensions, not even useful. It is mainly useful for tissue slides, where there is a lot of "image" and little "empty space" between scenes.

But for well plates, for example, this view is hardly useful. Imagine one acquires a single image in every well. In this case it might look like the images below.

Therefore I think it is fine to not implement something that allows to combine all scenes at once in one array, especially not for the 1st release. Let's see what the feedback is, but also in my team we try to follow the rule of not implementing things because they might be needed ... :-) (unless it is crystal clear, that this will be required)

image

image

evamaxfield commented 3 years ago

I would argue that this is a very unlikely use case and therefore I would drop this for now because the added value for a user is not big enough and it makes things complicated. In our ZEN software we have a slider for scenes, which allows you to display individual scenes and cycle through them.

This was exactly the answer I was hoping for. I think this basically sets us up nicely for mosaic support then :smile:

Therefore I think it is fine to not implement something that allows to combine all scenes at once in one array, especially not for the 1st release.

This is basically our idea too. If possible, we would like to generalize our stitching function but at the bare minimum just stitch one scene as that has consistent shape and dtype at least!


Answering your comment:

I like the idea of using Dask for the tiles a lot. But maybe it is better to use

img.set_scene("0") or img.set_scene(0) because it might be easier to loop over those scenes without the need for a "complicated" string as an identifier?

Ahhh don't worry. We thought about that as well. We think going off of metadata named values for scenes makes the most sense because it allows for named specificity and not just integer indices. But, to solve this you could do something like:

for scene_id in img.scenes:
    img.set_scene(scene_id)
    img.dask_data  # returns mosaic for this scene

sebi06 commented 3 years ago

Perfect. This looks cool to me. Once all this works we can collect feedback and see what makes sense. Obviously I already have a lot of ideas about what else would be useful. And in order to set the focus I think there are two clear use cases that can provide guidance.

  1. The user has a large multidimensional image and tools like Napari etc. need a way to "lazy" load the chunk of data that is needed to display and browse the data.
  2. The user wants to cycle or loop through various dimensions to read a "subset" of the (large) image in order to analyze / process it

One thing we should at least think of is the option to read a specific ROI / Tile of a dedicated scene, because one scene itself might be still very big.
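As a rough sketch of what reading an ROI could involve, the snippet below works out which mosaic tiles of a scene overlap a requested region, so that only those tiles would need to be loaded. `Box`, `tiles_for_roi`, and the 2 x 3 grid of 200 x 200 tiles are hypothetical and only illustrate the idea.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle in pixel coordinates of one scene."""

    y: int
    x: int
    height: int
    width: int


def intersects(a: Box, b: Box) -> bool:
    """True if the two boxes share at least one pixel."""
    return (
        a.y < b.y + b.height
        and b.y < a.y + a.height
        and a.x < b.x + b.width
        and b.x < a.x + a.width
    )


def tiles_for_roi(tile_boxes: List[Box], roi: Box) -> List[int]:
    """Return the M indices of all tiles that overlap the ROI."""
    return [m for m, box in enumerate(tile_boxes) if intersects(box, roi)]


# Hypothetical scene: 2 rows x 3 columns of 200 x 200 tiles without overlap.
tile_boxes = [Box(r * 200, c * 200, 200, 200) for r in range(2) for c in range(3)]
roi = Box(y=150, x=350, height=100, width=100)

print(tiles_for_roi(tile_boxes, roi))
```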

evamaxfield commented 3 years ago

One thing we should at least think of is the option to read a specific ROI / Tile of a dedicated scene, because one scene itself might be still very big.

Still possible. I didn't give you the full spec that we had thought of. The Reader classes, CziReader for example, would still have an M dimension. On the AICSImage object it would be merged. So if you really wanted just individual tiles you could simply go to the Reader. We think that should work but are not entirely sure how optimized we can make it speed / reading time wise. So will keep you updated.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.