hms-dbmi / viv

Library for multiscale visualization of high-resolution multiplexed bioimaging data on the web. Directly renders Zarr and OME-TIFF.
http://avivator.gehlenborglab.org
MIT License

OME-Zarr Support #290

Closed ilan-gold closed 11 months ago

ilan-gold commented 3 years ago

@manzt Perhaps you could add some comments here, but it seems like we lack a utility function for making a zarr loader for OME-Zarr. If that is because the spec is not stable, I understand; this issue can stay open until it is, and we can close it once the utility is added. Otherwise, I think it would be good to support in Avivator/Viv as well as vizarr.

manzt commented 3 years ago

Couple of initial thoughts. There is not a 1:1 mapping of OME-Zarr to loader. OME-Zarr is a spec that describes how to organize many (zarr) arrays within a hierarchy with associated metadata.

The nodes of the hierarchy are all dense array data, but different groups of the hierarchy have different semantics based on the metadata that is present. For example, an HCS plate is a zarr group containing many multiscale arrays (each with nlevels arrays), so does opening an HCS group mean creating n x m multiscale loaders and initializing n x m x nlevels ZarrArrays? It depends on the application, not viv. Whereas, if the group is just a single multiscale image, then you'd create one multiscale loader (open nlevels arrays).

We would need some type of reducer to inspect the zarr group and determine what part of the OME-Zarr hierarchy it is. The issue is that metadata between formats (OME-TIFF/zarr) is so different, and ultimately doesn't even matter to the loaders. It matters to the application that manages the state for viv. I think before we can move forward, we should try to decouple metadata from the loaders. Loaders (IMO) should just be some in-memory representation of nD arrays (including multiscale).
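As a rough illustration of the kind of reducer described above, here is a minimal TypeScript sketch that inspects a group's parsed .zattrs and guesses which part of the hierarchy it is. It assumes the top-level metadata keys `plate`, `well`, and `multiscales` from the OME-Zarr spec; the names `OmeZarrNodeKind`, `classifyGroup`, and `arraysToOpen` are hypothetical and not part of Viv.

```typescript
// Hypothetical names throughout; this is not Viv's API.
type OmeZarrNodeKind = "plate" | "well" | "multiscale-image" | "unknown";

// Parsed JSON contents of a zarr group's .zattrs file.
type GroupAttrs = Record<string, unknown>;

// Guess which part of the OME-Zarr hierarchy a group represents, based only
// on which top-level metadata keys are present (an assumption of this sketch).
function classifyGroup(attrs: GroupAttrs): OmeZarrNodeKind {
  if ("plate" in attrs) return "plate";
  if ("well" in attrs) return "well";
  if (Array.isArray(attrs["multiscales"])) return "multiscale-image";
  return "unknown";
}

// Number of zarr arrays an application would open if it eagerly created
// one multiscale loader per image in an n x m plate.
function arraysToOpen(rows: number, columns: number, nlevels: number): number {
  return rows * columns * nlevels;
}
```

Whether an application opens all of those arrays eagerly or lazily is exactly the kind of decision that stays outside the loader in this sketch.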

ilan-gold commented 3 years ago

Hmm ok, it sounds then like this issue is pretty deeply linked to the coming development work from https://github.com/hms-dbmi/viv/issues/287.

We'll have to do some planning once modelMatrix is incorporated into deck.gl's TileLayer (soon, hopefully!)

manzt commented 3 years ago

TL;DR: the more we add to viv, the more we are responsible for maintaining for library users. Opening arrays and determining initial rendering data is very application-dependent (see vitessce: raster.json, avivator: OME-XML, and vizarr: .zattrs for different groups).

I'd rather make it super straightforward how to create loaders and which layers use those loaders.

ilan-gold commented 3 years ago

I don't think I was clear. I was relating this specifically to the push to support HCSLayer, which is OME-Zarr but is also a lot more than just that. It seems like creating an HCSLayer would then necessitate some sort of OME-Zarr loader utility in Viv, since the HCS Zarr storage format implements the OME-Zarr spec. Or am I misunderstanding? We don't have to make a loader for OME-Zarr in general, but I think we need to reconcile these two things, since we won't have a great way to test HCSLayer without some sort of loader for it.

joshmoore commented 3 years ago

> https://github.com/hms-dbmi/viv/issues/290#issuecomment-718886080
> OME-Zarr is a spec that describes how to organize many (zarr) arrays within a hierarchy with associated metadata.

Is the same not true of OME-XML/OME-TIFF modulo "list of arrays" rather than "hierarchy of arrays"?

ilan-gold commented 3 years ago

@joshmoore What is the "list of arrays" in the context of OME-TIFF? I think, to Trevor's point, the flexibility/ease-of-use of zarr has exposed more complicated decisions to developers, as he mentioned re: multiscale groups, but I was under the impression that the same thing was in theory possible with a list of OME-TIFF files (i.e., Zarr is not necessarily more flexible than TIFF under the current OME model, just easier to use). Perhaps I am missing the mark of the discussion here, though, in which case I can step back :) Still excited about full OME support in Viv and happy to do what I can to help!

joshmoore commented 3 years ago

:) Excited as well, hopefully I'm not missing the point above. From my perspective, there's not too much new in the OME-Zarr spec beyond OME-TIFF. Each OME-TIFF can have an arbitrary number of images (equal to the "series count") and they can have metadata that forms them into more complex structures, e.g. an HCS plate. So I agree with your comments and that is what I was trying to get across, that the two are quite similar. @manzt additionally said that the issue is that the formats were so different, which may be the real breaking point.

ilan-gold commented 3 years ago

Gotcha @joshmoore - that's part of the reason I opened #336, to start to bring some harmony and cross-format structure to the loading process.

manzt commented 3 years ago

> Is the same not true of OME-XML/OME-TIFF modulo "list of arrays" rather than "hierarchy of arrays"?

Totally true. I think the primary tension/issue is that the OMETIFFLoader and the ZarrLoader wrap two very different objects, which makes extending them 1:1 difficult (impractical?). The comments here expose what I believe to be a mistake in our abstraction/design, and I hope to work towards a suitable solution. Let me elaborate.

The OMETIFFLoader wraps a JS object that represents an entire OME-TIFF, and the ZarrLoader wraps a JS object that represents a single nD-Array (a ZarrArray), or a list of nD-Arrays if multiscale. This difference, seemingly subtle, has the implication that the ZarrLoader is "unaware" of the context it is instantiated from (e.g. which node we are at in the hierarchy) as well as any metadata for rendering.

The ZarrLoader only provides an interface to fetch/retrieve chunks of data from array nodes, whereas the OMETIFFLoader encapsulates metadata and many arrays.

This difference has led to issues where users expect an omexml property on the ZarrLoader (#325), which doesn't make sense given the different sources. What I'd like to avoid is end-users relying on the loaders for anything other than retrieving pixel data (think of loaders like lazy numpy arrays: no axis labels, rendering metadata, etc). It is then up to the application using Viv (e.g. Avivator, Vizarr, Vitessce) to tell Viv how to render this pixel data using props (whether that rendering info is derived from the source or inferred).

TL;DR: My preference is to make a loader just a pixel data source. I'd like to think of a composition-driven API that separates the pixel data from associated metadata that might be in different forms (OME-XML, OME-Zarr .zattrs, etc). In the end, each Viv Layer corresponds to a single nD-array (or multiscale arrays). This way, we avoid creating a new loader for every different data source. Viv doesn't "care" whether it's OME-Zarr or MyCustomZarr; the difference is how the metadata are laid out, but the pixel data are always just dense Zarr arrays (regardless of the Zarr variant).
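To make the separation concrete, here is a minimal TypeScript sketch of a loader reduced to a pixel source, with rendering settings produced by separate adapters. Every name here (`PixelSource`, `ChannelSettings`, `settingsFromOmero`, `inferSettings`) is hypothetical and not Viv's API; the `omero.channels` fields (`label`, `color`, `window.start`, `window.end`) are assumed to follow the OME-Zarr rendering metadata, and the default contrast limits are an arbitrary choice.

```typescript
// Hypothetical shapes for illustration; none of these names are Viv's API.
interface PixelSource {
  readonly shape: readonly number[]; // assumed order: [channels, height, width]
  readonly dtype: string;            // e.g. "uint8" or "uint16"
  // Resolves to the pixels of one 2D plane for the given channel index.
  getPlane(channel: number): Promise<ArrayLike<number>>;
}

interface ChannelSettings {
  name: string;
  color: [number, number, number];
  contrastLimits: [number, number];
}

// Convert a 6-digit hex string such as "FF0000" to an RGB triple.
function hexToRgb(hex: string): [number, number, number] {
  const value = parseInt(hex, 16);
  return [(value >> 16) & 255, (value >> 8) & 255, value & 255];
}

// Adapter 1: derive settings from OME-Zarr "omero" attributes (assumed layout).
function settingsFromOmero(omero: {
  channels: { label: string; color: string; window: { start: number; end: number } }[];
}): ChannelSettings[] {
  return omero.channels.map((c): ChannelSettings => ({
    name: c.label,
    color: hexToRgb(c.color),
    contrastLimits: [c.window.start, c.window.end],
  }));
}

// Adapter 2: no metadata available, so infer defaults from the pixel source.
function inferSettings(source: PixelSource): ChannelSettings[] {
  const nChannels = source.shape[0] ?? 0;
  const max = source.dtype === "uint8" ? 255 : 65535; // arbitrary default
  return Array.from({ length: nChannels }, (_, i): ChannelSettings => ({
    name: `Channel ${i}`,
    color: [255, 255, 255],
    contrastLimits: [0, max],
  }));
}
```

In this sketch the pixel source never sees the metadata; the application picks an adapter and passes the resulting settings to the layer as props.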

This way we can collaborate on creating something akin to the ome-zarr-py plugin for napari, that takes a string URL as input and returns data (a pixel source) and all known metadata for the group.
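A minimal sketch of what such a helper could look like in TypeScript: it takes a group URL and returns the locations of the pixel arrays alongside all metadata found for the group. It assumes a Zarr v2 layout where group attributes live in `.zattrs` and where `multiscales[0].datasets[*].path` names the resolution levels; `FetchJson`, `OpenedGroup`, `describeMultiscaleGroup`, and `openGroup` are hypothetical names, and the JSON fetcher is injected rather than tied to any particular HTTP client.

```typescript
// Hypothetical helper types; not part of Viv or any existing library.
type FetchJson = (url: string) => Promise<unknown>;

interface OpenedGroup {
  levelUrls: string[];               // one zarr array URL per resolution level
  metadata: Record<string, unknown>; // everything found in .zattrs, untouched
}

// Pure part: given a group's URL and its parsed .zattrs, list the array URLs
// of the first multiscale entry and pass the metadata through unchanged.
function describeMultiscaleGroup(
  url: string,
  attrs: Record<string, unknown>
): OpenedGroup {
  const base = url.replace(/\/+$/, ""); // drop trailing slashes
  const multiscales = attrs["multiscales"];
  const first = Array.isArray(multiscales) ? multiscales[0] : undefined;
  const datasets: unknown[] = Array.isArray(first?.datasets) ? first.datasets : [];
  const levelUrls = datasets
    .map((d) => (d as { path?: unknown } | null)?.path)
    .filter((p): p is string => typeof p === "string")
    .map((p) => `${base}/${p}`);
  return { levelUrls, metadata: attrs };
}

// Async part: fetch .zattrs with the injected fetcher, then describe the group.
async function openGroup(url: string, fetchJson: FetchJson): Promise<OpenedGroup> {
  const base = url.replace(/\/+$/, "");
  const attrs = await fetchJson(`${base}/.zattrs`);
  if (typeof attrs !== "object" || attrs === null || Array.isArray(attrs)) {
    throw new Error(`Expected a JSON object at ${base}/.zattrs`);
  }
  return describeMultiscaleGroup(base, attrs as Record<string, unknown>);
}
```

The returned `levelUrls` would then be handed to whatever opens the Zarr arrays, and `metadata` to the application that decides how to render them.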

joshmoore commented 3 years ago

> ... or list of nD-Arrays if multiscale

This reminds me of napari, where the base contract for a plugin/loader is to always return a list of layers, where each layer can also be pyramidal. (Personally I'd default to [[arr]] even if there's only one pyramidal level, to keep things simple.)
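A small TypeScript sketch of that [[arr]] convention, assuming a placeholder `NdArray` type: whatever a loader returns is normalized into a list of layers, each of which is a list of resolution levels. The names `NdArray` and `normalizeLayers` are hypothetical.

```typescript
// Placeholder for any array-like pixel container (hypothetical type).
interface NdArray {
  shape: number[];
}

// Normalize a loader result to a list of layers, each layer being a pyramid
// (list of resolution levels):
//   arr              -> [[arr]]          one layer, one level
//   [a0, a1]         -> [[a0, a1]]       one pyramidal layer
//   [[a0, a1], [b0]] -> unchanged        already a list of layers
//   []               -> []               no layers
function normalizeLayers(input: NdArray | NdArray[] | NdArray[][]): NdArray[][] {
  if (!Array.isArray(input)) return [[input]];
  const items: unknown[] = input;
  if (items.every((item) => Array.isArray(item))) return input as NdArray[][];
  return [input as NdArray[]];
}
```

Downstream code can then always iterate over layers and levels without special-casing the single-array case.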

> My preference is to make a loader just a pixel data source.

👍

> I'd like to think of a composition-driven API that separates the pixel data from associated metadata that might be in different forms (OME-XML, OME-Zarr .zattrs, etc)

At least for the OME variants, I think it'd be fair to say we are working toward a situation where you could assume the metadata from the .zattrs (i.e., OME-"JSON") would be a superset of OME-XML, and we could convert the latter to the former for the handling of OME-TIFF.

> This way we can collaborate on creating something akin to the ome-zarr-py plugin for napari,

❤️ Happy to get into an ome-zarr-js or whatever it would be called.