[feature] Reuse zarr metadata from one image for others in plate

manzt commented 3 years ago

Motivation:

Currently we initialize each ZarrArray in a plate layout independently. This means that we make row x col requests for .zarray metadata before fetching any pixel data. For large plates, this adds substantial overhead spent just waiting for metadata to load, time that could instead be spent fetching the actual pixel data.

In addition, if we do end up supporting multiscale plates, we would need to make row x col x n_levels requests to initialize all arrays in the grid. For example, this plate from the OME-Blog: https://s3.embassy.ebi.ac.uk/idr/zarr/v0.1/plates/5966.zarr would require 394 * 5 = 1970 requests to initialize prior to fetching any data.

Proposal

Similar to reusing the omero metadata for the first image in the plate, we should be able to reuse the .zarray metadata for the first image in the plate to initialize all other ZarrArray objects at the same resolution. By taking this approach, we can initialize all the ZarrArray objects for a plate with a fixed number of requests (one per resolution level), regardless of plate size.
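A minimal sketch of the proposal, assuming every image in the plate has identical .zarray metadata at a given resolution level. The names `MetaStore`, `ArrayHandle`, `openPlateArrays`, and `demoStore` are hypothetical and are not part of the vizarr or zarr.js API; the in-memory store exists only to count requests.

```typescript
// Subset of the fields found in a Zarr v2 .zarray document.
interface ZarrayMeta {
  shape: number[];
  chunks: number[];
  dtype: string;
}

// Hypothetical minimal store interface; not the zarr.js or vizarr API.
interface MetaStore {
  getItem(key: string): Promise<string>;
}

// Hypothetical lightweight handle standing in for an initialized array.
interface ArrayHandle {
  path: string;
  meta: ZarrayMeta;
}

async function openPlateArrays(
  store: MetaStore,
  imagePaths: string[],
  nLevels: number,
): Promise<ArrayHandle[][]> {
  if (imagePaths.length === 0) return [];
  const first = imagePaths[0];
  // One .zarray request per resolution level, taken from the first image only.
  const metas: ZarrayMeta[] = await Promise.all(
    Array.from({ length: nLevels }, async (_, level) =>
      JSON.parse(await store.getItem(`${first}/${level}/.zarray`)) as ZarrayMeta,
    ),
  );
  // Every image reuses the metadata for its level; no further requests are made.
  return imagePaths.map((path) =>
    metas.map((meta, level) => ({ path: `${path}/${level}`, meta })),
  );
}

// In-memory store that counts requests, for demonstration only.
let requestCount = 0;
const demoStore: MetaStore = {
  async getItem(key: string): Promise<string> {
    requestCount += 1;
    // Keys look like "<image path>/<level>/.zarray"; halve the size per level.
    const level = Number(key.split("/").slice(-2)[0]);
    const size = 2048 >> level;
    return JSON.stringify({ shape: [size, size], chunks: [128, 128], dtype: "<u2" });
  },
};
```

Calling `openPlateArrays(demoStore, paths, 5)` issues five metadata requests no matter how many entries `paths` has. For the 394-image plate mentioned above, that would be 5 requests instead of 1970.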

This approach assumes that the .zarray metadata is consistent for all arrays at the same resolution (shape, chunks, compressor, etc. are all identical). Perhaps this is too strong an assumption; any thoughts, @joshmoore or @will-moore?

will-moore commented 3 years ago

I like the idea. It makes sense, and certainly most of the plates I've seen do have identically sized images in every Well. But it's probably too restrictive to say that ALL Plates must have all Images the same size. So we could maybe 'consolidate' that info into a parameter like "all_images_same_size" in the plate metadata? Maybe open an issue at https://github.com/ome/ngff ?

manzt commented 3 years ago

Sure, I'll look into opening an issue. It's worth noting that in order to render to the grid currently (for plate or well), we require identically sized images:

https://github.com/hms-dbmi/vizarr/blob/1ef216350580b3425468550d2c3473a566af6827/src/gridLayer.ts#L44-L55

And vizarr will throw an error otherwise. So this issue is suggesting that we make that assumption initially as well, when initializing arrays.
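The kind of check described here could look roughly like the sketch below. This is a hypothetical illustration of rejecting mismatched image sizes, not the code at the linked lines of gridLayer.ts; `assertSameShape` is an invented name.

```typescript
// Hypothetical check; not the actual implementation in vizarr's gridLayer.ts.
function assertSameShape(shapes: number[][]): void {
  if (shapes.length === 0) return;
  const [first, ...rest] = shapes;
  for (const shape of rest) {
    // Shapes match only if they have the same rank and the same size per axis.
    const same =
      shape.length === first.length && shape.every((n, i) => n === first[i]);
    if (!same) {
      throw new Error(
        `Expected shape [${first.join(", ")}], got [${shape.join(", ")}]`,
      );
    }
  }
}
```

If a check like this already has to pass before rendering, assuming identical metadata at initialization time adds no new restriction for plates that render today.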

joshmoore commented 3 years ago

> Maybe open an issue at https://github.com/ome/ngff ?

Agreed. I think we have something along the lines of "all dtypes for the multiresolutions will be of the same type", and we might be able to generalize that to "all items in a certain type of collection".

hms-dbmi / vizarr

[feature] Reuse zarr metadata from one image for others in plate #75
