ome / ngff

Next-generation file format (NGFF) specifications for storing bioimaging data in the cloud.
https://ngff.openmicroscopy.org

Single-scale images #207

Open clbarnes opened 11 months ago

clbarnes commented 11 months ago

Images are not always scale pyramids, but single-image arrays also benefit from some of the metadata we apply to multiscale groups. I suppose these are implicitly supported in the spec ("just add a coordinateTransformations and axes specified elsewhere") but IMO it would be useful to define single-scale datasets as their own valid type. The use case for me is extracting a small ROI from a single scale level of a larger volume.

You can, of course, define a multiscale group with only one dataset in it, but then it becomes quite verbose, as well as having to bounce between the group and array metadata.
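
For illustration, a one-level multiscale group in the 0.4-style layout might look roughly like the Python sketch below. The image name, units, and numeric values are made up, and the helper `transforms_for` is hypothetical; only the key structure is meant to mirror the spec. It shows the bounce mentioned above: to interpret the array `s0`, a reader has to look it up in the parent group's attributes.

```python
# Group-level attributes for a "pyramid" that has exactly one level.
# All values are illustrative assumptions, not taken from a real dataset.
group_attrs = {
    "multiscales": [
        {
            "version": "0.4",
            "name": "roi",  # hypothetical image name
            "axes": [
                {"name": "z", "type": "space", "unit": "micrometer"},
                {"name": "y", "type": "space", "unit": "micrometer"},
                {"name": "x", "type": "space", "unit": "micrometer"},
            ],
            "datasets": [
                {
                    "path": "s0",
                    "coordinateTransformations": [
                        {"type": "scale", "scale": [1.0, 0.5, 0.5]},
                        {"type": "translation", "translation": [10.0, 20.0, 30.0]},
                    ],
                }
            ],
        }
    ]
}


def transforms_for(attrs, array_path):
    """Find the transforms for one array by searching the group metadata."""
    for multiscale in attrs["multiscales"]:
        for dataset in multiscale["datasets"]:
            if dataset["path"] == array_path:
                return dataset["coordinateTransformations"]
    raise KeyError(f"no dataset named {array_path!r} in this group")


s0_transforms = transforms_for(group_attrs, "s0")
```

Nothing about `s0` itself says where it sits in physical space; that information lives one level up.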

This is sort of the opposite of https://github.com/ome/ngff/issues/187 : that asks to define some image metadata in a group above the containing group, where this asks to define some image metadata on the array itself.

This probably falls under https://github.com/ome/ngff/issues/179 , although that is specifically about labels and how to store them in a hierarchy rather than the broader class of single-scale images.

I believe this converges with proposed solution "c" in https://github.com/ome/ngff/issues/200 (also discussed in https://github.com/ome/ngff/issues/102 ). This would also simplify the access API: it's the same whether you're accessing a single-scale array, or if you only care about a single scale of a multiscale group.

d-v-b commented 11 months ago

Ideally, a single resolution image would "just" be a multiscale image with one scale level. This would be nice for a lot of reasons -- OME-NGFF viewers will know how to deal with it, and if you later decide that you do need multiple levels of resolution, you can drop them in without reworking your metadata. You refer to a small ROI, but there's no guarantee that the next one is small :) .

But as you note (and as I have experienced), the current multiscale metadata spec is unwieldy, both because it is needlessly verbose (specifying coordinateTransformations at the top-level and per-dataset, when only per-dataset is needed) and because it keeps the spatial metadata for an image away from the image itself, which leads to parsers ping-ponging between the image you want to load and the group that defines how to interpret that image.

Purely hypothetically, consider the following minimized multiscale metadata scheme:

Group metadata:

ome: {
    multiscales: {
        metadata: {...},  # description of the multiscale pyramid
        datasets: ["s0"]  # name(s) of array(s) contained in this group
    }
}

multiscales is drastically simplified. it is no longer a list, and it just contains metadata and a list of strings, which MUST be names of arrays contained in the group bearing the multiscales metadata.

Array metadata (e.g., for s0)

ome: {
    transforms: {
        axes: [...],
        scale: [...],
        translation: [...]
    }
}

array metadata contains the information required to embed the array in physical space. seems legit.
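
As a rough sketch of what "embed the array in physical space" could mean with this layout, the Python below maps an array index to a physical coordinate using only the array's own attributes. Several things here are assumptions rather than part of the proposal: that `axes` is a plain list of axis names, that scale is applied before translation, and the concrete numbers. The function name `index_to_physical` is hypothetical.

```python
# Array-level attributes following the hypothetical scheme above.
# Axis names and numeric values are illustrative assumptions.
array_attrs = {
    "ome": {
        "transforms": {
            "axes": ["z", "y", "x"],
            "scale": [1.0, 0.5, 0.5],
            "translation": [10.0, 20.0, 30.0],
        }
    }
}


def index_to_physical(attrs, index):
    """Map an array index to physical coordinates: index * scale + translation."""
    transforms = attrs["ome"]["transforms"]
    scale = transforms["scale"]
    translation = transforms["translation"]
    if not (len(index) == len(scale) == len(translation)):
        raise ValueError("index, scale and translation must have the same length")
    return [i * s + t for i, s, t in zip(index, scale, translation)]


origin = index_to_physical(array_attrs, [0, 0, 0])
point = index_to_physical(array_attrs, [2, 4, 6])
```

No group lookup is needed: everything required to place the array comes from the array's attributes.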

I'm namespacing all the ome-ngff stuff under an ome keyword, inspired by #206

if we had something this compact for multiscale metadata, would you still want a separate single-scale schema @clbarnes ?

clbarnes commented 11 months ago

That arrangement certainly looks good to me. I think, if anything, a "singlescale" spec would basically just be a name for the array-level metadata you've listed there. This would simplify specifying "multiscale" groups: rather than having to include the array-level metadata specification in that part of the spec, you'd just need to say "multiscale datasets are groups with X metadata, containing arrays with singlescale metadata with matching axes".
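
A minimal sketch of that rule in Python, assuming the hypothetical `ome.multiscales.datasets` and `ome.transforms.axes` keys from the comment above and representing each node's attributes as a plain dict. The function name `check_multiscale_group` and the example values are made up for illustration.

```python
def check_multiscale_group(group_attrs, array_attrs_by_name):
    """Check that every listed array exists and that all arrays share the same axes.

    Returns the shared axes on success, raises ValueError otherwise.
    """
    names = group_attrs["ome"]["multiscales"]["datasets"]
    if not names:
        raise ValueError("multiscale group lists no datasets")

    missing = [name for name in names if name not in array_attrs_by_name]
    if missing:
        raise ValueError(f"datasets not found in group: {missing}")

    # Each array carries its own "singlescale" metadata, including its axes.
    axes_per_array = [
        array_attrs_by_name[name]["ome"]["transforms"]["axes"] for name in names
    ]
    first = axes_per_array[0]
    for name, axes in zip(names, axes_per_array):
        if axes != first:
            raise ValueError(f"axes of {name!r} do not match {names[0]!r}")
    return first


def singlescale(axes, scale, translation):
    """Build array-level attributes in the hypothetical scheme."""
    return {"ome": {"transforms": {"axes": axes, "scale": scale, "translation": translation}}}


group = {"ome": {"multiscales": {"metadata": {}, "datasets": ["s0", "s1"]}}}
arrays = {
    "s0": singlescale(["z", "y", "x"], [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]),
    "s1": singlescale(["z", "y", "x"], [2.0, 2.0, 2.0], [0.5, 0.5, 0.5]),
}
shared_axes = check_multiscale_group(group, arrays)
```

The group check never needs to know how scale or translation are defined; it only relies on each array being valid on its own.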