spatial-image

A multi-dimensional spatial image data structure for scientific Python.

some thoughts #1

Open sofroniewn opened 4 years ago

sofroniewn commented 4 years ago

Hi @thewtex, I couldn't leave this on your branch, so am putting it here. First thing to say is that I'm very excited about this and getting another update during the Zarr meeting the other day was great.

I'm optimistic this could become the foundation of a minimal representation of an image that is useful for >99% of biomedical imaging. The advantage of having this representation is precisely what you say here:

This metadata is easily utilized and elegantly carried through image processing pipelines.

I'm imagining an ecosystem of bioimage analysis tools that use spatial-image as an in-memory interchange format and then internally to make smart choices about which axes to process and in what sorts of ways. Because xarray supports dask, it could then be used to chain these tools together into a pipeline that is lazily evaluated.
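
As a concrete (if contrived) sketch of that lazy chaining, using only standard xarray and dask -- none of this is the spatial-image API:

```python
import dask.array as da
import xarray as xr

# A dask-backed DataArray keeps its named dims through a chain of
# operations; nothing is computed until .compute() is called.
image = xr.DataArray(
    da.zeros((10, 256, 256), chunks=(1, 256, 256)),
    dims=("t", "y", "x"),
)

# Each tool in a pipeline can use the dim names to pick the axes it cares about.
background = image.min(dim="t")               # lazy reduction over time
corrected = (image - background).clip(min=0)  # still lazy

result = corrected.compute()  # the whole chain evaluates here
```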

I think from a napari standpoint we wouldn't require spatial-images just to look at data (you could always just give us an array and we'll show it), but thinking about a bioimage analysis ecosystem it would be very helpful to have this additional metadata.

I could also imagine the desire for similar metadata for other primitive objects like points, polygons, or meshes, but that could be a separate conversation.

Given all that I have some additional thoughts on this metadata schema.

  1. As I said on the zoom the other day, I like the ability to have x, y, z, and t as these all make sense to me. I think little is lost by allowing people to have multiple channels named whatever they want instead of just one c. For example, imagine a multispectral fluorescence imaging experiment where you use 3 different lasers and you measure signal on 4 different cameras, each with a different filter in front of it, done in 3D and with time. I'd argue that the natural way to represent that data is [laser, camera, t, z, y, x], which is 6D with shape [3, 4, n_t, n_z, n_y, n_x], instead of 5D ['c', 't', 'z', 'y', 'x'] with shape [12, n_t, n_z, n_y, n_x] and the laser and camera dims interleaved. Now, if there were some very simple lazy functionality to reshape between these two representations (see the sketch after this list), then maybe it makes sense for an algorithm developer to write an algorithm for the 5D (or even 4D ['t', 'z', 'y', 'x']) chunk that then gets broadcast across the other dimensions. Curious how you think about this?

  2. Could we bake a single affine transform into this schema too (with units / physical coordinates etc.)? I think if we're making assumptions about images being collected on spatial grids then it is very natural to think about physical units and a single affine transform. I concede that more complex transforms like deformable ones are probably the purview of more focused tools, but a single affine would be nice. I'd also like to be able to capture things like data from light sheet microscopes, where the data is collected at an angle and so needs to be deskewed. This data doesn't fit into the x, y, z paradigm quite so nicely until after the affine transform is applied. Curious how you think about this one, or if you think that is too niche to support?
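
To make the reshape in point 1 concrete, here is a minimal sketch using plain xarray -- the dim names and shapes come from the hypothetical experiment above, not from anything spatial-image defines:

```python
import numpy as np
import xarray as xr

# Hypothetical 6D multispectral acquisition: 3 lasers x 4 cameras over t, z, y, x.
data = xr.DataArray(
    np.zeros((3, 4, 5, 10, 64, 64)),
    dims=("laser", "camera", "t", "z", "y", "x"),
)

# Fold the two channel-like dims into a single 'c' dim of size 12...
merged = data.stack(c=("laser", "camera")).transpose("c", "t", "z", "y", "x")
assert merged.sizes["c"] == 12

# ...and recover the original 6D layout afterwards.
restored = merged.unstack("c").transpose("laser", "camera", "t", "z", "y", "x")
```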

Hope this issue finds you well and you don't mind me popping in here with these thoughts. As I said above I think this is very exciting.

thewtex commented 4 years ago

@sofroniewn thank you for creating this issue and sharing your thoughts! :pray: :1st_place_medal:

I think we are on the cusp of an immense leap forward for the imaging community in terms of accessibility and capabilities, via elegant handling of our data types, elegant scaling to large datasets, and access to a vast, powerful ecosystem of tooling for storage, analysis, and visualization. My excitement led me to spend more time than I would have liked during the Zarr meeting -- I did not intend to eat up that much time, and I am quite interested in the napari world coordinates discussion. I read through the napari GitHub issues, and it seems to be happily evolving towards convergence. I am interested to get more of your thoughts.

Overall, world coordinates support will help napari in handling annotations, segmentation, etc., which is excellent. From what I can see, there is an understanding in the works that the happy-medium compute model for visualization tools is generally a linear model, i.e. affine transforms on the data. Graphics hardware and standard graphics APIs, e.g. OpenGL and Vulkan, are built to support matrix / vector transformations, which also have a good computational complexity / capability tradeoff for visualization.

From an image processing and analysis perspective, the corresponding happy medium in terms of algorithmic complexity and computational performance is a uniformly sampled rectilinear grid. Almost all of the algorithm implementations in scikit-image, ITK, and ImageJ have this assumption. That is, the underlying pixel data may have a similarity transform associated with it -- an affine transformation constrained to have no skew.
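
As a small illustration of that assumption, using ITK's Python wrapping (the spacing and origin values here are made up):

```python
import itk
import numpy as np

# An ITK image carries spacing and origin alongside the pixel buffer,
# reflecting the uniformly sampled rectilinear grid assumption.
array = np.zeros((64, 64), dtype=np.float32)
image = itk.image_from_array(array)
image.SetSpacing([0.5, 0.5])   # physical size of a pixel along each axis
image.SetOrigin([10.0, 20.0])  # world position of index (0, 0)
```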

I'm optimistic this could become the foundation of a minimal representation of an image that is useful for >99% of biomedical imaging. The advantage of having this representation is precisely what you say here:

This metadata is easily utilized and elegantly carried through image processing pipelines.

I'm imagining an ecosystem of bioimage analysis tools that use spatial-image as an in-memory interchange format and then internally to make smart choices about which axes to process and in what sorts of ways. Because xarray supports dask, it could then be used to chain these tools together into a pipeline that is lazily evaluated.

Yes!!

I think from a napari standpoint we wouldn't require spatial-images just to look at data (you could always just give us an array and we'll show it), but thinking about a bioimage analysis ecosystem it would be very helpful to have this additional metadata.

Cool, that is a great approach.

I think little is lost by allowing people to have multiple channels named whatever they want instead of just one c. For example, imagine a multispectral fluorescence imaging experiment [...] [laser, camera, t, z, y, x]

Yes, good point.

Two action items come to mind:

  1. The interface and checks allow additional dims to be present.
  2. We have a set of example datasets and use cases like the one you proposed, to verify we are addressing real use cases and to provide a tangible demonstration of how this works.

I will add support for the first to the current WIP branch. I created Issue #2 to track the second.

What do you think?

Could we bake a single affine transform into this schema too (with units / physical coordinates etc.)? [...]

A todo on the WIP branch is to add a 'direction' / 'orientation' matrix. This is to be multiplied with the offset and spacing implied by the coords to give the world position. However, to satisfy the assumption of most image processing libraries, discussed above, this is an orthonormal matrix. The orthonormal matrix with the coords results in a similarity transform. This is the complexity-functionality common ground for scientific image processing and visualization tools, so researchers will have access to both. We can certainly still support the deskew required in light sheet microscopy, or more complex deformable transformations, but we can do this by providing the tooling to resample onto a uniform rectilinear grid. With Python's distributed and GPU compute capabilities, we should be able to do this on-demand for visualization or processing. Examples will help here, too.
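
A minimal sketch of that index-to-world mapping, assuming illustrative names (direction, spacing, origin) rather than the eventual spatial-image API:

```python
import numpy as np

# Orthonormal 'direction' matrix: here a 90-degree rotation in the y-x plane.
direction = np.array([
    [0.0, -1.0],
    [1.0,  0.0],
])
spacing = np.array([0.5, 0.5])   # physical size of a pixel along each axis
origin = np.array([10.0, 20.0])  # world position of index (0, 0)

def index_to_world(index):
    """Similarity transform: rotate the spacing-scaled index, then offset."""
    return direction @ (spacing * np.asarray(index)) + origin

index_to_world((0, 0))  # -> array([10., 20.])
# A light sheet deskew would require a shear here, which the orthonormal
# constraint rules out -- hence resampling onto a rectilinear grid instead.
```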

We are doing well, considering, here in North Carolina, although a state-wide stay-at-home directive from the Governor starts today. I hope all is well in California -- the situation there is more challenging, and folks in San Francisco and New York City are in our thoughts and prayers.

sofroniewn commented 4 years ago

Two action items come to mind:

  1. The interface and checks allow additional dims to be present.
  2. We have a set of example datasets and use cases like the one you proposed, to verify we are addressing real use cases and to provide a tangible demonstration of how this works.

I will add support for the first to the current WIP branch. I created Issue #2 to track the second.

What do you think?

Sounds good to me, I'll monitor and follow up there.

The orthonormal matrix with the coords results in a similarity transform. This is the complexity-functionality common ground for scientific image processing and visualization tools, so researchers will have access to both.

OK, this makes sense to me. I think napari may then default to similarity transforms (as opposed to full affine), and bundle deskew with more advanced functionality like deformable transforms. I'll loop you in when we add support for 'direction' / 'orientation', so we can be on the same page.

Also, thanks for your kind words about CA; glad to hear you're doing OK in NC too.