Transformation types - Githubissues

bogovicj commented 2 years ago

Details here, examples forthcoming.

The v0.4 specification declares the types: identity, translation, and scale.

Version 0.5 should include new types of transformations. Here is a preliminary list, ordered approximately by importance / urgency / utility (perceived by me).

1) affine 2) rigid 3) axis_permutation 4) displacement_field 5) position_field 6) inverse_of : when transforms applied to images are not closed-form-invertible 7) bijection: make an invertible transform by explicitly providing the forward and inverse 8) sequence: a list of transforms applied in order 9) log 10) exp 11) gamma 12) cartesian_to_polar 13) polar_to_cartesian 14) cartesian_to_spherical 15) spherical_to_cartesian 16) cartesian_to_cylindrical 17) cylindrical_to_cartesian

Questions

Which of the above transforms do we include in 0.5?
- my vote is for "all of the above" (or maybe 1-8)
Is sequence necessary?
- e.g. if every transformation provided as a list (and most are length 1), then we don't need it.
How should rigid transforms be parametrized?
- should multiple parametrizations be allowed? @thewtex @lassoan

@constantinpape @xulman @tischi @axtimwalde @tpietzsch @d-v-b @jbms @satra

lassoan commented 2 years ago

Linear transforms

I would not recommend to introduce separate transform types for affine, rigid, axis_permutation, etc, just a simple linear transformation, described with a homogeneous transformation matrix (4x4; or 3x4 if we omit the last line).

In application software they would all be implemented as linear and it is often not trivial to convert it to a more specific transform type.

It seems simple, but years of experience with the NIFTI file format shows that it is a problem that is almost impossible to solve correctly. A common issue is that due to numerical inaccuracies most of the time images have slightly non-orthogonal axes, so you need to define tolerance metrics that you use to decide if the axes are orthogonal, unit-length, etc. and based on that decide if you write it out as a rigid transform (discarding the accurate orientation and scaling) or you write it out as affine (keeping all the values accurate). This is an open problem for over 20 years, there is still no universal solution that works well for all use cases.

Many transform types also puts unnecessary burden on application developers - we would need to implement readers and writers for many transform types.

If you introduce a new transform type for each parameterization of a 4x4 matrix then you cannot stop at just affine, rigid, axis_permutation, but you'll have to add all the other commonly used parameterizations, as it is done in ITK:

Of course, it is just software, everything is doable, but still, implementing 15 transform types (instead of just 1) for representing a simple linear transform is significant workload. Most likely, each application would choose to implement just a subset, ending up with incompatibilities and many not-well-tested code branches in file I/O source code. Overall, leading to unhappy users and developers.

Other transforms

position_field: I have not heard about this. Are you sure this is needed?
displacement_field: make sure there is a way to specify interpolation type (linear and bspline are both very commonly used)
inverse_of: I've been thinking about if this was sufficient or we also need an "inverse" flag property inside every a transform. I've concluded that this should be sufficient, because even if applications use transform classes that has the inverse flag as a transform property (such as in VTK), it will not be hard to convert this inverse_of parent transform to flipping the "inverse" flag on all child transforms.
bijection: It must be possible for all transforms to compute its inverse. Without that a transform is barely usable. For example, if you want to apply a transform to an image and on a polyline that is drawn on the image, then you must use the forward transform (modeling transform) to transform the polyline points; and you must use the reverse (resampling transform) to transform the image. Therefore, I don't think we need a separate bijection type transform.
sequence: It is useful if we can group multiple transforms in a sequence. For example, you can apply inverse_of to an entire sequence, read/write a sequence to a single file, etc. I slightly prefer the composite name (see ITK), as an item in a sequence can contain another sequences, so you end up with a tree structure.
landmark: This essential transform type was missed. This transform specifies a list of corresponding points (randomly distributed in space) and a transformation mode. Most common ones are linear (rigid, similarity, or affine) and thin-plate spline.

bogovicj commented 2 years ago

Thanks for having a look at this @lassoan

Linear transforms

I would be happy not to include rigid in favor of affines only if that is the consensus. iirc, @thewtex mentioned rigid specifically on the last call, so would want to hear from him.

There is some value in other simpler parametrizations though - i.e. we should keep scale and translate. Agreed?

`displacement_field`

displacement_field make sure there is a way to specify interpolation type

Good point, agreed.

`position_field`

Good question. (p.s. Let's consider calling this coordinates or coordinate(s)_field)

I include it because it's used by:

This is also how I imagine explicitly giving coordinates when they are not evenly spaced, for example, if our time points are at 0, 100, and 500 seconds then it's described by:

{
    "type" : "position_field",
    "positions" : [0, 100, 500 ]
    "input_space" : "",
    "output_space" : "time_points"
}

`inverse_of` and `bijection`

or we also need an "inverse" flag property inside every a transform.

Yea, I considered this too, but decided on the approach above that you see. @axtimwalde prefers the inverse_of to a flag.

It must be possible for all transforms to compute its inverse. Without that a transform is barely usable.

I disagree, or rather, I don't think it's up to us to decide. Often, transforming the image is all that is needed / desired.

Also, there's no "one standard way" I know of to invert an arbitrary displacement field - so asking this of implementations makes it at least as hard as implementing all the linear transforms above that you (rightly) don't want to force people to do.

`landmark`

I completely agree that storing landmarks is valuable, but don't think it belongs with transformations. If landmarks are used to estimate an affine transformation say, why not just call it an affine? For me, storing landmarks as a transform would mix the transformation from how we get the transformation.

Rather, I think the landmarks themselves should be described by shapes, meshes, and/or tables.

I do agree that thin_plate_spline is worth including. We (in imglib2 and relate libs) either store all the moving+target landmarks and recompute the coefficients after loading, or store one set of landmarks + coefficients. What does ITK do?

Either way the landmarks will be important. So let's coordinate with the folks working on shapes, meshes, and tables for this.

tischi commented 2 years ago

There is some value in other simpler parametrizations though - i.e. we should keep scale and translate. Agreed?

I think one reason to do this was that some applications can only consume those simple transformations (scale and translate). However, I found the suggestion made in one of the last NGFF calls that then those applications could just pull out the scale and translation from the affine worth considering. So, even if it may break our current spec, I wonder, given the 20+ years of experience of @lassoan, whether we should reconsider to only support affine on the spec level (APIs could support more and then translate from and to affine).

lassoan commented 2 years ago

Getting the scale from the transformation matrix is very simple (scale[i] is np.linalg.norm() of the i-th column of the transformation matrix). The ngff library can also provide convenience functions for getting/setting scale from/to the transformation matrix.

axtimwalde commented 2 years ago

I am with @bogovicj and others to support explicit subsets of affine transformations. I never found it helpful to remove information to later rediscover it and to deal with the associated inaccuracies. If a transformation is linearly independent (such as translations and scalings), then it should say so because an application can do helpful shortcuts when dealing with them. E.g. rendering transformed data is much faster and easier. If a transformation is meant to be orthogonal (similarities) or even orthonormal (rigid), then it is helpful to know this instead of guessing it from noisy projections. Applications that handle only affine transformations are free to convert first and then do their thing. This could indeed be a tool written in jQuery or a jQuery based translation layer. Proposed name: "see-everything-that-is-an-affine-as-an-affine".

constantinpape commented 2 years ago

Thanks for working on this @bogovicj, a few first comments from my side:

inverse_of : when transforms applied to images are not closed-form-invertible

bijection: make an invertible transform by explicitly providing the forward and inverse

I think both of these are not so easy to understand. That does not mean we should not include them, but they will need some more motivation, explanation and examples.

sequence: a list of transforms applied in order

If we stick with the current way of specifying transformations in 0.4, then sequence is not necessary; whenever transformations are given in the spec they should be given as a list. I would be open to change this, but I think we should have only one of the two potential solutions, i.e. either List[CoordinateTransformation] or only a single CoordinateTransformation and the option to use Sequence. And we should only change it if there is a concrete advantage of having the explicit Sequence instead of using a list. I can't think of any advantage right now, but happy to be enlightened about this ;).

log

exp

gamma

I think that these are not really CoordinateTransformations, but rather ValueTransformations (IntensityTransformations), something we have not introduced yet. For simplicity I would suggest to leave these out of the current proposal and introduce them at a later stage if necessary.

cartesian_to_polar

polar_to_cartesian

cartesian_to_spherical

spherical_to_cartesian

cartesian_to_cylindrical

cylindrical_to_cartesian

I am not sure yet how we represent non-cartesian spaces in #94 yet. Maybe it's simpler to leave these out for now as well. But I am happy to change my mind on this if the solution to this is simple.

Regarding affine transformations and subsets thereof: I fully agree with @axtimwalde's comment https://github.com/ome/ngff/issues/101#issuecomment-1046327447 that being able to specify the explicit subset is better than needing to extract this information from the full affine representation. The forward translation of going from scale / translation / similarity / rigid to affine is much simpler than going backward from affine to a given subtype. If we limit ourselves to affines it will make implementation much harder for any software that does not support a full affine (or can make use of subtypes for better performance).

axtimwalde commented 2 years ago

Examples for inverse_of:

a deformation field to register an image to another image
a thin-plate-spline deformation created by BigWarp that registers an image to another image
a degenerate affine transformation or projection that projects all pixels of 3D space into a 2D plane

Examples of bijection:
a deformation field to register image A to image B and an approximate inverse deformation field of that deformation field that registers image B to image A
a thin-plate-spline deformation created by BigWarp that registers image A to image B and an approximate inverse deformation field of that thin-plate-spline transformation that registers image B to image A

The need for sequence:

We will eventually support references to transformations that are re-used multiple times. This saves both storage and makes it explicit that a specific transformation is being used. Transformations used for specific datasets can then be constructed as a combination of references and explicitly stored transformations. The referenced transformations can be single transformations or sequences of transformations and may themselves contain references to transformations. This whole structure means that transformations are trees that, when applied are flattened and applied as a sequence. The cleanest way to do this is to enable leaf transformations and sequences and references (to leafs or sequences) and understand them all as the same kind of node, a transformation. Best example for me: lens distortion correction for stitched EM or confocal images. The distortion correction consists of a non-linear polynomial transformation and an affine transformation that normalizes between color channels (confocal) or across cameras (EM), i.e. it is a sequence. The same lens distortion correction transformation is re-used by thousands of tiles in the stitched dataset. We may improve the lens-disortion correction at a later time with better calibration data and would then update only one instance instead of thousands. Each tile also has a rigid or general affine transformation that stitches it into a global montage.

log, exp, gamma apply to coordinates just as well as to continuous value domains and are therefore coordinate transformations.

Non-cartesian image data is abundant in medical imaging and must therefore be supported. The data arrays are just as multi-dimensional as microscopy acquisitions. Good practical example: Ultrasound -scanner data.

jbms commented 2 years ago

Having multiple ways of specifying an affine transform adds a small amount of complexity but is indeed relatively easy to handle when reading. It is similarly true that it is easy to deal with axisIndices or similar being specified for the input and output dimensions of the affine transform (or other transforms). I will note though that for affine transforms it is quite easy to "read off" the axisIndices --- they are indicated by the presence of zeros for non-diagonal coefficients and one for diagonal coefficients. Even if you normalize the matrix, the zeroes will stay zero and the ones will stay one, so there isn't a risk of floating point precision issues.

However, I am less convinced that it will actually reduce implementation complexity even if you support optimizations in the scale-translation-only case, because in practice you will likely have to compose multiple transformations and an affine transform matrix is the simplest way to do that composition. Then in the final transform matrix you can check for whatever conditions you have optimizations for. Of course if there are non-linear transforms this sort of composition is not possible, but those transforms will have to be supported in a less efficient way (or not supported at all), and you would still want to compose any affine transforms before and after each non-linear transform.

One issue I can foresee, related to what @lassoan said, is that if there are multiple ways to represent affine transforms, but some ome-zarr implementations support only some of those representations, or support them more efficiently, then when writing a transformation you will have to be aware of which representations are supported/more efficient by each implementation. For example, if some viewer only supports translation and scale transformations but does not support affine transformations, then writing software will have to make sure to attempt to convert any affine transform to a scale and translation transform if possible. Similarly if some implementations are more efficient if you specify axisIndices explicitly, then writing software that uses an affine transform representation will have to extract out the axisIndices. Perhaps we can address this issue in the standard, either by:

encouraging implementations to behave the same regardless of how affine transforms are specified (but in this case having multiple representations is kind of pointless); or
specify a "normalized" representation that must be used to ensure maximum optimization potential (e.g. maximal use of axisIndices, translation-and-scale-only affine transforms must be converted to separate translation and scale transforms).

bogovicj commented 2 years ago

Here is a brief summary of some examples.

I've started a prototype implementation with more details here: https://github.com/bogovicj/ngff-transforms-prototype

Some possible changes

Considering changing from space to coordinateSystem @d-v-b
- is a more descriptive name and a better match coordinateTransformations
Every array / dataset gets a default space / coordinateSystem
- it's name is the array's path in the container
- axes names / types are shared among all default array-spaces.

Basic example

Pixel to physical space, and an simple affine between two physical spaces (scanner vs anatomical) for our medical imaging friends.

Basic example metadata

```json { "spaces": [ { "name": "scanner", "axes": [ { "type": "space", "label": "x", "unit": "millimeter", "discrete": false }, { "type": "space", "label": "y", "unit": "millimeter", "discrete": false }, { "type": "space", "label": "z", "unit": "millimeter", "discrete": false } ] }, { "name": "LPS", "axes": [ { "type": "space", "label": "LR", "unit": "millimeter", "discrete": false }, { "type": "space", "label": "AP", "unit": "millimeter", "discrete": false }, { "type": "space", "label": "IP", "unit": "millimeter", "discrete": false } ] } ], "coordinateTransformations": [ { "scale": [ 0.8, 0.8, 2.2 ], "type": "scale", "name": "to-mm", "input_space": "/basic/mri", "output_space": "scanner" }, { "affine": [ 0.9975, 0.0541, -0.0448, 0, -0.05185, 0.9974, 0.0507, 0, 0.04743, -0.04824, 0.99771, 0 ], "type": "affine", "name": "scanner-to-anatomical", "input_space": "scanner", "output_space": "LPS" } ] } ```

Crop / cutout example

This example has two 2d datasets,

img2d - some 2d image
img2dcrop a cropped / offset copy of the above

In addition to the default pixel spaces, there are:

"physical" : physical space of complete image
"crop-offset" : the crop transformed to the whole in pixel units
"crop-physical" : the crop transformed to the the whole in physical units

Crop example metadata

```json { "spaces": [ { "name": "physical", "axes": [ { "type": "space", "label": "x", "unit": "micrometer", "discrete": false }, { "type": "space", "label": "x", "unit": "micrometer", "discrete": false } ] }, { "name": "crop-offset", "axes": [ { "type": "space", "label": "ci", "unit": "", "discrete": true }, { "type": "space", "label": "cj", "unit": "", "discrete": true } ] }, { "name": "crop-physical", "axes": [ { "type": "space", "label": "cx", "unit": "micrometer", "discrete": false }, { "type": "space", "label": "cy", "unit": "micrometer", "discrete": false } ] } ], "coordinateTransformations": [ { "name": "to-physical", "type": "scale", "scale": [ 2.2, 1.1 ], "input_space": "/crop/img2d", "output_space": "physical" }, { "name": "to-crop-physical", "type": "scale", "scale": [ 2.2, 1.1 ], "input_space": "/crop/img2dcrop", "output_space": "crop-physical" }, { "name": "offset", "type": "translation", "translation": [ 10, 12 ], "input_space": "/crop/img2dcrop", "output_space": "/crop/img2d" } ] } ```

Multiscale

A multiscale dataset. The only change of note compared to v0.4 is the addition of a "space", the associated fields for the coordinateTransformations. This example shows what results might look like if downsampling performed by averaging - therefore introducing a sub-pixel offset.

~~I'm not so happy with how the sequence transform interacts with the "coordinateTransformations":[] list, but will deal with that later, suggestions welcome.~~

Edits:

No longer use sequence transform type, but rather coordinateTransformations list as v0.4
Moved spaces tag outside of multiscales metadata
- though it could stay inside, I don't have a strong preference

Example multiscale metadata (lightly edited)

```json { "spaces": [ { "name": "physical", "axes": [ { "type": "space", "label": "x", "unit": "um", "discrete": false }, { "type": "space", "label": "y", "unit": "um", "discrete": false } ] } ], "multiscales": [ { "version": "0.5-prototype", "name": "ms_avg", "type": "averaging", "datasets": [ { "path": "/multiscales/avg/s0", "coordinateTransformations": [ { "scale": [ 2.2, 3.3 ], "type": "scale" }, "name": "s0-to-physical", "input_space": "/multiscales/avg/s0", "output_space": "physical" ] }, { "path": "/multiscales/avg/s1", "coordinateTransformations": [ { "scale": [ 4.4, 6.6 ], "type": "scale" }, { "translation": [ 1.1, 1.65 ], "type": "translation" } ], "name": "s1-to-physical", "input_space": "/multiscales/avg/s1", "output_space": "physical" }, { "path": "/multiscales/avg/s2", "coordinateTransformations": [ { "scale": [ 8.8, 13.2 ], "type": "scale" }, { "translation": [ 3.3, 4.95 ], "type": "translation" } ], "name": "s2-to-physical", "input_space": "/multiscales/avg/s2", "output_space": "physical" } ] } ] } ```

Example discrete multiscale metadata

```json { "spaces": [ { "name": "physical", "axes": [ { "type": "space", "label": "x", "unit": "um", "discrete": false }, { "type": "space", "label": "y", "unit": "um", "discrete": false } ] } ], "multiscales": [ { "version": "0.5-prototype", "name": "ms_discrete", "type": "discrete", "datasets": [ { "path": "/multiscales/discrete/s0", "coordinateTransformations": [] }, { "path": "/multiscales/avg/s1", "coordinateTransformations": [ { "scale": [ 2, 2 ], "type": "scale" }, ], "name": "s1-to-s0", "input_space": "/multiscales/discrete/s1", "output_space": "/multiscales/discrete/s0", }, { "path": "/multiscales/avg/s2", "coordinateTransformations": [ { "scale": [ 4, 4 ], "type": "scale" } ], "name": "s2-to-s0", "input_space": "/multiscales/discrete/s2", "output_space": "/multiscales/discrete/s0", } ], "coordinateTransformations" : [ { "scale": [ 0.8, 1.1 ], "type": "scale" }, ], "input_space": "/multiscales/avg/s0", "output_space": "physical" } ] } ```

This alternative maps downsampled arrays (s1,s2) to the highest resolution array (s0). Note the changes to output_space, and discrete values for scale parameters. This example assumes downsampling was performed in such a way that avoids an offset. If downsampling introduces an offset (even sub-pixel), it MUST include the appropriate translation as in the example above.

This example also includes a "global" coordinateTransform to physical space. Note that its input_space is the "array space" for the highest resolution (s0). A coordinateTransform from s[i] to "physical" is implicitly defined
by the path s[i]-to-s0 -> s0-to-physical. The "global" coordinateTransform is optional.

Example discrete multiscale metadata with shorthands

```json { "multiscales": [ { "version": "0.5-prototype", "name": "ms_discrete", "type": "discrete", "datasets": [ { "path": "/multiscales/discrete/s0" }, { "path": "/multiscales/avg/s1", "coordinateTransformations": [ { "scale": [ 2, 2 ], "type": "scale" }, ] }, { "path": "/multiscales/avg/s2", "coordinateTransformations": [ { "scale": [ 4, 4 ], "type": "scale" } ] } ] } ] } ```

This final example omits the global coordinateTransforms, and spaces / axes, but is otherwise identical to the v0.4 multiscale specification, but is other identical to the above example.

Shorthands:

EXACTLY one object in the datasets list has an empty/null coordinateTransformations list (or no coordinateTransformations field)
the path for that dataset is the output_space for all other coordinateTransformations
the input_space for every other dataset is implicitly understood to equal its path (as in the above examples)

Example multiscale metadata with multiple spaces

```json { "spaces" : [ { "name": "physical", "axes": [ { "type": "space", "label": "x", "unit": "um", "discrete": false }, { "type": "space", "label": "y", "unit": "um", "discrete": false } ] }, { "name": "anatomical", "axes": [ { "type": "space", "label": "LR", "unit": "um", "discrete": false }, { "type": "space", "label": "AS", "unit": "um", "discrete": false } ] } ], "coordinateTransformations" : [ { "name" : "s0-to-physical", "type" : "scale", "scale" : [ 0.8, 2.2 ], "input_space" : "/multiscales/discrete/s0", "output_space" : "physical" }, { "name" : "physical-to-anatomical", "type" : "affine", "affine" : [ 0.8, 0.05, -3.4, 0.08, 0.91, 10.2 ], "input_space" : "physical", "output_space" : "anatomical" }, ], "multiscales": [ { "version": "0.5-prototype", "name": "ms_discrete", "type": "discrete", "datasets": [ { "path": "/multiscales/discrete/s0" }, { "path": "/multiscales/avg/s1", "coordinateTransformations": [ { "scale": [ 2, 2 ], "type": "scale" }, ] }, { "path": "/multiscales/avg/s2", "coordinateTransformations": [ { "scale": [ 4, 4 ], "type": "scale" } ] } ] } ] } ```

The multiscales data for this example is identical to the example above with shorthands, but in addition, it includes two spaces ("physical" and "anatomical"), and a coordinateTransformation going from "physical" to "anatomical".

~~Original Example multiscale metadata (now deprecated)~~

```json { "multiscales": [ { "version": "0.5-prototype", "name": "ms_avg", "type": "averaging", "datasets": [ { "path": "/multiscales/avg/s0", "coordinateTransformations": [ { "scale": [ 2.2, 3.3 ], "type": "scale", "name": "s0-to-physical", "input_space": "/multiscales/avg/s0", "output_space": "physical" } ] }, { "path": "/multiscales/avg/s1", "coordinateTransformations": [ { "transformations": [ { "scale": [ 4.4, 6.6 ], "type": "scale" }, { "translation": [ 1.1, 1.65 ], "type": "translation" } ], "type": "sequence", "name": "s1-to-physical", "input_space": "/multiscales/avg/s1", "output_space": "physical" } ] }, { "path": "/multiscales/avg/s2", "coordinateTransformations": [ { "transformations": [ { "scale": [ 8.8, 13.2 ], "type": "scale" }, { "translation": [ 3.3, 4.95 ], "type": "translation" } ], "type": "sequence", "name": "s2-to-physical", "input_space": "/multiscales/avg/s2", "output_space": "physical" } ] } ], "spaces": [ { "name": "physical", "axes": [ { "type": "space", "label": "x", "unit": "um", "discrete": false }, { "type": "space", "label": "y", "unit": "um", "discrete": false } ] } ] } ] } ```

Non-linear registration

The example code produces two 3d datasets of different drosophila template brains:

/registration/fcwb
/registration/jrc2018F

and two displacement fields:

/registration/fwdDfield
/registration/invDfield

The spaces and transformations are related like this:

 /registration/jrc2018F <-toJrc2018F->  jrc2018F <-"jrc2018F-to-fcwb"->  fwcb <-toFcwb-> /registration/fcwb

where A <-T-> B indidcates an invertible transformation (named T) between spaces A and B.

In this example, the "forward" direction of the transformation "jrc2018F-to-fcwb" is a sequence: a displacement field (fwdDfield) followed by an affine. The inverse is therefore the inverse of that affine followed by the inverse of the displacement field (invDfield).

The registration metadata

```json { "spaces": [ { "name": "fcwb", "axes": [ { "type": "space", "label": "fcwb-x", "unit": "um", "discrete": false }, { "type": "space", "label": "fcwb-y", "unit": "um", "discrete": false }, { "type": "space", "label": "fcwb-z", "unit": "um", "discrete": false } ] }, { "name": "jrc2018F", "axes": [ { "type": "space", "label": "jrc2018F-x", "unit": "um", "discrete": false }, { "type": "space", "label": "jrc2018F-y", "unit": "um", "discrete": false }, { "type": "space", "label": "jrc2018F-z", "unit": "um", "discrete": false } ] } ], "coordinateTransformations": [ { "forward": { "transformations": [ { "path": "/registration/fwdDfield", "type": "displacement_field" }, { "affine": [ 0.907875, 0.00299018, 0.00779285, -3.77146, -0.000121014, 1.04339, 0.0893289, -6.39702, 0.000127526, -0.0138092, 0.549687, 2.9986 ], "type": "affine" } ], "type": "sequence", "name": "jrc2018F-to-fcwb", "input_space": "jrc2018F", "output_space": "fcwb" }, "inverse": { "transformations": [ { "affine": [ 1.1014748899286995, -0.003356093187801388, -0.015070089856986017, 4.177888664571422, 0.00014930742384645888, 0.9563570184920926, -0.1554184181171034, 6.584435749976974, -0.00025178851007148946, 0.024026315573955494, 1.8153162032371448, -5.290659956068192 ], "type": "affine" }, { "path": "/registration/invDfield", "type": "displacement_field" } ], "type": "sequence", "name": "fcwb-to-jrc2018F", "input_space": "fcwb", "output_space": "jrc2018F" }, "type": "bijection", "name": "jrc2018F<>fcwb", "input_space": "jrc2018F", "output_space": "fcwb" } ] } ```

the forward displacement field's metadata

```json { "spaces": [ { "name": "forwardDfield", "axes": [ { "type": "displacement", "label": "d", "unit": "", "discrete": true }, { "type": "space", "label": "x", "unit": "um", "discrete": false }, { "type": "space", "label": "y", "unit": "um", "discrete": false }, { "type": "space", "label": "z", "unit": "um", "discrete": false } ] } ], "transformations": [ { "scale": [ 1, 1.76, 1.76, 1.76 ], "type": "scale", "name": "fwdDfieldScale", "input_space": "/registration/fwdDfield", "output_space": "fwdDfield" } ] } ```

jbms commented 2 years ago

The multiscale example the schema you have shown seems to allow multiple coordinate transforms for each scale and multiple coordinate spaces for the multiscale.

Is that something you specifically intended to support?

bogovicj commented 2 years ago

allow multiple coordinate transforms for each scale ... Is that something you specifically intended to support

No, every level gets one transform. The v0.4 spec gives every level an array of coordinate transforms that are meant to be applied as a sequence. I have not yet decided how to reconcile that with the proposed scheme. I need to clean up / clarify this.

jbms commented 2 years ago

The "spaces" property of the items of the "multiscales" array is also an array --- but are you saying that is also intended to be just a single item?

What do you imagine the use would be for the "name" given to each of the scale's coordinate transforms --- is that intended to allow something outside of that particular multiscale definition to reuse that coordinate transform?

bogovicj commented 2 years ago

Forgive me for not giving a great answer now - a good answer means describing how I intend to use the spec, i.e. how the it enables having a nice API (in my view). I will write that up in a longer form soon, but wanted to get some examples out there first.

In short:

there could be more than one space (this is like the "global" transform discussed here, one space refers to the data before the global transform, another refers to the data afterward)
yes, I think referring to coordinate transforms by name is useful, re-using them is a good example of this
- this re-use is not so important for multiscales in particular, but I include the names here anyway for completeness, it may be that names are optional

bogovicj commented 2 years ago

I've updated and added new multiscales examples to the comment above (preserving the originals for the record).

Changes and new examples:

The original example is cleaner (no longer uses sequence transform type)
A "global" coordinateTransformations
discrete scaling factors
multiple spaces

thewtex commented 2 years ago

@bogovicj thanks for working on this! I really like the global coordinateTransformations and specification of spaces in particular.

new types of transformations. Here is a preliminary list,

affine rigid axis_permutation displacement_field

I agree that these are a priority, but I would also add rotation (possibly as a replacement for rigid) because of its importance in medical imaging.

I would not recommend to introduce separate transform types for affine, rigid, axis_permutation, etc, just a simple linear transformation, described with a homogeneous transformation matrix (4x4; or 3x4 if we omit the last line).

As @axtimwalde and others have mentioned, while it is possible to represent scale, translation, rigid, etc. inside an affine transformation, it is not possible to know immediately whether an affine transformation does not contain shearing, for example. And, while affine composition can easily and universally be achieved with little computational overhead, decomposition depends on the availability of more advanced tools and the result depends on the method and values (it is "noisy" as @axtimwalde ), e.g. a negative sign in the scale, multiplying negatives in the rotation.

A common issue is that due to numerical inaccuracies

Numerical inaccuracies are an important consideration, but storing transformation parameters as binary 64-bit IEEE float's instead of ascii decimal is the way to minimize this.

If you introduce a new transform type for each parameterization of a 4x4 matrix then you cannot stop at just affine, rigid, axis_permutation, but you'll have to add all the other commonly used parameterizations, as it is done in ITK:

Supporting different basic transformation types, i.e. scale, translation, rotation, is different from supporting different representations of transformation types. We should support different transformation types but not different representations of those types.

ITK supports different representations of transformation types for the purposes of registration / optimization. I agree with @lassoan in that we want to keep representations as minimal and as simple as possible. I do not think we should require different representations to be supported in NGFF for standard simplicity and for the simplicity of implementing software. A representation of a transformation is euler angles vs versor for a rotation or 4x4 vs 3x4 matrix for an affine. We should pick one representation for transformation types, define it well, and provide documentation and example code on how to convert it to other representations.

jbms commented 2 years ago

Regarding binary vs text representation of floating point numbers, while it is certainly easy to lose precision when converting to text representation, it is also quite possible to convert losslessly --- there is no need to use a binary representation just for that purpose. In particular the Python json.dumps function is lossless except for nan and infinity.

thewtex commented 2 years ago

Yes, while it is possible to convert binary to text floating point losslessly, there are issues and limitations that are not always handled ideally by every language / library / implementation. We found that in practice, ITK needed to use the Google Double Conversion library when serializing / deserializing transform parameters to text transform file formats to avoid a loss of information that interfered with results.

imagesc-bot commented 2 years ago

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/ome-ngff-community-call-transforms-and-tables/71792/1

imagesc-bot commented 1 year ago

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/save-irregular-time-coordinates-in-ome-zarr/82138/2

ome / ngff

Transformation types #101

Questions

Linear transforms

Other transforms

Linear transforms

`displacement_field`

`position_field`

`inverse_of` and `bijection`

`landmark`

Some possible changes

Basic example

Crop / cutout example

Multiscale

Non-linear registration

ome / ngff

Transformation types #101

Questions

Linear transforms

Other transforms

Linear transforms

displacement_field

position_field

inverse_of and bijection

landmark

Some possible changes

Basic example

Crop / cutout example

Multiscale

Non-linear registration

`displacement_field`

`position_field`

`inverse_of` and `bijection`

`landmark`