imageio / imageio

Python library for reading and writing image data
https://imageio.readthedocs.io
BSD 2-Clause "Simplified" License
1.5k stars 295 forks

Standardized meta data fields #382

Closed almarklein closed 2 years ago

almarklein commented 6 years ago

See #362 for the meta-data overview issue.

Note: Originally the shape of the meta dict was part of this issue, but I moved that to #502.

The question: what fields to include as standard meta data (and how to name them)?

Lists and names up for discussion!

I'd rather start small and extend later. Pretty sure:

almarklein commented 6 years ago

An alternative might be to add fields in the root of the existing meta dict, and prefix them, like:

im.meta['$shape']   # let's pray that no image format prefixes meta attributes with dollar signs

almarklein commented 6 years ago

cc @lschr @jni @jakirkham

thewtex commented 6 years ago

Hi @almarklein, this is great :+1:

+1 for origin and a request for direction -- an orientation matrix is important, especially for medical images.

jni commented 6 years ago

@thewtex a great point, but can you point to a good user guide for this? I find it hard to keep this thing straight in my head, and have been lucky not to have had to deal with it in my research until now. =)

@almarklein

axes (but we always return zyx data, I think)

This is still important to convey in the metadata

Colorspace info (but we always return RGB, or don't we?)

In microscopy, it might be totally different.

Three methods were named

After I understood the third method, that became my favourite. It removes altogether the possibility of name clashes, is not too onerous for people who want access to the raw data, and iterating across either metadata dictionary becomes well-defined (no surprise fields).

almarklein commented 6 years ago

What does axes look like? A string being either zyx or xyz?

What does colorspace info look like, a tuple with descriptions like this?

info["channels"] = ("red", "green", "blue")

After I understood the third method, that became my favourite.

This would be that the main meta dict contains the standardized fields, plus one raw field for the original data. I like that approach the best as well, but unfortunately it would break backwards compatibility for imageio, which is why I'm leaning towards adding a new field to the existing metadata dict with a unique enough name.
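To make the two layouts under discussion concrete, here is a minimal sketch. The field names (`shape`, `spacing`), the `raw` key, the example raw fields, and both helper functions are assumptions made up for illustration; none of them is imageio's API.

```python
# Hypothetical raw metadata, as a format plugin might return it.
raw_meta = {"Software": "ExampleScope 1.0", "XResolution": 0.5}


def standardized_with_raw(raw, **standard):
    """Layout 1: standardized fields at the top level, plus one 'raw' field.

    Format-specific fields live in their own namespace, so they can never
    clash with a standardized name.
    """
    info = dict(standard)
    info["raw"] = dict(raw)
    return info


def prefixed_in_root(raw, prefix="$", **standard):
    """Layout 2: standardized fields added to the existing dict under a prefix.

    Existing keys keep working, but a clash is possible if a format
    happens to use the same prefix, so it is checked explicitly here.
    """
    meta = dict(raw)
    for key, value in standard.items():
        prefixed = prefix + key
        if prefixed in meta:
            raise KeyError(f"name clash on {prefixed!r}")
        meta[prefixed] = value
    return meta


info = standardized_with_raw(raw_meta, shape=(512, 512), spacing=(0.5, 0.5))
meta = prefixed_in_root(raw_meta, shape=(512, 512), spacing=(0.5, 0.5))
```

With the first layout, iterating over `info` yields only standardized names (plus `raw`); with the second, standardized and format-specific keys are interleaved in one dict.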

jni commented 6 years ago

@almarklein

what does axes look like?

A string which is a permutation of "tzyxc". For example, Leica microscopes save their files in acquisition order, which results in "tzcyx".

what does colorspace info look like

Yeah, that's ok. We can leave this a bit more freeform. Multispectral imaging might have a tuple of wavelengths instead of colour names. In microscopy, people typically save the emission wavelength.
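As a rough sketch of how such an axes string could be used: the helper below turns a stored order such as "tzcyx" into the index tuple that reorders it to "tzyxc". The function names and the choice of "tzyxc" as the target order are assumptions for illustration only. The resulting tuple has the form that `numpy.transpose` accepts as its `axes` argument.

```python
CANONICAL_AXES = "tzyxc"  # assumed target order for this sketch


def axes_permutation(stored, target=CANONICAL_AXES):
    """Return the index tuple that reorders `stored` axes into `target` order.

    Entry i of the result is the position in `stored` of the axis that
    should end up at position i.
    """
    if sorted(stored) != sorted(target):
        raise ValueError(f"{stored!r} is not a permutation of {target!r}")
    return tuple(stored.index(axis) for axis in target)


def reorder_shape(shape, perm):
    """Apply the permutation to a shape tuple (what a transpose does to shapes)."""
    return tuple(shape[i] for i in perm)


# A file stored in acquisition order "tzcyx": 10 frames, 5 slices, 3 channels.
perm = axes_permutation("tzcyx")
shape = reorder_shape((10, 5, 3, 256, 512), perm)
```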

unfortunately it would break backwards compatibility for imageio

GAH. Well, this is always a hard choice, but given my thus-far light dependency on imageio, I'd still favour ripping the bandaid hard. =P You could deprecate it quite effectively by turning .meta into a property instead of a dictionary on the Image object.

almarklein commented 6 years ago

by turning .meta into a property

What do you mean? .meta is now a (dict) property of the Image object.

thewtex commented 6 years ago

The image direction is defined to be the direction cosine matrix that defines an image's orientation about its origin. Or, from the ITK Software Guide:

The image direction matrix represents the orientation relationships between the image samples and physical space coordinate systems. The image direction matrix is an orthonormal matrix that describes the possible permutation of image index values and the rotational aspects that are needed to properly reconcile image index organization with physical space axis. The image directions is a NxN matrix where N is the dimension of the image. An identity image direction indicates that increasing values of the 1st, 2nd, 3rd index element corresponds to increasing values of the 1st, 2nd and 3rd physical space axis respectively, and that the voxel samples are perfectly aligned with the physical space axis.

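For a given index $I_{3\times 1}$, the physical location $P_{3\times 1}$ is calculated as follows:

$$P_{3\times 1} = O_{3\times 1} + D_{3\times 3} \, S_{3\times 3} \, I_{3\times 1}$$

where $O$ is the image origin, $D$ is an orthonormal direction cosines matrix and $S$ is the image spacing diagonal matrix.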

In medical imaging, it is common to roughly align the data collection axes with the patient, i.e. superior-inferior, anterior-posterior, left-right. The direction matrix is also often used to flip or rotate the image content appropriately.

The direction, like the spacing and origin, is critical for operations like registration.

Since the pixel collection axes are not necessarily aligned with the world coordinate axes, pixel axes labeling with ijk as opposed to xyz can help avoid confusion. I know xyz is conventional in microscopy and ImageJ, but the pixel versus world coordinate system distinction is quite helpful to avoid confusion.
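The index-to-physical mapping described above can be sketched in a few lines of plain Python. This assumes the relation P = O + D·S·I, with O the origin, D the direction matrix and S the diagonal spacing matrix; the function name and the example values are made up for illustration.

```python
def index_to_physical(index, origin, spacing, direction):
    """Map a pixel index (ijk) to a physical point, assuming P = O + D @ S @ I."""
    # S is diagonal, so S @ I is an element-wise product.
    scaled = [s * i for s, i in zip(spacing, index)]
    # Add one row of D @ (S @ I) to the matching origin component.
    return tuple(
        o + sum(d * v for d, v in zip(row, scaled))
        for o, row in zip(origin, direction)
    )


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
flip_first = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]  # first physical axis reversed

origin = (10.0, 20.0, 30.0)
spacing = (0.5, 0.5, 2.0)

p_identity = index_to_physical((2, 3, 4), origin, spacing, identity)
p_flipped = index_to_physical((2, 3, 4), origin, spacing, flip_first)
```

With the identity direction the index axes line up with the physical axes; with `flip_first`, increasing the first index moves the point in the negative direction along the first physical axis.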

jni commented 6 years ago

@almarklein ah, right, I didn't realise it was already a property instead of just a straight dict. Well, then it's quite easy to deprecate, no? Some variant of the below:

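from warnings import warn

@property
def meta(self):
    # Warn only while the caller still relies on the old (raw) layout.
    if self._old_meta_warning:
        # Adjacent string literals are concatenated into a single message;
        # a comma between them would pass the last one as the category.
        warn("Array.meta will use standardized fields from version 3.0. "
             "Use Array.meta['raw'] to access the raw data fields. "
             "See [corresponding docs url] for more info.",
             DeprecationWarning, stacklevel=2)
    return self._meta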

Then _old_meta_warning could be set by some global config file, or at startup, or at import, so that people using the new syntax could silence the warning.
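A self-contained sketch of that opt-out switch might look as follows. `SketchArray` is a hypothetical stand-in and not imageio's `Array` class; the class-level flag, the message text, and the way the flag is flipped are all assumptions for illustration.

```python
import warnings


class SketchArray:
    """Hypothetical stand-in for an image object with a .meta property."""

    # Class-level switch; could be set from a config file or at import time.
    _old_meta_warning = True

    def __init__(self, meta):
        self._meta = dict(meta)

    @property
    def meta(self):
        if self._old_meta_warning:
            warnings.warn(
                "meta will use standardized fields in a future version.",
                DeprecationWarning,
                stacklevel=2,
            )
        return self._meta


im = SketchArray({"Software": "ExampleScope 1.0"})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")          # make sure nothing is filtered out
    im.meta                                  # warns while the flag is set
    SketchArray._old_meta_warning = False    # caller opts in to the new layout
    im.meta                                  # silent from here on

n_warnings = len(caught)
```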

@thewtex

Since the pixel collection axes are not necessarily aligned with the world coordinate axes, pixel axes labeling with ijk as opposed to xyz can help avoid confusion. I know xyz is conventional in microscopy and ImageJ, but the pixel versus world coordinate system distinction is quite helpful to avoid confusion.

This is great.

And thanks for the orientation explanation, very useful. I agree that we could include it and then just have it be the identity matrix for image formats where it does not make sense.

almarklein commented 6 years ago

@jni, maybe we can do something like this:

@property
def meta(self):
    """A dict containing the raw meta-data, as it always was."""
    ...

@property
def info(self):
    """A dict containing standardized meta-data, including a field `raw` which corresponds to self.meta."""
    ...

This way there should be no deprecation issues at all, and the info dict would match skimage's approach (if that's what skimage will adopt).
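Fleshed out slightly, the two-property idea could look like the sketch below. The class name `SketchImage`, its constructor, and the example fields are assumptions for illustration; only the `meta`/`info` split and the `raw` field come from the proposal above.

```python
class SketchImage:
    """Hypothetical image wrapper exposing both raw and standardized meta-data."""

    def __init__(self, raw_meta, standardized):
        self._meta = dict(raw_meta)
        self._standardized = dict(standardized)

    @property
    def meta(self):
        """Raw, format-specific meta-data, unchanged for existing callers."""
        return self._meta

    @property
    def info(self):
        """Standardized meta-data, plus a `raw` field that is self.meta."""
        info = dict(self._standardized)
        info["raw"] = self._meta
        return info


im = SketchImage(
    raw_meta={"XResolution": 0.5},
    standardized={"spacing": (0.5, 0.5), "axes": "yx"},
)
```

Existing code that reads `im.meta` sees exactly what it saw before, while new code can use `im.info` and still reach the raw fields through `im.info["raw"]`.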

jni commented 6 years ago

Works for me! =)

GenevieveBuckley commented 6 years ago

Image metadata fields I use most day-to-day include:

(Note that some types of transformations can change the effective voxel size. I am not sure how best to handle this type of situation in terms of metadata.)

jni commented 6 years ago

@GenevieveBuckley

it's unclear to me if this is the same thing as "spacing"

I used to favour "voxel size" as the terminology until I read this paper. "spacing" makes more sense with the "point sample" model of pixels, and indeed it generalises to frame intervals (though perhaps those are best kept separate anyway).

almarklein commented 6 years ago

I've also used/seen the term sampling, but I think spacing is more common for images.

almarklein commented 4 years ago

@jni I'm starting to get the impression that the whole point of this meta data is to be able to position and orient the image data in 3D space (or space-time).

The one exception is colorspace (I opened #504 to discuss that).

FirefoxMetzger commented 2 years ago

closed via #739