delmic / odemis

Open Delmic Microscope Software
GNU General Public License v2.0

Distinguishing whether a group in hdf5 is image, spectrum etc. #350

Closed thomasaarholt closed 7 years ago

thomasaarholt commented 7 years ago

I'm writing the hyperspy h5 reader for the odemis h5 format now, and I'm trying to distinguish the different types of data present in the groups within an h5 file.

In a typical odemis h5 file, the structure is something like this:

testfile.h5

  1. Acquisition1 (low pixel SE image)
    1. ImageData
      1. Image
      2. ...
    2. PhysicalData
    3. SVIData
  2. Acquisition2 (high-resolution SE image)
    1. ...
  3. Acquisition3 (CL spectrum image (hyperspectral image))
    1. ...

I notice that the "Image" dataset can contain a spectrum or an image, and that the datasets have 5 dimensions by default. For instance, 1 x 1 x 1 x 480 x 640 for an image. But for a spectrum image I see something like 512 x 1 x 1 x 8 x 12 (corrected; I first wrote 8 x 12 x 1 x 1 x 512 by mistake).

What is the best way to determine what type of data is in each Acquisition-group?

Currently I'm thinking of just checking the shape of the matrix and determining it from that (an image always has the first three entries of the shape = 1).

pieleric commented 7 years ago

Yes, indeed there is no metadata explicitly indicating which type of acquisition each dataset corresponds to. To distinguish between the data, the recommended way is to use the shape and basic metadata. You can have a look at https://github.com/delmic/odemis/blob/master/src/odemis/util/dataio.py to see how it's done within Odemis.

The shape is (almost always) CTZYX. So for a spectrum cube, you'd expect the first dimension and the last two dimensions to be >1. Something like 8 x 1 x 1 x 12 x 512 (I guess that's what you have in your example, and you just made a mistake when copying the shape). You could also double-check by looking for the "DimensionScaleC" metadata and verifying that its length matches the C dimension of the shape.
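As a rough illustration of the shape check described above, here is a minimal Python sketch. It is not the code Odemis uses in dataio.py; the function name `classify_shape`, its `c_scale_len` parameter, and the returned labels are hypothetical, and it assumes the shape really is in CTZYX order. Cases it does not recognise (for example data with T or Z larger than 1) are reported as "unknown" rather than guessed.

```python
def classify_shape(shape, c_scale_len=None):
    """Guess the acquisition type from a 5-D CTZYX shape (heuristic only).

    c_scale_len is the optional length of the "DimensionScaleC" data,
    used as a cross-check against the C dimension.
    """
    if len(shape) != 5:
        return "unknown"
    c, t, z, y, x = shape
    if c_scale_len is not None and c_scale_len != c:
        # The C scale does not match the C axis, so the guess is not trusted.
        return "unknown"
    if c == 1 and t == 1 and z == 1:
        # First three entries are 1: a plain 2-D image.
        return "image"
    if c > 1 and t == 1 and z == 1 and y > 1 and x > 1:
        # First dimension and last two dimensions > 1: a spectrum cube.
        return "spectrum image"
    return "unknown"


print(classify_shape((1, 1, 1, 480, 640)))                  # image
print(classify_shape((512, 1, 1, 8, 12), c_scale_len=512))  # spectrum image
```

With h5py, the shape would come from the `.shape` attribute of the "Image" dataset in each Acquisition group; where exactly "DimensionScaleC" sits inside the group is not stated in this thread, so that lookup is left out here.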

thomasaarholt commented 7 years ago

Yes, you're quite right, that was a typo. Thanks!