filippocastelli / pyometiff

OME-TIFF IO in python
GNU General Public License v3.0

OME-TIFF compatible RGB images - metadata issues #5

Open arkwave opened 2 years ago

arkwave commented 2 years ago

I have an RGB image with shape (15840, 28800, 3), i.e. axis order YXS. I'm attempting to generate the OME-XML for this file by passing in the following parameters. Note that this is after reshaping the array to dimension order SYX, as anything else raises an error with OMETIFFWriter:

metadata = {
    'Pixels BigEndian': False, 
    'DimensionOrdering': 'SYX', 
    'SizeY': 15840, 
    'SizeX': 28800, 
    'SizeC': 1, 
    'SizeZ': 1, 
    'SizeT': 1, 
    'PlaneCount': 1, 
    'PhysicalSizeX': 4.991469905942555e-05, 
    'PhysicalSizeXUnit': 'cm', 
    'PhysicalSizeY': 4.991469905942555e-05, 
    'PhysicalSizeYUnit': 'cm', 
    'Name': 'WholeSlideHnE', 
    'Type': 'uint8', 
    'Interleaved': True, 
    'Channels': {
        'RGB': {
            'SamplesPerPixel': 3, 
            'BitsPerSample': (8, 8, 8)
        }
    }
}

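from pyometiff import OMETIFFWriter

# `destination` is the output path; `pixels` is the image array in SYX order
params = {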
    "fpath"             : destination,
    "dimension_order"   : metadata['DimensionOrdering'],
    "array"             : pixels, 
    "metadata"          : metadata, 
    "explicit_tiffdata" : False,
    "photometric"       : "RGB"
}

xml = OMETIFFWriter(**params)._xml 

This outputs the following XML:

<OME xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
       xsi:schemaLocation="http://www.openmicroscopy.org/Schemas/OME/2016-06 http://www.openmicroscopy.org/Schemas/OME/2016-06/ome.xsd">
    <Image ID="Image:0" Name="WholeSlideHnE">
            <AcquisitionDate>2022-07-08T16:01:50.130775</AcquisitionDate>
                    <Pixels BigEndian="false" 
                            DimensionOrder="XYS" 
                            ID="Pixels:0" 
                            Interleaved="false" 
                            PhysicalSizeX="4.991469905942555e-05" 
                            PhysicalSizeXUnit="cm" 
                            PhysicalSizeY="4.991469905942555e-05" 
                            PhysicalSizeYUnit="cm" 
                            SizeC="1" 
                            SizeT="1" 
                            SizeX="3" 
                            SizeY="1" 
                            SizeZ="1" 
                            Type="uint8">
            <Channel ID="Channel:00" Name="RGB" SamplesPerPixel="3">
                <LightPath/>
          </Channel>
        <TiffData IFD="0" PlaneCount="1"/></Pixels>
            </Image>
</OME>

Questions:

  1. Why does the dimension order flip from SYX to XYS?
  2. Why are the SizeX, SizeY and Interleaved parameters not properly parsed?
  3. Is Interleaved = True even a correct value to be passing in when trying to write out an RGB image as one single channel? I understand it affects the order in which the bytes representing each pixel are stored in memory, but I'm unclear as to whether or not this makes any other difference (say with trying to load the written image into viewers, etc).
  4. Am I doing something wrong in general when trying to generate the metadata for an OME-TIFF compliant RGB file?

Any help would be much appreciated. Thank you!

EDIT:

It seems like it might not actually be possible to generate OME-XML compliant metadata for RGB images with this package. As seen here, the OMETIFFWriter object expects the Channels key within the passed-in metadata dictionary to contain as many keys as there are channels (inferred from the shape of the image). In the case of RGB images, however, we want only 1 channel with a SamplesPerPixel value of 3, rather than each colour sample being parsed as its own channel with SamplesPerPixel = 1.
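To make the target concrete, here is a minimal sketch of the metadata shape described above, built with the standard library only. It is not produced by pyometiff; the helper name `rgb_pixels_xml` is hypothetical, and setting SizeC to the number of samples is an assumption based on how other OME-TIFF writers appear to describe RGB data.

```python
import xml.etree.ElementTree as ET

OME_NS = "http://www.openmicroscopy.org/Schemas/OME/2016-06"


def rgb_pixels_xml(size_y, size_x, samples=3, name="WholeSlideHnE"):
    """Build a minimal OME-XML string for one RGB plane (hypothetical helper)."""
    ET.register_namespace("", OME_NS)  # serialize with a default namespace
    ome = ET.Element(f"{{{OME_NS}}}OME")
    image = ET.SubElement(ome, f"{{{OME_NS}}}Image", ID="Image:0", Name=name)
    pixels = ET.SubElement(
        image,
        f"{{{OME_NS}}}Pixels",
        ID="Pixels:0",
        DimensionOrder="XYCZT",
        Type="uint8",
        SizeX=str(size_x),
        SizeY=str(size_y),
        # Assumption: SizeC counts samples, so one 3-sample channel gives 3.
        SizeC=str(samples),
        SizeZ="1",
        SizeT="1",
        Interleaved="true",
    )
    # A single Channel element carrying all three samples.
    ET.SubElement(
        pixels,
        f"{{{OME_NS}}}Channel",
        ID="Channel:0:0",
        SamplesPerPixel=str(samples),
    )
    ET.SubElement(pixels, f"{{{OME_NS}}}TiffData", IFD="0", PlaneCount="1")
    return ET.tostring(ome, encoding="unicode")


xml = rgb_pixels_xml(15840, 28800)
print(xml)
```

The point of the sketch is only the structure: one Channel element with SamplesPerPixel="3", instead of three Channel elements with one sample each.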

filippocastelli commented 2 years ago

I'm very sorry for the delayed reply; I just couldn't allocate time earlier this week. I see the structural issue here. In the use case for which I originally made pyometiff, I never encountered this specific need (I always had one channel -> one sample), so I missed the generalization opportunity. Thank you for bringing it up! I don't think it's unfixable: it needs some restructuring of how Channels is parsed and how the image dimensionality is handled, but I think it's doable. I'll put some work into it asap, and I hope this issue doesn't delay your work too much.

arkwave commented 2 years ago

No worries at all, there's no rush! I managed to find a workaround for my own use case. I just think this is a great package that integrates a lot of nice features, so figured I'd point it out.

filippocastelli commented 2 years ago

Trying to answer your questions quickly. This is not a definitive answer; more will come.

1. Why does the dimension order flip from SYX to XYS?

pyometiff adjusts the available array dimensions in OMETIFFWriter._adjust_dims() to match a specific order. This might be limiting for some.
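As a rough illustration of what reordering array dimensions to a fixed order involves, here is a sketch using NumPy. It is not the code inside `OMETIFFWriter._adjust_dims()`; the helper `reorder_axes` is hypothetical and only shows the general technique of transposing by axis labels.

```python
import numpy as np


def reorder_axes(array, current_order, target_order):
    """Transpose `array` from `current_order` to `target_order` (hypothetical helper)."""
    if sorted(current_order) != sorted(target_order):
        raise ValueError("both orders must name the same axes")
    if len(current_order) != array.ndim:
        raise ValueError("current_order must have one letter per array axis")
    # For each axis of the target layout, find its position in the current layout.
    permutation = [current_order.index(axis) for axis in target_order]
    return np.transpose(array, permutation)


# Small stand-in for the (15840, 28800, 3) image from the question.
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
rgb[..., 0] = 255  # fill the red sample only

syx = reorder_axes(rgb, "YXS", "SYX")
print(syx.shape)  # (3, 4, 6)
```

Note that moving an axis is a transpose, not a `reshape`: `rgb.reshape(3, 4, 6)` yields the same shape but scrambles the samples, because it reinterprets the flat buffer instead of permuting axes.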

2. Why are the `SizeX`, `SizeY` and `Interleaved` parameters not properly parsed?

SizeX, SizeY, SizeZ, SizeT and SizeC are not parsed from metadata but are inferred from the image dimensions and the dimension ordering.
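The inference described above can be sketched as follows. This is a simplified illustration under my own assumptions, not pyometiff's implementation; the function `infer_sizes` is hypothetical, and the extra `SizeS` key is used here only to hold the sample axis.

```python
def infer_sizes(shape, dimension_order):
    """Map each axis letter to its length; absent axes default to 1 (hypothetical helper)."""
    if len(shape) != len(dimension_order):
        raise ValueError("shape and dimension_order must have the same length")
    sizes = {f"Size{axis}": 1 for axis in "XYZCT"}
    # Pair each axis letter with the length found at the same position in the shape.
    for axis, length in zip(dimension_order, shape):
        sizes[f"Size{axis}"] = length
    return sizes


# Array layout and declared order agree.
sizes = infer_sizes((3, 15840, 28800), "SYX")
print(sizes["SizeY"], sizes["SizeX"])

# Array layout and declared order disagree: the sizes come out wrong.
mismatched = infer_sizes((15840, 28800, 3), "SYX")
print(mismatched["SizeY"], mismatched["SizeX"])
```

With this kind of inference, any Size values passed in the metadata dictionary are irrelevant, and a mismatch between the array layout and the declared order shows up directly in the emitted sizes. The second call is only an illustration of that effect, not a diagnosis of the XML in the question.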

3. Is `Interleaved = True` even a correct value to be passing in when trying to write out an RGB image as one single channel? I understand it affects the order in which the bytes representing each pixel are stored in memory, but I'm unclear as to whether or not this makes any other difference (say with trying to load the written image into viewers, etc).

For now, pyometiff just defaults Interleaved to False, because I haven't yet implemented an interleaved output mode. This should be easy to implement and will come at some point in the future: it's just a matter of parsing the option and passing it as a valid planarconfig argument to tifffile.TiffWriter (which is what's being used under the hood).
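For readers unsure what the option changes on disk, here is a tiny pure-Python sketch of the two sample layouts. It does not call pyometiff or tifffile; it only shows the byte order that the TIFF planar configuration refers to (tifffile names the two modes `contig` and `separate`).

```python
# Two RGB pixels with distinct sample values so the layouts are easy to tell apart.
pixels = [(10, 20, 30), (40, 50, 60)]

# Interleaved ("contig"): all samples of one pixel are stored together, RGBRGB...
interleaved = bytes(sample for pixel in pixels for sample in pixel)

# Planar ("separate"): one full plane per sample, RR...GG...BB...
planar = bytes(pixel[s] for s in range(3) for pixel in pixels)

print(list(interleaved))  # [10, 20, 30, 40, 50, 60]
print(list(planar))       # [10, 40, 20, 50, 30, 60]
```

Both layouts hold the same pixel values, so a reader that honours the flag should decode the same image either way. The flag matters when the declared layout and the stored layout disagree.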

4. Am I doing something wrong in general when trying to generate the metadata for an OME-TIFF compliant RGB file?

I probably need to better document which metadata values are parsed and which are not.

pyometiff is still very young and in the 0.x phase; it's not a big project and probably won't see wide adoption for a while. Generalization takes a lot of time and effort that I'm unfortunately not able to provide upfront, so for now I'm just trying to improve it incrementally as use cases emerge. Your issues and feedback are very useful to this project and are contributing to its development, thank you.