ome / bioformats

Bio-Formats is a Java library for reading and writing data in life sciences image file formats. It is developed by the Open Microscopy Environment. Bio-Formats is released under the GNU General Public License (GPL); commercial licenses are available from Glencoe Software.
https://www.openmicroscopy.org/bio-formats
GNU General Public License v2.0

Bug: Invalid z position in getIndex #3571

Closed Nicholas-Schaub closed 4 years ago

Nicholas-Schaub commented 4 years ago

Summary I'm receiving an error when trying to directly read an image plane by specifying the z-position, but when I directly set the index I receive no error and can successfully read the plane.

Image Information I'm working with an .ome.tif that only has XYZ data (no channels, timepoints, pyramids, etc). The image is stored using the tile settings (tile width and height are both 1024).

Additional Information All of my work is currently done with python-bioformats, using the ImageReader class. The issue traces back to the FormatWriter in the loci_tools jar (as discussed below). I believe this is an issue with newer versions of loci_tools.jar: when I swap in an older version of the jar, the error no longer appears.

When I get the number of z-positions from the metadata, it correctly returns the number of z-positions. However, when I try to read the image and specify a z-position other than 0, I get an error that traces back to FormatWriter.getIndex that says the z value is out of range. If I manually calculate the index and use setIndex then read the plane, it gives me the expected plane.
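The manual index calculation mentioned above can be sketched in plain Python. This is a minimal sketch, assuming the usual Bio-Formats convention of a five-letter dimension order that starts with XY (for example XYCZT). The function name `plane_index` is hypothetical, and this is not the library's actual implementation.

```python
def plane_index(order, size_z, size_c, size_t, z, c, t):
    """Return the rasterized plane index for the coordinates (z, c, t).

    Assumes `order` is a five-letter dimension order beginning with "XY",
    e.g. "XYCZT". Hypothetical helper, not part of python-bioformats.
    """
    order = order.upper()
    if len(order) != 5 or not order.startswith("XY") or set(order[2:]) != set("ZCT"):
        raise ValueError("Invalid dimension order: %r" % order)

    sizes = {"Z": size_z, "C": size_c, "T": size_t}
    coords = {"Z": z, "C": c, "T": t}
    for axis in "ZCT":
        if sizes[axis] <= 0:
            raise ValueError("Invalid %s size: %d" % (axis, sizes[axis]))
        if not 0 <= coords[axis] < sizes[axis]:
            raise ValueError(
                "Invalid %s index: %d/%d" % (axis, coords[axis], sizes[axis])
            )

    # Walk the non-XY axes from fastest-varying to slowest-varying.
    index = 0
    stride = 1
    for axis in order[2:]:
        index += coords[axis] * stride
        stride *= sizes[axis]
    return index
```

For an XYZ-only image with 10 z-sections, `plane_index("XYCZT", 10, 1, 1, 3, 0, 0)` returns 3. If the z size passed in is 1, any z other than 0 raises a `ValueError`, which is the kind of range check that produces an "out of range" message.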

mtbc commented 4 years ago

Could you give us a clear, simple example that shows the problem? Go ahead and attach an example image here if you like, link to a script in a gist, etc. We'd be happy to try to figure out what's going on.

Did you manage to pin down which BF version introduces the regression? That may help us to pinpoint the culprit.

Nicholas-Schaub commented 4 years ago

I could give you error traces if that would be helpful. While I have used Bioformats in Java in the past, I mainly interact with it through Python, so I'll be sharing what I do in Python.

The version of loci_tools.jar used in python-bioformats is from Bioformats 5.9.0: https://github.com/CellProfiler/python-bioformats/tree/master/bioformats/jars

For various reasons, I started using loci_tools.jar from Bioformats 6.5.0, and I have a link to the jar I use in my repo. I built a tool we use in our organization for reading/writing all images in a cloud computing platform we are building: https://github.com/Nicholas-Schaub/polus-plugins/tree/bfio/document/utils/polus-bfio-util

To use Bioformats 6.5.0, you have to manually copy the loci_tools.jar into the bioformats Python package after installing with pip install python-bioformats.

Then, the following code should reproduce the error:

import bioformats
import javabridge as jutil

jutil.start_vm(class_path=bioformats.JARS)

file_path = '/path/to/3d/file'

try:
    with bioformats.ImageReader(file_path) as reader:
        # This should work
        I = reader.read(c=0,z=0,t=0,rescale=False)

        # This should fail
        J = reader.read(c=0,z=1,t=0,rescale=False)
finally:
    jutil.kill_vm()

When you do a fresh install of javabridge and python-bioformats (which uses Bioformats 5.9.0), the above code should run without any issues. Once you start using the newer version of loci_tools.jar, the above code should throw an error. I could provide code to show how to select different versions of loci_tools.jar if that would be helpful.
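Selecting a different jar could look roughly like the following. This is a minimal sketch, assuming `bioformats.JARS` is a plain list of jar paths and that the bundled Bio-Formats jar is named `loci_tools.jar`. The helper name `swap_jar` and the example path in the usage note are hypothetical.

```python
import os


def swap_jar(jars, replacement, jar_name="loci_tools.jar"):
    """Return a copy of `jars` with the bundled jar replaced by `replacement`.

    `jars` is assumed to be a list of jar paths such as bioformats.JARS.
    Hypothetical helper, not part of python-bioformats.
    """
    swapped = [
        replacement if os.path.basename(path) == jar_name else path
        for path in jars
    ]
    if replacement not in swapped:
        # The bundled jar was not found, so append the replacement to make
        # sure it still ends up on the class path.
        swapped.append(replacement)
    return swapped
```

The result can then be passed to the JVM in place of the default list, for example `jutil.start_vm(class_path=swap_jar(bioformats.JARS, '/path/to/6.5.0/loci_tools.jar'))`. This avoids copying the jar into the installed package by hand.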

Nicholas-Schaub commented 4 years ago

I did some more homework on this. There seems to have been some change in how most TIFFs are read between the loci_tools.jar released with version 6.1.1 and all versions that came after it. I get the same error whenever I use a loci_tools.jar released with a Bioformats version later than 6.1.1, and it doesn't seem to matter whether the file is a standard TIFF or an .ome.tif.

Doing a little bit of homework, it appears as though there was a change in how tiffs might be processed between 6.1.0 and 6.2.0 in the TiffParser: https://github.com/ome/bioformats/compare/v6.1.0...v6.2.0

I don't know if this could account for the discrepancy between the metadata showing there being more z-positions than TiffReader.getSizeZ shows when using newer versions of Bioformats.

sbesson commented 4 years ago

@Nicholas-Schaub thanks for bisecting the Bio-Formats release where the reading behavior has changed on your side. Are you able to share a sample TIFF file allowing us to reproduce the issue?

Nicholas-Schaub commented 4 years ago

@sbesson Yes. Is it okay to send it to your profile email?

sbesson commented 4 years ago

@Nicholas-Schaub can you try uploading the data to http://qa.openmicroscopy.org.uk/qa/upload/? If that does not work or if the file is >2GB, we can provide you with private FTP details for the transfer.

Nicholas-Schaub commented 4 years ago

@sbesson Done. I uploaded a file with a .tif extension and another file with an .ome.tif extension. Both are 3D, and both throw the error on newer versions of Bioformats.

sbesson commented 4 years ago

Thanks @Nicholas-Schaub, I have initially validated both files you uploaded to us using two standard Bio-Formats tools: the command-line showinf utility and the Fiji/ImageJ plugin. In both cases, I was able to successfully load the image for all the Z-sections (see screenshots below).

[Screenshots: Screen Shot 2020-06-16 at 21 20 20; Screen Shot 2020-06-16 at 21 19 39]

The absence of errors means these utilities are not reproducing the series of API calls made by the python-bioformats wrapper. I will have a go at recreating the scenario described in https://github.com/ome/bioformats/issues/3571#issuecomment-643281905 so that we can track what exactly leads to the getIndex exception.

Nicholas-Schaub commented 4 years ago

If it would be helpful, I can write a piece of code to reproduce the error using different versions of loci_tools. I also had no issues opening the files in ImageJ/Fiji, which made me think the problem was in the underlying tool or in my code. I have thus far ruled out my code, and swapping in different versions of the jar led to the error, so I concluded it was loci_tools.

sbesson commented 4 years ago

Hi @Nicholas-Schaub, I had a more thorough look at python-bioformats using Bio-Formats 6.5.0 and I think I identified the source of the problem. In a nutshell, the issue is at the level of the python-bioformats wrapper, as the incorrect reader is picked for the dataset.

When consumers want to automatically detect the appropriate reader, the Bio-Formats recommendation is to use the loci.formats.ImageReader API. This is the standard paradigm used by ImageJ/Fiji and other consumers, and it is tested against our curated QA repository of all supported file formats for each release of the software.

For python-bioformats, the bioformats.ImageReader class does not use this class but implements its own detection logic. The relevant code is here and the main difference is that IFormatReader.isThisType(name, open) is invoked with open set to false, i.e. opening files is disallowed. In the case of your dataset, MikroscanTiffReader (introduced in Bio-Formats 6.2.0, which explains your bisection) is selected by python-bioformats, while this reader is rejected by loci.formats.ImageReader.
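The effect of the open flag on reader selection can be illustrated with a toy model in plain Python. This is only a sketch of the general mechanism: the classes `ToyVendorTiffReader` and `ToyGenericTiffReader`, the `VENDOR` marker, and the `pick_reader` helper are all hypothetical, and the assumption that a vendor reader accepts a file by name alone when it may not open it is made for illustration, not taken from the MikroscanTiffReader source.

```python
class ToyGenericTiffReader:
    """Stand-in for a general-purpose TIFF reader (hypothetical)."""

    def is_this_type(self, name, allow_open, contents=b""):
        # The generic reader accepts anything with a TIFF extension.
        return name.lower().endswith((".tif", ".tiff"))


class ToyVendorTiffReader:
    """Stand-in for a vendor-specific TIFF reader (hypothetical)."""

    MARKER = b"VENDOR"

    def is_this_type(self, name, allow_open, contents=b""):
        if not name.lower().endswith((".tif", ".tiff")):
            return False
        if not allow_open:
            # Assumption: without access to the file, guess from the name.
            return True
        # With access to the file, require the vendor marker.
        return self.MARKER in contents


def pick_reader(readers, name, allow_open, contents=b""):
    """Return the class name of the first reader that claims the file."""
    for reader in readers:
        if reader.is_this_type(name, allow_open, contents):
            return type(reader).__name__
    return None


# Vendor-specific readers are tried before the generic one.
readers = [ToyVendorTiffReader(), ToyGenericTiffReader()]
```

With these toy classes, `pick_reader(readers, "stack.ome.tif", False)` selects the vendor reader, while `pick_reader(readers, "stack.ome.tif", True, b"plain tiff data")` falls through to the generic one. The same file ends up with a different reader depending only on whether detection is allowed to open it.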

From the Bio-Formats perspective, we can probably look into the impact of updating MikroscanTiffReader to fail early if open is set to false, which would fix this usage. However, the divergence between bioformats.ImageReader and loci.formats.ImageReader is a deeper issue which will undoubtedly result in other mismatches. It might be worth opening an issue on the python-bioformats repository to ask the CellProfiler team, which maintains this library, about their position and whether there is an API with the same signature as the Bio-Formats ImageReader.

Nicholas-Schaub commented 4 years ago

That is awesome. Thank you for the quick response. Now that I know what the issue is, I am sure I can write a workaround for it in our code since I've already had to write workarounds for other aspects of python-bioformats. I will definitely follow up with the python-bioformats people.