ome / bioformats

Bio-Formats is a Java library for reading and writing data in life sciences image file formats. It is developed by the Open Microscopy Environment. Bio-Formats is released under the GNU General Public License (GPL); commercial licenses are available from Glencoe Software.
https://www.openmicroscopy.org/bio-formats
GNU General Public License v2.0

bioformats misinterprets some nd files as HCS data if the stage labels match [a-z]\d+ #4093

Open anntzer opened 1 year ago

anntzer commented 1 year ago

Consider the test nd dataset provided at https://github.com/ome/bioformats/issues/4069#issue-1839504482. Fiji's Bio-Formats importer opens it correctly, and inspecting it with e.g. showinf reports Series count = 2 and Image count = 4 for both series, as expected.

However, if one opens test.nd and edits the position names from "position_a" and "position_b" to e.g. "f1" and "f2", the dataset is no longer loaded correctly by Fiji (in particular, some of the tiff files seem not to show up at all in the final dataset), and showinf reports Series count = 2 but Image count = 1, together with an "RGB mismatch" warning.

As far as I can see, this occurs because the position names match WELL_COORDS ("\\b([a-z])(\\d+)"), which ultimately leads to the isHCS = true branch being taken. I understand that some degree of guessing based on filenames and position names is needed, but I would suggest that the isHCS = true branch could be skipped if 1) it leads to ignoring timepoints that are specified by the NTimePoints entry in the .nd file (and that are actually present), and/or 2) it leads to RGB-mismatched data (whereas the non-HCS interpretation works fine).
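To illustrate the match described above, here is a minimal standalone sketch. It is not the actual Bio-Formats reader code: the class name WellCoordsDemo and the helper looksLikeWell are made up for illustration, and the use of find() with no extra regex flags is an assumption about how the pattern is applied. Only the regex string itself is taken from the source.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex string comes from the issue text.
class WellCoordsDemo {
    // Pattern as quoted above. No flags are set here (assumption).
    static final Pattern WELL_COORDS = Pattern.compile("\\b([a-z])(\\d+)");

    // Hypothetical helper: true if the stage name contains a letter
    // immediately followed by digits, starting at a word boundary.
    // Using find() rather than matches() is an assumption.
    static boolean looksLikeWell(String stageName) {
        return WELL_COORDS.matcher(stageName).find();
    }

    public static void main(String[] args) {
        String[] names = {"position_a", "position_b", "f1", "f2"};
        for (String name : names) {
            System.out.println(name + " -> " + looksLikeWell(name));
        }
    }
}
```

Under these assumptions, "f1" and "f2" match while "position_a" and "position_b" do not, which is consistent with the renamed positions being the ones that end up treated as well coordinates.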

dgault commented 1 year ago

Thank you @anntzer for opening the issue and investigating the root cause; that is extremely helpful. This is perhaps something we can look at addressing alongside https://github.com/ome/bioformats/issues/4069.