IDR / bioformats

Bio-Formats is a Java library for reading and writing data in life sciences image file formats. It is developed by the Open Microscopy Environment (particularly UW-Madison LOCI and Glencoe Software). Bio-Formats is released under the GNU General Public License (GPL); commercial licenses are available from Glencoe Software.
http://www.openmicroscopy.org/site/products/bio-formats
GNU General Public License v2.0

Operetta: fix metadata files logic to skip plate folders #22

Closed sbesson closed 4 years ago

sbesson commented 4 years ago

Rebased from https://github.com/ome/bioformats/pull/3583

This commit handles a scenario from an IDR submission in which several plate folders, each containing the TIFF data and the metadata XML files directly (without the usual Images subdirectory), sit side by side at the same directory level. The reader's current assumptions cause all other plates to be added to the fileset as ancillary metadata files, increasing the size of the original fileset from 7K to 55K and slowing down the import further down the line.

The new logic adds early directory checks when adding metadata files to skip either the directory containing the master XML file or other directories containing XML files.
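For illustration, the two directory checks can be sketched roughly as follows. This is a minimal, standalone sketch using `java.nio.file`, not the actual `OperettaReader` code: the class `MetadataFileCollector` and the methods `collect` and `containsXml` are hypothetical names, and the assumption that ancillary metadata lives in sibling folders one level above the master XML file is a simplification.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Bio-Formats.
final class MetadataFileCollector {

  /** Returns true if the directory directly contains at least one *.xml entry. */
  static boolean containsXml(Path dir) throws IOException {
    // Glob case sensitivity depends on the platform's file system.
    try (DirectoryStream<Path> xml = Files.newDirectoryStream(dir, "*.xml")) {
      return xml.iterator().hasNext();
    }
  }

  /**
   * Collects ancillary files from the sibling folders of the folder that
   * holds the master XML file, skipping folders that look like other plates.
   */
  static List<Path> collect(Path masterXml) throws IOException {
    Path plateDir = masterXml.toAbsolutePath().normalize().getParent();
    Path parent = plateDir.getParent();
    List<Path> metadataFiles = new ArrayList<>();
    if (parent == null) {
      return metadataFiles;
    }
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(parent)) {
      for (Path entry : entries) {
        if (!Files.isDirectory(entry)) {
          continue;
        }
        // Early check 1: skip the folder containing the master XML file.
        if (Files.isSameFile(entry, plateDir)) {
          continue;
        }
        // Early check 2: skip folders with their own XML files (assumed to be other plates).
        if (containsXml(entry)) {
          continue;
        }
        // Remaining folders are treated as ancillary metadata folders.
        try (DirectoryStream<Path> files = Files.newDirectoryStream(entry)) {
          for (Path file : files) {
            if (Files.isRegularFile(file)) {
              metadataFiles.add(file);
            }
          }
        }
      }
    }
    return metadataFiles;
  }
}
```

With a layout such as `plateA/Index.idx.xml`, `plateB/Index.idx.xml` and `ancillary/layout.txt` under one parent folder (file names chosen for the example), calling `collect` on the XML file in `plateA` would only pick up `ancillary/layout.txt`; the contents of `plateB` are never walked.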

To be tested in the context of the idr0078 submission.

sbesson commented 4 years ago

The daily CI builds have confirmed that this does not cause any regressions on existing Operetta/Harmony filesets. Merging to test the import workflow on idr0078/screenB.