OME-TIFF Reader overwriting IDS

dgault commented 3 years ago

Issue was raised on imagesc thread: https://forum.image.sc/t/different-custom-metadata-parsing-bioformat-using-fiji-knime-command-line/51704

A sample file has been provided, but the issue can also be easily reproduced by changing the IDs in existing sample files. Tested with version 6.6.1.

The issue appears to be the call to MetadataTools.populatePixels within the OMETiffReader, which overwrites the existing IDs set on the metadata store.

imagesc-bot commented 3 years ago

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/different-custom-metadata-parsing-bioformat-using-fiji-knime-command-line/51704/2

sbesson commented 2 years ago

This issue is also a duplicate of the problem originally described in https://trac.openmicroscopy.org/ome/ticket/13270.

From a recent investigation, this overwriting behavior applies to all readers that read OME-XML and convert it into the Metadata API, including OMEXMLReader. The MetadataTools helper methods overwrite the IDs of three elements:

One implication is that the IDs of the generated OME-XML are always compliant and follow the convention of being named after the series ID. As demonstrated in the Trac ticket linked above, if the element IDs are used elsewhere e.g. as ImageRef in a Plate/Well/WellSample hierarchy, the references are not updated and this creates a broken metadata representation.
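To make the broken-reference case concrete, here is a minimal sketch in plain Java. It does not use the Bio-Formats API: the class `DanglingRefSketch`, its methods, and the sample IDs are all hypothetical, and the `"Image:" + series` naming is only an assumption standing in for the series-based convention described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DanglingRefSketch {

  /** Hypothetical stand-in for the overwrite: every Image ID is rebuilt from its series index. */
  static List<String> regenerateIds(List<String> imageIds) {
    List<String> regenerated = new ArrayList<>();
    for (int series = 0; series < imageIds.size(); series++) {
      regenerated.add("Image:" + series);
    }
    return regenerated;
  }

  /** Returns every ImageRef value that no longer matches any Image ID. */
  static List<String> danglingRefs(List<String> imageIds, List<String> imageRefs) {
    List<String> dangling = new ArrayList<>();
    for (String ref : imageRefs) {
      if (!imageIds.contains(ref)) {
        dangling.add(ref);
      }
    }
    return dangling;
  }

  public static void main(String[] args) {
    // IDs as written in the original OME-XML (made-up sample values).
    List<String> originalIds = Arrays.asList("Image:10", "Image:11");
    // WellSample ImageRef values pointing at those IDs; these are not rewritten.
    List<String> imageRefs = Arrays.asList("Image:10", "Image:11");

    System.out.println(danglingRefs(originalIds, imageRefs));                // []
    System.out.println(danglingRefs(regenerateIds(originalIds), imageRefs)); // [Image:10, Image:11]
  }
}
```

Once the Image IDs are regenerated, both references point at IDs that no longer exist, which is the broken Plate/Well/WellSample representation described in the Trac ticket.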

From a former discussion with the @ome/formats team, @melissalinkert raised the point that overwriting IDs might still be necessary for some examples in our curated repository, e.g. when the original XML is invalid and the current reader behavior attempts to correct some of these issues. Unilaterally removing this behavior might also cause regressions and make existing data unreadable.

As an intermediate solution, my current proposal would be to make the decision based on the validity of the original OME-XML as follows:

parse the OME-XML as done currently in the reader and validate it
if the OME-XML is invalid, use the existing MetadataTools implementation
if the OME-XML is valid, add a new backwards-compatible API to MetadataTools that allows skipping the ID overwriting and using the existing values stored via MetadataConverter instead
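The decision flow above could be sketched roughly as follows. This is plain Java for illustration only, not the proposed MetadataTools API: `IdPolicySketch`, `IdPolicy`, `choosePolicy`, and `resolveImageIds` are hypothetical names, and the validator is passed in as a predicate because the actual validation step is whatever the reader already performs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class IdPolicySketch {

  /** Hypothetical switch between the existing behavior and the proposed one. */
  enum IdPolicy { OVERWRITE, PRESERVE }

  /** Valid OME-XML keeps its IDs; invalid OME-XML falls back to the existing overwrite. */
  static IdPolicy choosePolicy(String omeXml, Predicate<String> isValid) {
    return isValid.test(omeXml) ? IdPolicy.PRESERVE : IdPolicy.OVERWRITE;
  }

  /** Applies the chosen policy to the Image IDs parsed from the original OME-XML. */
  static List<String> resolveImageIds(List<String> parsedIds, IdPolicy policy) {
    List<String> resolved = new ArrayList<>();
    for (int series = 0; series < parsedIds.size(); series++) {
      if (policy == IdPolicy.PRESERVE) {
        resolved.add(parsedIds.get(series)); // keep the value from the file
      } else {
        resolved.add("Image:" + series);     // assumed series-based naming
      }
    }
    return resolved;
  }

  public static void main(String[] args) {
    List<String> parsedIds = Arrays.asList("Image:10", "Image:11");

    // Placeholder validators; a real reader would validate against the OME schema.
    IdPolicy whenValid = choosePolicy("<OME/>", xml -> true);
    IdPolicy whenInvalid = choosePolicy("<OME/>", xml -> false);

    System.out.println(resolveImageIds(parsedIds, whenValid));   // [Image:10, Image:11]
    System.out.println(resolveImageIds(parsedIds, whenInvalid)); // [Image:0, Image:1]
  }
}
```

Keeping the overwrite as the fallback for invalid input is what preserves the current behavior for the curated files mentioned above, while valid files retain IDs that other elements may reference.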

As a starting point, https://github.com/sbesson/bioformats/commit/88746ed2b8e2b1ded25d5f2ff52cca6e39a62f9e adds a few unit tests capturing the current behavior.

ome / bioformats

OME-TIFF Reader overwriting IDS #3685