CellProfiler / python-bioformats

Read and write life sciences file formats

Incorrect namespace detection leads to incomplete/incorrect metadata readout #78

Open FannyGeorgi opened 7 years ago

FannyGeorgi commented 7 years ago

Hi everyone!

First of all: Great and very much appreciated bioformats wrapper!

I encountered the following issue:

My aim is to filter my images by channel. The images were acquired on a Molecular Devices microscope, and the channel information is stored in the OME-XML as Pixels.Channel.Name = "TL20". Reading the files with Bio-Formats in Fiji works perfectly fine and uses the 2016-06 namespace.

When I use python-bioformats as follows:

    import javabridge
    import bioformats
    path = r"path\\to\\20160130-corning-all-spheroids-p2-095hps_A01_w1.TIF"
    omeXml = bioformats.get_omexml_metadata(path)

the information displayed is partially wrong (PhysicalSize) and incomplete (Channel Name is missing). I noticed that the XML returned by python-bioformats uses the 2015-01 namespace. I tried to track down where things go wrong and found that in omexml.py the default namespace is 2013-06, which is then replaced by the top-level namespaces in get_namespaces. I also tried downloading the .xsd file directly from https://www.openmicroscopy.org/Schemas/OME/2016-06/ and associating the schema manually, but that did not display the correct information either.
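For anyone trying to reproduce the namespace observation, here is a minimal sketch of how the schema version could be read off the root element of an OME-XML string using only the standard library. The helper name root_namespace and the SAMPLE string are made up for illustration; SAMPLE is a heavily trimmed stand-in, not real reader output. In practice the string returned by bioformats.get_omexml_metadata(path) would presumably be passed in instead.

```python
import re
import xml.etree.ElementTree as ET


def root_namespace(ome_xml):
    """Return the namespace URI of the root element, or None if it has none.

    Hypothetical helper for illustration; not part of python-bioformats.
    """
    # Encode to bytes so a leading <?xml ... encoding="..."?> declaration
    # does not make ElementTree reject a unicode string.
    root = ET.fromstring(ome_xml.encode("utf-8"))
    # ElementTree spells qualified tags as "{namespace-uri}LocalName".
    match = re.match(r"\{(.*?)\}", root.tag)
    return match.group(1) if match else None


# Hypothetical, trimmed OME-XML (not schema-valid), used only as a stand-in.
SAMPLE = (
    '<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2015-01">'
    '<Image ID="Image:0"/>'
    '</OME>'
)

print(root_namespace(SAMPLE))
```

Comparing this value for the python-bioformats output against the XML exported from Fiji should show whether the two really report different schema versions.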

Is there a way to manually correct the namespaces in python-bioformats? Am I actually on the right track?

For some reason I cannot upload a .zip with the .tif and .xml files here, so I've uploaded it to https://github.com/FannyGeorgi/SampleData instead.

Some environment information: I am using Python 2.7, javabridge from http://www.lfd.uci.edu/%7Egohlke/pythonlibs/#javabridge and bioformats 0.1.14 with JRE 1.8.0_131 on Windows 10.

Thanks for your help! Fanny

mcquin commented 7 years ago

Hey @FannyGeorgi,

I'm working on updating the XML in #83. Once that PR is merged, would you mind giving it a try? I hope it'll solve the issue you're having.

ChenQianAZ commented 2 years ago

@mcquin I came across the same issue. The file format is ImageXpress data, the same as the files Fanny saved in the SampleData folder. Channel Name is missing when using python-bioformats to retrieve the metadata.

I am using Python 3.8 and python-bioformats 4.0.4, so that update doesn't seem to have fixed it.

When I use meta = bioformats.get_omexml_metadata(path=f1) to read the XML embedded in the TIFF file, the Channel Name should appear between Channel ID and SamplesPerPixel, but it's not there.
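A rough way to check this without eyeballing the XML is sketched below. It collects the attributes of every Channel element regardless of which schema namespace the document declares, so a missing Name shows up directly. The helper channel_attributes and the SAMPLE string are hypothetical; SAMPLE is a trimmed, not schema-valid illustration of one channel with a Name and one without, and the string from get_omexml_metadata would presumably take its place.

```python
import xml.etree.ElementTree as ET


def channel_attributes(ome_xml):
    """Return one attribute dict per Channel element, ignoring namespaces.

    Hypothetical helper for illustration; not part of python-bioformats.
    """
    root = ET.fromstring(ome_xml.encode("utf-8"))
    channels = []
    for elem in root.iter():
        if not isinstance(elem.tag, str):
            continue  # skip comments / processing instructions, if any
        # Tags look like "{namespace-uri}Channel"; keep only the local name.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "Channel":
            channels.append(dict(elem.attrib))
    return channels


# Hypothetical, trimmed OME-XML used only as a stand-in for real metadata.
SAMPLE = (
    '<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">'
    '<Image ID="Image:0"><Pixels ID="Pixels:0">'
    '<Channel ID="Channel:0:0" Name="TL20" SamplesPerPixel="1"/>'
    '<Channel ID="Channel:0:1" SamplesPerPixel="1"/>'
    '</Pixels></Image></OME>'
)

for attrs in channel_attributes(SAMPLE):
    print(attrs.get("ID"), attrs.get("Name", "<no Name attribute>"))
```

Because the lookup strips the namespace from each tag, the same check works whether the XML declares 2015-01, 2016-06, or another schema version.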

Thanks for your help!