CellProfiler / python-bioformats

Read and write life sciences file formats

Usage question on retrieving relevant metadata #10

Open sebi06 opened 10 years ago

sebi06 commented 10 years ago

Hi,

I am trying to read the metadata of a CZI file using python-bioformats, but I am still struggling a bit. I can read most of the information, but not the entries I actually need:

Immersion, LensNA, NominalMagnification, PhysicalSizeX, PhysicalSizeY and PhysicalSizeZ

Here is part of the python code:

....
imagefile = r'C:\Users\Sebi\CZI_Read\Z-Stack.czi'
omexml = bioformats.get_omexml_metadata(imagefile)
md = bioformats.OMEXML(omexml)
...

Whatever I tried, I could not figure out why I cannot find those entries. And when I tried to parse the XML by hand via ElementTree, I got stuck on invalid tokens ('µm') somewhere.

Can you point me in the right direction?

Cheers, Sebi

Here is the file:

https://dl.dropboxusercontent.com/u/623476/Z-Stack_3slices_dz%3D1.5um.czi


I know that BioFormats itself when used from Fiji gives me:

https://dl.dropboxusercontent.com/u/623476/OMEXML.txt

LeeKamentsky commented 10 years ago

Hey Sebi, easy fix. It looks like Bio-Formats has improved its support for CZI, and you can use the new JAR in python-bioformats. I got bioformats_package.jar from here: http://downloads.openmicroscopy.org/bio-formats/5.0.2/artifacts/bioformats_package.jar

I then started the javabridge like this:

import javabridge
javabridge.start_vm(class_path = [<path to bioformats_package.jar>] + javabridge.JARS)

and when I read the OMEXML with bioformats, your fields were there. Hope you can reproduce that. You can close this issue if it works for you.

sebi06 commented 10 years ago

Hi Lee,

I already used 5.0.2 and I can see what you mean, but I guess I am just not smart enough to finally extract what I want. Here is my code:

###################################################################

path = r'C:\Users\M1SRH\Documents\Software\Bio-Formats_Package\5.0.2\bioformats_package.jar'
import javabridge
import bioformats
jars = javabridge.JARS + [path]
javabridge.start_vm(class_path=jars)

imagefile = r'C:\Users\Sebi\CZI_Read\Z-Stack_5slices_dz=1.5um.czi'
omexml = bioformats.get_omexml_metadata(imagefile)
md = bioformats.OMEXML(omexml)

########################################################

So the string omexml contains all I need, and it is converted to an XML tree via bioformats.OMEXML(), right? My initial guess is that I can now access what I need via md (see the code lines above).

But where or how do I find "PhysicalSizeZ" within the md data structure? I bet it is easy for you ... :-)

Cheers, Sebi

LeeKamentsky commented 10 years ago

For the invalid tokens, you can convert the unicode XML to utf-8 like this:

omexml.encode('utf-8')

But I was able to get things programmatically via the MetadataStore using the new JWrapper in javabridge 1.0.3:

czi = r"C:\Users\Lee\cpdev\Z-Stack_3slices_dz=1.5um.czi"
rdr = bioformats.get_image_reader(None, path=czi)
#
#  rdr.rdr is the actual bioformats reader. rdr handles its lifetime
#
jmd = javabridge.JWrapper(rdr.rdr.getMetadataStore())
immersion = jmd.getObjectiveImmersion(0,0) # get the first Objective record in the first Instrument record
lensna = jmd.getObjectiveLensNA(0, 0)
magnification = jmd.getObjectiveNominalMagnification(0, 0)
xsize = jmd.getPixelsPhysicalSizeX(0)

Docs on jmd (you are actually using the MetadataRetrieve interface here) can be found here: http://downloads.openmicroscopy.org/bio-formats/5.0.2/api/loci/formats/meta/MetadataRetrieve.html
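For the XML route, here is a minimal sketch of the encode-then-parse idea using only the standard library. The helper name read_objective_and_scaling and the tiny OME-XML fragment are made up for illustration; real output from bioformats.get_omexml_metadata() is much larger, and its namespace depends on the Bio-Formats version, so the sketch reads the namespace from the root element instead of hard-coding it.

```python
import xml.etree.ElementTree as ET

# Made-up OME-XML fragment for illustration only (not taken from the CZI file).
SAMPLE_OMEXML = u"""<?xml version="1.0" encoding="UTF-8"?>
<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2013-06">
  <Instrument ID="Instrument:0">
    <Objective ID="Objective:0" Immersion="Air" LensNA="0.3" NominalMagnification="10.0"/>
  </Instrument>
  <Image ID="Image:0" Name="Z-Stack (dz=1.5 \u00b5m)">
    <Pixels ID="Pixels:0" PhysicalSizeX="0.2" PhysicalSizeY="0.2" PhysicalSizeZ="1.5"/>
  </Image>
</OME>"""


def read_objective_and_scaling(omexml_text):
    """Hypothetical helper: pull objective and pixel-size attributes out of OME-XML."""
    # Encode text to UTF-8 bytes first so characters such as 'µ' do not trip the parser.
    if not isinstance(omexml_text, bytes):
        omexml_text = omexml_text.encode("utf-8")
    root = ET.fromstring(omexml_text)

    # Reuse whatever namespace the root element carries, e.g. '{http://...}OME'.
    prefix = root.tag.split("}")[0] + "}" if root.tag.startswith("{") else ""

    # Only the first Objective and the first Pixels element are inspected.
    objective = root.find(".//%sObjective" % prefix)
    pixels = root.find(".//%sPixels" % prefix)

    result = {}
    if objective is not None:
        result["Immersion"] = objective.get("Immersion")
        for key in ("LensNA", "NominalMagnification"):
            value = objective.get(key)
            result[key] = float(value) if value is not None else None
    if pixels is not None:
        for key in ("PhysicalSizeX", "PhysicalSizeY", "PhysicalSizeZ"):
            value = pixels.get(key)
            result[key] = float(value) if value is not None else None
    return result


info = read_objective_and_scaling(SAMPLE_OMEXML)
print(info["PhysicalSizeZ"])
```

With real data, the string returned by bioformats.get_omexml_metadata(imagefile) would be passed in place of SAMPLE_OMEXML. Attributes that are absent from the file come back as None rather than raising.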

sebi06 commented 10 years ago

Hi Lee,

thanks a lot. Now it works.

Do you know why I always get the following error message if I re-run my program without restarting the Python interpreter?

raise ValueError("Cannot set Java class path in the \"args\" argument to start_vm. Use the class_path keyword argument to javabridge.start_vm instead.")
ValueError: Cannot set Java class path in the "args" argument to start_vm. Use the class_path keyword argument to javabridge.start_vm instead.

LeeKamentsky commented 10 years ago

We need to include a javascript .jar file and a JAR file of our own in the class path to get certain threading aspects of the implementation to work (especially on the Mac). We thought it would be safest to force people to use the class_path keyword rather than specifying the class path directly (and potentially not including the JAR files we need).

Vebjorn and I thought about this a little and we couldn't come up with any scenario which couldn't be handled by the class_path argument - we'd like to hear from you or others if they have a valid reason for entering the class path directly.
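A possible workaround for re-running a script in the same interpreter is to make the VM start idempotent, so that start_vm is only ever called once per session. The sketch below is an assumption, not something confirmed in this thread: start_vm_once and the marker attribute _vm_started_by_helper are hypothetical names, and whether a second start_vm call is really what triggers the error above has not been verified here. Only javabridge.JARS and the class_path keyword of javabridge.start_vm are taken from the comments above.

```python
def start_vm_once(extra_jars):
    """Hypothetical helper: start the JVM via javabridge unless this helper already did.

    Returns True if the VM was started by this call, False if it was skipped.
    """
    import javabridge  # imported lazily; requires the javabridge package

    # The marker is stored on the javabridge module itself (a hypothetical
    # attribute name) so that it survives re-running a script in one session.
    if getattr(javabridge, "_vm_started_by_helper", False):
        return False

    # Always pass the class path through the class_path keyword and keep
    # javabridge's own JAR files in it, as recommended above.
    javabridge.start_vm(class_path=list(javabridge.JARS) + list(extra_jars))
    javabridge._vm_started_by_helper = True
    return True
```

A script would then call start_vm_once([path]) with path pointing at bioformats_package.jar, in place of calling javabridge.start_vm directly.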

ehrenfeu commented 6 years ago

@sebi06 - it's been quite some time since you opened this issue, but at least regarding this question

But where or how do I find "PhysicalSizeZ" within the md data structure?

there is some progress in #102 (waiting to be merged)