Legacy STK Support

jeskowagner commented 1 year ago

Hi!

Thanks for making this package. I'd love to turn my HTS ImageXpress data into ome-zarr and was hoping faim-hcs could help with that. While running through the tutorial I unfortunately encountered issues at the get_well_image_CYX step. Specifically, it appears that my images are not a metaseries as required by load_metaseries_tiff.

I checked and my data format instead appears to be STK (although the files themselves still have the .tif suffix). Indeed, using tifffile.TiffFile(<myfile>).stk_metadata I can read my metadata just fine.

If I understand correctly, Molecular Devices seems to say that STK is now legacy, and that newer versions should be using the Meta Series format. But unfortunately I am not the creator of these images and would prefer not having to mess with the metadata just to transform the data to ome-zarr.

Are there any plans to allow extraction of metadata from formats other than Meta Series?

Thanks! Best,

Jesko

tibuch commented 1 year ago

Hi @jeskowagner,

We have not planned this, but I am happy to help you add support for legacy STK.

I think the easiest approach would be to implement a function load_legacy_metaseries_tiff which works with the STK files. If this function returns the image data and the same metadata fields as load_metaseries_tiff, we should be able to create ome-zarr files from legacy data as well.

Essentially each of these keys:

        selected_keys = [
            "_IllumSetting_",
            "spatial-calibration-x",
            "spatial-calibration-y",
            "spatial-calibration-units",
            "stage-position-x",
            "stage-position-y",
            "z-position",
            "_MagNA_",
            "_MagSetting_",
            "Exposure Time",
            "Lumencor Cyan Intensity",
            "Lumencor Green Intensity",
            "Lumencor Red Intensity",
            "Lumencor Violet Intensity",
            "Lumencor Yellow Intensity",
            "ShadingCorrection",
            "stage-label",
            "SiteX",
            "SiteY",
            "wavelength",
            "Z Step",  # optional
            "Z Projection Method",  # optional
            "Z Projection Step Size",  # optional
        ]

should be returned with the corresponding value from the stk metadata.

jeskowagner commented 1 year ago

Hi @tibuch,

Thanks for your help! I just got started on a PR, but am unsure which keys of STK metadata would map to the right keys in the metaseries format. Below is the type of metadata I get from STKs.

STK metadata

```python
# Note that some information was omitted
# because the data is not yet public
{'PlaneDescriptions': {
    'Plate Name: omitted',
    'Folder Name: omitted',
    'Barcode:omitted',
    'omitted',
    'Exposure: 200 ms',
    'Binning: 1 x 1',
    'Region: 2160 x 2160, offset at (200, 0)',
    'Acquired from PCO.SDK Camera',
    'Subtract: Off',
    'Shading: Off',
    'Digitizer: 95.33 MHz',
    'Gain: Gain 1 (0.45 e/cnt)',
    'Spot Noise Reducer Enabled: Yes',
    'Frames to Average: 1',
    'Trigger Mode: Normal (TIMED)',
    'Temperature: 20.3',
    'Deconvolution NA: 0.45',
    'Deconvolution RI: 1',
    'Deconvolution Emissive Wavelength: 525',
    'Deconvolution X Image Spacing: 0.325',
    'Deconvolution Y Image Spacing: 0.325',
    'Deconvolution Spherical Aberration: 0',
    'Deconvolution Wiener Filter KValue: 0.141589'
 },
 'NumberPlanes': 1,
 'AutoScale': 1,
 'MinScale': 0,
 'MaxScale': 62577,
 'SpatialCalibration': 1,
 'XCalibration': 0.325,
 'YCalibration': 0.325,
 'CalibrationUnits': 'um',
 'Name': 'FITC',
 'ThreshState': 0,
 'ThreshStateRed': 255,
 'ThreshStateGreen': 128,
 'ThreshStateBlue': 64,
 'ThreshStateLo': 0,
 'ThreshStateHi': 65535,
 'Zoom': 33,
 'CreateTime': datetime.datetime(2019, 9, 22, 0, 8, 49, 406),
 'LastSavedTime': datetime.datetime(2019, 9, 22, 0, 8, 50, 160),
 '_grayFit': 4,
 '_grayPointCount': 0,
 'grayX': 4294967295.0,
 'grayY': 4294967295.0,
 'grayMin': 4294967295.0,
 'grayMax': 4294967295.0,
 'StandardLUT': 0,
 'wavelength': 525,
 'AutoScaleLoInfo': 0.0,
 'AutoScaleHiInfo': 0.0,
 'Gamma': None,
 'CameraBin': None,
 'NewLUT': 0,
 'PlaneProperty': 9349986,
 '_TagId67': 9350047}
```

I have not yet seen which metaseries keys contain which values. But just going by key names, I would presume the following mapping:

Metadata key mappings

```javascript
{
  "_IllumSetting_": "???",
  "spatial-calibration-x": "XCalibration",
  "spatial-calibration-y": "YCalibration",
  "spatial-calibration-units": "CalibrationUnits",
  "stage-position-x": "???",
  "stage-position-y": "???",
  "z-position": "???",
  "_MagNA_": "Zoom",
  "_MagSetting_": "???",
  "Exposure Time": "Exposure",
  "Lumencor Cyan Intensity": "ThreshStateBlue",
  "Lumencor Green Intensity": "ThreshStateGreen",
  "Lumencor Red Intensity": "ThreshStateRed",
  "Lumencor Violet Intensity": "???",
  "Lumencor Yellow Intensity": "???",
  "ShadingCorrection": "Shading",
  "stage-label": "???",
  "SiteX": "???",
  "SiteY": "???",
  "wavelength": "wavelength",
  "Z Step": "???", // optional
  "Z Projection Method": "???", // optional
  "Z Projection Step Size": "???", // optional
}
```

I am not sure about some of those mappings, and for some others I don't think there is any valid mapping (marked ???). Would you be able to provide example values for the keys you request in load_metaseries_tiff? That would help me get the mapping right (or at least improve it).
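
To make the mapping question concrete, here is a minimal sketch of how the `PlaneDescriptions` entries could be split into key/value pairs before mapping them. The names `parse_plane_descriptions`, `STK_TO_METASERIES` and `map_stk_metadata` are hypothetical and not part of faim-hcs, and the sketch assumes that every useful entry has the form `Key: value`, which is based only on the single sample above. Only the mappings that look unambiguous from the key names are included.

```python
def parse_plane_descriptions(lines):
    """Split 'Key: value' strings into a dict; entries without a colon are skipped."""
    parsed = {}
    for line in lines:
        # Split at the first colon only, so values such as
        # '2160 x 2160, offset at (200, 0)' stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # e.g. 'Acquired from PCO.SDK Camera' has no key/value form
        parsed[key.strip()] = value.strip()
    return parsed


# Hypothetical mapping: metaseries key -> top-level STK key.
STK_TO_METASERIES = {
    "spatial-calibration-x": "XCalibration",
    "spatial-calibration-y": "YCalibration",
    "spatial-calibration-units": "CalibrationUnits",
    "wavelength": "wavelength",
}


def map_stk_metadata(stk):
    """Return the metaseries-style fields that can be read directly from STK metadata."""
    plane = parse_plane_descriptions(stk.get("PlaneDescriptions", []))
    out = {new: stk[old] for new, old in STK_TO_METASERIES.items() if old in stk}
    # Assumption: exposure and shading live in the plane description strings.
    if "Exposure" in plane:
        out["Exposure Time"] = plane["Exposure"]
    if "Shading" in plane:
        out["ShadingCorrection"] = plane["Shading"]
    return out
```

Keys that have no STK counterpart are simply left out here, so a caller can see which fields still need a default or user input.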

Thanks! Best,

Jesko

tibuch commented 1 year ago

Cool!

These would be the fields from file resources/Projection-Mix/2023-02-21/1334/Projection-Mix_E07_s1_w1E94C24BD-45E4-450A-9919-257C714278F7.tif:

{'_IllumSetting_': 'FITC_05', # Channel name set by the user in the acquisition software.
 'spatial-calibration-x': 1.3668,
 'spatial-calibration-y': 1.3668,
 'spatial-calibration-units': 'um',
 'stage-position-x': 79813.4, # Used to montage/stitch the images. 
 'stage-position-y': 41385.4, # Used to montage/stitch the images
 'z-position': 9343.19, # Used to montage/stitch the images
 '_MagNA_': 0.75,
 '_MagSetting_': '20X Plan Apo Lambda',
 'Exposure Time': '15 ms',
 'Lumencor Cyan Intensity': 5.09804,
 'Lumencor Green Intensity': 0.0,
 'Lumencor Red Intensity': 0.0,
 'Lumencor Violet Intensity': 0.0, # If missing, I would just set them to zero. 
 'Lumencor Yellow Intensity': 0.0,
 'ShadingCorrection': 'Off',
 'stage-label': 'E07 : Site 1',
 'SiteX': 1.0,
 'SiteY': 1.0,
 'wavelength': 536.0,
 'Z Projection Method': 'Maximum', # Optional, used in case only projected data is stored.
 'Z Projection Step Size': 5.0,
 'PixelType': 'uint16'}

stage-position-x, stage-position-y and z-position are used to montage the images together. If they are not part of the metadata, we might have to fake them or get them via user input.

jeskowagner commented 1 year ago

Thanks for your swift response!

stage-position-x, stage-position-y and z-position will be tricky indeed. Perhaps I can compute them from the rows and columns that you extract from the file names, multiplying the row/column index by the number of pixels in the image rows/columns. I don't have Z-stacks at the moment, so someone else would have to think up a solution for that part.
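
A minimal sketch of that idea, assuming 0-based row/column indices, tiles of identical size and no overlap between neighbouring tiles. The function name `synthetic_stage_position` is hypothetical, and the result is in pixels, not in calibrated units.

```python
def synthetic_stage_position(row_index, col_index, height_px, width_px):
    """Return a made-up (x, y) position in pixels for a tile on a regular grid.

    Assumes 0-based indices and non-overlapping tiles of equal size.
    """
    x = col_index * width_px
    y = row_index * height_px
    return x, y
```

For 2160 x 2160 images, the tile in row 1, column 2 would be placed at x = 4320, y = 2160.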

For the stage-label I could use a similar approach, by just merging well and site names.

The PixelType I should be able to extract using numpy.

Could you please still elaborate on what _MagNA_, SiteX, and SiteY are?

tibuch commented 1 year ago

Do you have only a single field of view per well? Maybe that is why the metadata is missing.

_MagNA_ is the numerical aperture of the objective used. SiteX and SiteY are the grid positions of the acquired fields, e.g. (1, 1), (1, 2), (2, 1), (2, 2).

jeskowagner commented 1 year ago

Sadly I have multiple sites per well, so I am unsure why that data is missing. Without information from the user I don't see how I could infer the arrangement of sites in a well. Perhaps I will have to create a separate argument for that. Regarding _MagNA_, it appears that this is not captured at all in my STKs. Tricky, but perhaps it is best to just put 1 or a similar value for now, though that's not a great solution...

tibuch commented 1 year ago

A separate argument sounds reasonable. Maybe the same can be done for _MagNA_. I would not set it to 1, but to nan or N/A.
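
A minimal sketch of how such defaults could be filled in, assuming the metadata is held in a plain dict. The function name `fill_missing_metadata` and the `mag_na` argument are hypothetical. Missing Lumencor intensities are set to zero, as suggested earlier in this thread, and a missing _MagNA_ becomes nan unless the user supplies a value.

```python
import math

LUMENCOR_KEYS = [
    "Lumencor Cyan Intensity",
    "Lumencor Green Intensity",
    "Lumencor Red Intensity",
    "Lumencor Violet Intensity",
    "Lumencor Yellow Intensity",
]


def fill_missing_metadata(metadata, mag_na=None):
    """Return a copy of metadata with defaults for fields absent from the STK file."""
    filled = dict(metadata)
    for key in LUMENCOR_KEYS:
        filled.setdefault(key, 0.0)  # missing intensities default to zero
    if "_MagNA_" not in filled:
        # Use the user-supplied numerical aperture if given, otherwise nan.
        filled["_MagNA_"] = math.nan if mag_na is None else float(mag_na)
    return filled
```

Values already present in the metadata are never overwritten, so files that do carry this information keep it.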

imagejan commented 1 year ago

@jeskowagner what about this:

'Region: 2160 x 2160, offset at (200, 0)',

Does the offset tell anything about the position of the field within the well?

jeskowagner commented 1 year ago

@imagejan Nice guess, I hadn't spotted that! Sadly, it appears that this value is constant across wells and fields within a well in my dataset.

jeskowagner commented 1 year ago

I have started work on this here. However, I am still undecided on how to best obtain stage positions. Would appreciate help of anyone who works with STKs.

jeskowagner commented 1 year ago

I am not sure I understand why the stage-position is divided by the spatial-calibration here. Could someone shed some light on that please? Still looking to assemble my STKs somehow.

imagejan commented 1 year ago

> I am still undecided on how to best obtain stage positions.

If the position isn't present in the file metadata, then you have no chance. Do you maybe have secondary files defining how the tile layout looks, from where you could deduce the position of each tile? Or you could provide a mapping of field index to position at runtime... do you actually know how your fields are laid out per well, and if so, how?

> Would appreciate help of anyone who works with STKs.

There are different flavors of stks (i.e. the MetaMorph 1.0 format). For example, we have a bunch of VisiView systems that create nd/stk datasets, but without the context of multi-well plates. Those datasets do have some positional information stored in the tiff tags.

> I am not sure I understand why the stage-position is divided by the spatial-calibration

The stage position metadata usually are in micrometers. The _pixel_pos function converts between calibrated (micrometer) positions and pixel units. We need that in order to assemble the fields into a regular grid.
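
As an illustration of that conversion (a hypothetical stand-in, not the actual `_pixel_pos` implementation), dividing a position in micrometers by the pixel size in micrometers per pixel gives the position in pixels. Rounding to whole pixels is an assumption made here so that tiles land on an integer grid.

```python
def to_pixel_position(stage_position, spatial_calibration):
    """Convert a calibrated stage position (e.g. in um) to whole pixels.

    spatial_calibration is the pixel size in the same unit per pixel.
    """
    if spatial_calibration <= 0:
        raise ValueError("spatial_calibration must be positive")
    return int(round(stage_position / spatial_calibration))
```

With the example values from above, a stage-position-x of 79813.4 um and a spatial-calibration-x of 1.3668 um per pixel correspond to a pixel position of about 58394.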

jeskowagner commented 1 year ago

Thank you @imagejan, that's very helpful!

> If the position isn't present in the file metadata, then you have no chance. Do you maybe have secondary files defining how the tile layout looks, from where you could deduce the position of each tile? Or you could provide a mapping of field index to position at runtime... do you actually know how your fields are laid out per well, and if so, how?

I do know my layouts, but would of course like to write generalisable code such that future users can also get the right mapping as easily as possible. I have just pushed a change that guesses the "right" arrangement of sites within wells based on the number of sites imaged, but plan to still allow an argument to be passed through to overwrite that (still have to think of the best way there).
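
One way such a guess with an override could look is sketched below. This is an illustration under stated assumptions, not the code that was pushed: it assumes a near-square grid filled row by row, with 1-based site numbers, and the names `guess_site_grid` and `site_to_xy` are hypothetical.

```python
import math


def guess_site_grid(n_sites, layout=None):
    """Return (n_rows, n_cols) for a well; an explicit layout overrides the guess."""
    if n_sites < 1:
        raise ValueError("n_sites must be at least 1")
    if layout is not None:
        rows, cols = layout
        if rows * cols < n_sites:
            raise ValueError("layout is too small for the number of sites")
        return rows, cols
    # Smallest near-square grid that holds all sites: cols = ceil(sqrt(n_sites)).
    cols = math.isqrt(n_sites - 1) + 1
    rows = math.ceil(n_sites / cols)
    return rows, cols


def site_to_xy(site, n_cols):
    """Map a 1-based site number to 1-based (SiteX, SiteY), assuming row-major order."""
    return (site - 1) % n_cols + 1, (site - 1) // n_cols + 1
```

Four sites would be guessed as a 2 x 2 grid and six sites as 2 rows by 3 columns. Passing `layout=(3, 2)` would force 3 rows by 2 columns instead.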

> There are different flavors of stks (i.e. the MetaMorph 1.0 format). For example, we have a bunch of VisiView systems that create nd/stk datasets, but without the context of multi-well plates. Those datasets do have some positional information stored in the tiff tags.

Good to know! I might want to be careful not to overwrite any information that is present. Unfortunately I don't have a large variety of images to test against, but suppose this can come as this functionality matures.

> The stage position metadata usually are in micrometers. The _pixel_pos function converts between calibrated (micrometer) positions and pixel units. We need that in order to assemble the fields into a regular grid.

Ace, this is exactly the info I was looking for, thank you! I should be able to set my spatial-calibration to 1 in that case, because I will fudge the stage-positions in pixels.

Hope that I am not much work away from having a working prototype.

imagejan commented 1 year ago

Great to know you're making progress, @jeskowagner!

> I should be able to set my spatial-calibration to 1 in that case,

Ideally, the spatial calibration values (as well as your stage positions) should reflect real-world measurements, so that you can measure the size of objects etc. later.

jeskowagner commented 1 year ago

Closing this for now, because I do not have sufficient STK data to play with and see whether the planned implementation would work. There's no point in implementing a feature if it only works for this single dataset I have. If anyone has more diverse STK data let me know!

fmi-faim / faim-ipa

Legacy STK Support #18