Data input - Githubissues

KaWingLee9 commented 4 years ago

In this pipeline, I am required to input _full_clean.tiff and _full_mask.tiff. After I ran the ImcSegmentationPipeline, I only found _ac.ome.tiff and _Probabilities_mask.tiff. How can I get these files?

nilseling commented 4 years ago

Hey, the naming of the files is quite arbitrary. You can check with ImageJ if your _Probabilities_mask.tiff files contain segmentation masks. They should display individual objects that represent cells and are usually located in the cpout folder after running the pipeline. As for the _full_clean.tiff input - these are multichannel tiff stacks usually located in the tiffs folder after running the segmentation pipeline.

votti commented 4 years ago

Actually, the tiffs folder is currently not a standard output of the 'default' ImcSegmentationPipeline configuration, but was specific to the output of the Damond et al. paper.

If this should be added to the default pipeline, I am happy to do so! In this case please add an issue: https://github.com/BodenmillerGroup/ImcSegmentationPipeline

@kelvinlee760948065: currently, in the standard output, the masks are indeed the _Probabilities_mask.tiff files. If you want to save something like _full_clean.tiff, you need to add a 'SaveImages' module that saves 'FullStackFiltered' with "Saved file format" = 'tiff' and "Image bit depth" = "16-bit integer", and set the output to 'Default Output folder subfolder' = 'tiffs'.

nilseling commented 4 years ago

I think it would be a good idea to add the tiff stacks. @kelvinlee760948065 I checked and you can also use the loadImages function to read in .ome.tiff files. You might get a warning that the metadata cannot be read in, but you can ignore this. For my current data I don't need to scale pixel intensities after reading in .ome.tiff files.

nilseling commented 4 years ago

After chatting with @votti we'd recommend that you save out the .tiff files as Vito suggested and read them in using the loadImages function. Reading in the .ome.tiff files can be done but you will lose information regarding the channel names.

KaWingLee9 commented 4 years ago

Thank you for your reply. When I tried to run the ImcSegmentation process, using the example data 20170905_Fluidigmworkshopfinal_SEAJa.mcd, I found that the output channels are more than the channels provided by panel.csv. For my speculation, the missing ones are those you don't take into consideration. And in /cpout/cell.csv, the columns which contain '_filtered' is the channels that are in panel.csv. Is that right? @votti When I tried to use imctools-1.0.8 (mcd.get_acquisition_channels) to get all the channels and form panel.csv file, does the order of each item correspond to each channel of .ometiff so that I can directly run channelNames(images)<-panel$Protein.Shortname for my analysis? @nilseling

votti commented 4 years ago

Hi!

Actually, I was a bit thrown off because you were asking for the _full_clean.tiff files, which customized pipelines usually save in a subfolder of the cpout folder called cpout/tiffs.

Nils was right: there should be an output folder alongside the cpout folder called tiffs. It contains images ending in _full.tiff, which are the images you are looking for. This folder also contains csv files ending in _full.csv, which list the metals in the order in which they appear in the _full.tiff. The Metal column in the Panel can be used to map these metal names to Protein.Shortname.

"For my speculation, the missing ones are those you don't take into consideration."

The mcd files indeed contain more channels than usually used for analysis (explained more below).

"And in /cpout/cell.csv, the columns which contain '_filtered' is the channels that are in panel.csv. Is that right?"

I don't think this is quite right: /cpout/cells.csv contains measurements in the format: Intensity_MeanIntensity_FullStackFiltered_c1, Intensity_MeanIntensity_FullStackFiltered_c2, ...

To quote from the pipeline description: "The mapping between channel number c1, c2, c3 corresponds to the position in the _full.csv's found in the tiffs folder."

So you can map these measurements to the metal isotope channel, by looking up the metal with the corresponding index in the _full.csv
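A minimal sketch of this lookup, using only the Python standard library. It rests on several assumptions that are not confirmed in this thread: that _full.csv holds one metal per row without a header, that c1 refers to the first row (counting from 1), and that the panel has columns named Metal and Protein.Shortname. The metal and protein values are made up for illustration, and channel_lookup and rename_column are hypothetical helper names.

```python
import csv
import io
import re

# Hypothetical stand-ins for the real files; the contents are made up.
FULL_CSV = "In113\nLa139\nPr141\n"  # assumed: one metal per row, no header
PANEL_CSV = (
    "Metal,Protein.Shortname\n"
    "Pr141,CD3\n"
    "In113,H3\n"
    "La139,SMA\n"
)


def channel_lookup(full_csv_text, panel_csv_text):
    """Return {channel number: (metal, protein short name)}, counting from 1."""
    metals = [row[0].strip()
              for row in csv.reader(io.StringIO(full_csv_text)) if row]
    panel = {row["Metal"]: row["Protein.Shortname"]
             for row in csv.DictReader(io.StringIO(panel_csv_text))}
    # panel.get returns None for metals that are not listed in the panel
    return {i: (metal, panel.get(metal))
            for i, metal in enumerate(metals, start=1)}


COLUMN_RE = re.compile(r"_c(\d+)$")


def rename_column(column, lookup):
    """Replace the trailing _c<N> of a measurement column with a readable name."""
    match = COLUMN_RE.search(column)
    if match is None or int(match.group(1)) not in lookup:
        return column  # not a per-channel measurement, leave unchanged
    metal, protein = lookup[int(match.group(1))]
    # Fall back to the metal name when the panel has no entry for it
    return column[:match.start()] + "_" + (protein or metal)


lookup = channel_lookup(FULL_CSV, PANEL_CSV)
print(rename_column("Intensity_MeanIntensity_FullStackFiltered_c3", lookup))
```

With real data, the two strings would be replaced by the contents of a _full.csv from the tiffs folder and of the panel file.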

The 'filtered' in FullStack vs FullStackFiltered here only refers to the fact that we apply a filter to remove outlier pixels before taking measurements for FullStackFiltered - not that any channels have been filtered out.

Does this make sense?

"When I tried to use imctools-1.0.8 (mcd.get_acquisition_channels) to get all the channels and form panel.csv file, does the order of each item correspond to each channel of .ometiff so that I can directly run channelNames(images)<-panel$Protein.Shortname for my analysis? @nilseling"

The .mcd file contains the channels in the order in which they were acquired, as determined by the Fluidigm software during acquisition. The .ome.tiff is a standardized version of this raw data and also contains the channels in acquisition order, NOT in any order related to the Panel.

In the .ome.tiff the channel names/metal isotope information is stored as an annotation of the image plane as ome.tiff metadata. Unfortunately I am not aware of an R software that could read it.

You can read out the plane information from the .ome.tiff via imctools v1 using something like:

from imctools.io.ometiffparser import OmetiffParser

fn_ome = '../NAME.ome.tiff'

# Create the parser from the file path, then extract the acquisition
imc_acquisition = OmetiffParser(fn_ome).get_imc_acquisition()
print(imc_acquisition.channel_metals)
print(imc_acquisition.channel_labels)

Then you could match the channel_metals with the Metal column of your Panel to get the Protein.Shortname in the right order.
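A rough sketch of that matching step in plain Python, without calling imctools. It assumes the metal strings from channel_metals use the same notation as the Metal column of the panel, and that the panel has been read into a list of dicts with the keys Metal and Protein.Shortname. The function name and all values are hypothetical.

```python
def names_in_acquisition_order(channel_metals, panel_rows):
    """Return channel names in acquisition order plus the metals not in the panel.

    channel_metals: metals in the order they appear in the .ome.tiff
    panel_rows: list of dicts with the assumed keys 'Metal' and 'Protein.Shortname'
    """
    metal_to_name = {row["Metal"]: row["Protein.Shortname"] for row in panel_rows}
    names, unmatched = [], []
    for metal in channel_metals:
        if metal in metal_to_name:
            names.append(metal_to_name[metal])
        else:
            # Acquired channel without a panel entry: keep the metal as its name
            names.append(metal)
            unmatched.append(metal)
    return names, unmatched


# Made-up example: four acquired channels, only two of them listed in the panel
channel_metals = ["Ar80", "In113", "Xe131", "Pr141"]
panel_rows = [
    {"Metal": "Pr141", "Protein.Shortname": "CD3"},
    {"Metal": "In113", "Protein.Shortname": "H3"},
]
names, unmatched = names_in_acquisition_order(channel_metals, panel_rows)
print(names)
print(unmatched)
```

The unmatched list makes the extra acquired channels explicit, so a length mismatch shows up before the names are assigned to the images.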

But I would rather suggest that you use the _full.tiff and _full.csv instead.

BodenmillerGroup / cytomapper_publication

Data input #1