BodenmillerGroup / IMCDataAnalysis

R based workflow for multiplexed imaging data
https://bodenmillergroup.github.io/IMCDataAnalysis/
MIT License

IMC data analysis good practice #5

Closed. algebio closed this issue 2 years ago

algebio commented 2 years ago

Hello everyone

I have experience analysing suspension mass cytometry data with CATALYST. I am now trying to use the same pipeline for imaging mass cytometry (IMC) data: after segmentation, I transformed the single-cell data and read it in as .fcs files using the flowCore function read.flowSet(). My problem is that the resolution of my heatmaps is very poor, and I find it very difficult to identify cell populations in my clusters. I have compensated the data using imcRtools (thanks again Nils) and CATALYST. Is there any good-practice guidance on the settings of the different functions (prepData, cluster, runDR), or on anything else, when working with IMC data?

Some figures attached

Regards Juan

Attachments (before and after compensation):

- [003.1 plotExprHeatmapMedian.pdf_comp.pdf](https://github.com/BodenmillerGroup/IMCDataAnalysis/files/7603600/003.1.plotExprHeatmapMedian.pdf_comp.pdf)
- 05 plotExprHeatmap100.pdf_comp.pdf
- [05.3 mean.pdf_comp.pdf](https://github.com/BodenmillerGroup/IMCDataAnalysis/files/7603626/05.3.mean.pdf_comp.pdf)
- 05.1 features NULL.pdf_comp.pdf
- 01 plotCounts.pdf_unc.pdf
- meta40 UMAP.pdf_comp.pdf
- 05 plotExprHeatmap40.pdf_comp.pdf

algebio commented 2 years ago

An example of where I am unsure about the right settings: when I use the flowCore function read.flowSet(), should I keep the default values for these arguments: transformation = "linearize", decades = 0, sep = "\t", as.is = TRUE, min.limit = NULL, truncate_max_range = TRUE, emptyValue = TRUE, ignore.text.offset = FALSE?

Regards Juan

nilseling commented 2 years ago

Hi @algebio

  1. For reading in the data, have a look at the read_cpout and read_steinbock functions. They read in the single-cell data generated by CellProfiler or steinbock and return a SpatialExperiment or SingleCellExperiment object. I'm not familiar with the default parameters of the flowCore package.
  2. When transforming the data, we tend to use a smaller cofactor than for CyTOF data. Usually assay(sce, "exprs") <- asinh(counts(sce)), which corresponds to a cofactor of 1, is sufficient.
  3. The UMAP looks fine to me. Because of lateral spillover arising from segmentation issues, you should expect lower resolution in terms of clusters compared to CyTOF data.
  4. Have a look at different clustering approaches (e.g. Rphenograph, or shared-nearest-neighbour clustering and other techniques available through the bluster package). It seems like you have overclustered the data quite a bit, so you will need to merge clusters. An alternative is to first gate and then classify cells, as we have done here, and then sub-cluster the major cell types.
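Point 2 above can be sketched in base R as follows. This is only a minimal illustration, not the workflow's own code: the toy matrix, the marker names and the helper asinh_transform are made up for the example, and the cofactor of 5 is included only as the value commonly used for suspension CyTOF data (it is the default in CATALYST's prepData).

```r
# Hypothetical toy data: 3 markers x 5 cells, laid out like counts(sce)
# (markers in rows, cells in columns). Values are invented for illustration.
set.seed(1)
counts <- matrix(
  rexp(15, rate = 0.5),
  nrow = 3,
  dimnames = list(c("CD3", "CD20", "Ecad"), paste0("cell", 1:5))
)

# Hypothetical helper: arcsinh transform with an explicit cofactor.
asinh_transform <- function(x, cofactor = 1) {
  asinh(x / cofactor)
}

# Cofactor 1 is the same as calling asinh() on the raw values.
exprs_imc <- asinh_transform(counts, cofactor = 1)

# Cofactor 5, as commonly used for suspension CyTOF data, for comparison.
exprs_cytof <- asinh_transform(counts, cofactor = 5)

# Compare the value ranges per marker under the two cofactors.
print(round(apply(exprs_imc, 1, range), 2))
print(round(apply(exprs_cytof, 1, range), 2))
```

With a SingleCellExperiment object, the first variant corresponds to the assay(sce, "exprs") <- asinh(counts(sce)) call mentioned in the list. The larger cofactor compresses low intensities more strongly, which is one reason a smaller cofactor can help with IMC data.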

For a general overview on common spatial analysis approaches for IMC data have a look at our recent preprint.

As with any exploratory data analysis routine, there is quite a bit of trial and error involved when analysing IMC data. So try out different tools and compare the results with what you would expect from the literature and what you see in the images.

Cheers,
Nils

algebio commented 2 years ago

Hi Nils, sorry for the late reply; I was knocked out for several days by a gastric virus. Thank you for all your advice, this will help me a lot! Regards, Juan