satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat
Other
2.3k stars 917 forks source link

Running Seurat analyses on MAGIC data #1149

Closed scottgigante closed 5 years ago

scottgigante commented 5 years ago

Hi @mojaveazure , I've received a help request from an Rmagic user who is running Seurat 3.0 with your magic.Seurat S3 method addition to Rmagic and I am not sure how to proceed.

@IrinnaP is running the following script both with and without MAGIC imputation and is running into trouble when including MAGIC in the pipeline. I'm not experienced enough with Seurat to comment so I thought I'd repost this here.

This is running the latest version of both Seurat and Rmagic installed from master.

Any feedback would be most helpful. It might also be beneficial for us to include some FAQs / pointers on how to use MAGIC with Seurat to give a little more guidance to users in future.

Thanks!

``` > library(Rmagic) > library(viridis) > library(ggplot2) > library(phateR) > library(Seurat) > > ctrlhrt.data <- Read10X(data.dir = "~/Desktop//sample8047HRTGFP_unpacked/outs/filtered_feature_bc_matrix") > muthrt.data <- Read10X(data.dir = "~/Desktop//sample8115HRTGFP_unpacked/outs/filtered_feature_bc_matrix") > > colnames(ctrlhrt.data) <- paste(colnames(ctrlhrt.data), "1", sep="-") > colnames(muthrt.data) <- paste(colnames(muthrt.data), "2", sep="-") > > ctrlhrt <- CreateSeuratObject(raw.data = ctrlhrt.data, min.cells = 5, project = "CTRLHRT") > ctrlhrt@meta.data$muthrt <- "CTRL" > ctrlhrt <- FilterCells(ctrlhrt, subset.names = "nGene", low.thresholds = 400, high.thresholds = 4000) > ctrlhrt <- FilterCells(ctrlhrt, subset.names = c("eGFP", "Cdh5"), low.thresholds = c(0, 0)) > ctrlhrt <- NormalizeData(ctrlhrt) Performing log-normalization 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| > ctrlhrt <- ScaleData(ctrlhrt, display.progress = F) > > muthrt <- CreateSeuratObject(raw.data = muthrt.data, min.cells = 5, project = "MUTHRT") > muthrt@meta.data$muthrt <- "MUT" > muthrt <- FilterCells(muthrt, subset.names = "nGene", low.thresholds = 400, high.thresholds = 4000) > muthrt <- FilterCells(muthrt, subset.names = c("eGFP", "Cdh5"), low.thresholds = c(0, 0)) > muthrt <- NormalizeData(muthrt) Performing log-normalization 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| > muthrt <- ScaleData(muthrt, display.progress = F) > magicctrlhrt <- magic(ctrlhrt, genes= "all_genes") Calculating MAGIC... Running MAGIC on 3513 cells and 12386 genes. Calculating graph and diffusion operator... Calculating PCA... Calculated PCA in 4.28 seconds. Calculating KNN search... Calculated KNN search in 2.01 seconds. Calculating affinities... Calculated affinities in 3.06 seconds. Calculated graph and diffusion operator in 9.53 seconds. Calculating imputation... Automatically selected t = 17 Calculated imputation in 1.61 seconds. Calculated MAGIC in 11.63 seconds. > magicmuthrt <- magic(muthrt, genes= "all_genes") Calculating MAGIC... Running MAGIC on 988 cells and 11041 genes. Calculating graph and diffusion operator... Calculating PCA... Calculated PCA in 1.49 seconds. Calculating KNN search... Calculated KNN search in 0.16 seconds. Calculating affinities... Calculated affinities in 0.11 seconds. Calculated graph and diffusion operator in 1.79 seconds. Calculating imputation... Automatically selected t = 10 Calculated imputation in 0.17 seconds. Calculated MAGIC in 2.05 seconds. > magicctrlhrt <- FindVariableGenes(magicctrlhrt, do.plot = F) Calculating gene means 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating gene variance to mean ratios 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| > magicmuthrt <- FindVariableGenes(magicmuthrt, do.plot = F) Calculating gene means 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating gene variance to mean ratios 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| > g.1 <- head(rownames(magicctrlhrt@hvg.info), 1000) > g.2 <- head(rownames(magicmuthrt@hvg.info), 1000) > genes.use <- unique(c(g.1, g.2)) > genes.use <- intersect(genes.use, rownames(magicctrlhrt@scale.data)) > genes.use <- intersect(genes.use, rownames(magicmuthrt@scale.data)) > hrt.combined <- RunCCA(magicctrlhrt, magicmuthrt, genes.use = genes.use, num.cc = 30) Running CCA Merging objects Performing log-normalization 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Scaling data matrix |===========================================================================================================================================================| 100% > MetageneBicorPlot(hrt.combined, grouping.var = "magicmuthrt", dims.eval = 1:30, display.progress = FALSE) Rescaling group 1 Error in cc.loadings[[g]] : subscript out of bounds > hrt.combined <- AlignSubspace(hrt.combined, reduction.type = "cca", grouping.var = "magicmuthrt", dims.align = 1:25) Rescaling group 1 Scaling data matrix |===========================================================================================================================================================| 100% Aligning dimension 1 Error in cc.loadings[[g]] : subscript out of bounds > ```
satijalab commented 5 years ago

Sorry to be so delayed here. I'm not quite sure how to advise, or if it makes sense to run downstream Seurat analysis based on MAGIC-imputed values. There will be extensive dependencies between the MAGIC-imputed values (as each cell borrows information from others nearby), and I worry this might violate the assumption of independent data points for the dimensional reduction techniques that we use.

One option would be to use the graph structure learned by MAGIC, and perform downstream analysis on that. I don't know how to extract that information, but it might be preferable to using the imputed values.

scottgigante commented 5 years ago

Whether or not it violates the assumptions of your methods shouldn't affect whether or not it gives a subscript error ¯_(ツ)_/¯

Anyway, this was resolved by changing grouping.var = "magicmuthrt" to grouping.var = "muthrt". Nothing to do with MAGIC and everything to do with a typo. You may want to improve the error message on that one :)