Open dagarfield opened 4 years ago
@dagarfield Did you manage to solve the problem? I am facing the same issue as well over here... I guess if no plausible solution is available, then constructing an appropriate reduced dimension object using FastMNN would be the only option.
In the end, I went to FastMNN itself (as you suggest) and constructed the object directly rather than through the Seurat wrapper. It was a bit annoying, but worked well enough in the end, and the FastMNN documentation is pretty good.
@dagarfield Could you please share the steps you used to construct the proper object? I tried to do so, but I still could not project my MNN dimensions. Here is the code I used to run the MNN correction and then convert the result to a Seurat object:
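library(Seurat)
library(SingleCellExperiment)
library(batchelor)  # multiBatchNorm(), fastMNN()
library(scran)      # modelGeneVar(), combineVar()
so <- readRDS(file = paste0(output, "/PBMC/SO_merge.Rds"))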
### Create SingleCellExperiment object
sce <- as.SingleCellExperiment(so)
rowData(sce) <- NULL
reducedDim(sce) <- NULL
reducedDim(sce, type = "UMAP") <- NULL
### Correct by sample ID
### One SingleCellExperiment per sample (S11-S16 and S18-S28; there is no S17)
sample.ids <- paste0("S", c(11:16, 18:28))
all.sce <- lapply(sample.ids, function(id) sce[ , grepl(id, sce$orig.ident)])
names(all.sce) <- sample.ids
### Subset all batches to common universe of genes
universe <- Reduce(intersect, lapply(all.sce, rownames))
all.sce <- lapply(all.sce, "[", i = universe,)
### Adjust scaling to equalize sequencing coverage
normed.sce <- do.call(multiBatchNorm, all.sce)
### Find highly variable genes
all.var <- lapply(all.sce, modelGeneVar)
combined.var <- do.call(combineVar, all.var)
hvg.list <- rownames(combined.var)[combined.var$bio > 0]
### Correct batch effect
set.seed(920101)
mnn.sce <- do.call(fastMNN, c(normed.sce, list(subset.row = hvg.list)))
### Save computed MNN into SCE object, then convert to Seurat object
reducedDim(sce, "MNN") <- reducedDim(mnn.sce, "corrected")
so.fastmnn <- as.Seurat(sce)
Could you point out where I went wrong? Thank you very much!
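One thing that may be worth checking in the code above (an assumption, not something confirmed in this thread): the corrected matrix comes back with cells grouped by batch, so its row order may differ from the column order of the original `sce`. The sketch below uses only base R and made-up cell names to show how rows can be matched by name before the assignment. It assumes the corrected matrix carries cell names as row names and that those names are unique.

```r
# Toy stand-ins (hypothetical data): the cell order of the original object,
# and a corrected embedding whose rows come back grouped by batch.
set.seed(1)
cells.original <- c("S12_a", "S11_a", "S12_b", "S11_b")
cells.by.batch <- c("S11_a", "S11_b", "S12_a", "S12_b")
corrected <- matrix(rnorm(8), nrow = 4,
                    dimnames = list(cells.by.batch, c("mnn_1", "mnn_2")))

# Fail early if any cell is duplicated or missing from the embedding.
stopifnot(!anyDuplicated(cells.original),
          setequal(cells.original, rownames(corrected)))

# Reorder rows by name so that row i describes cell i of the original object.
aligned <- corrected[cells.original, , drop = FALSE]
stopifnot(identical(rownames(aligned), cells.original))

# With the real objects this would read, under the same assumptions:
# reducedDim(sce, "MNN") <-
#   reducedDim(mnn.sce, "corrected")[colnames(sce), , drop = FALSE]
```

If the two orders already agree, the reordering is a no-op, so the check costs nothing.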
Brief update... I managed to solve the issue, although I'm not sure if this is the proper way.
The problem with ProjectDim is that it takes the data used for the projection from the scale.data slot. However, the merged, MNN-corrected Seurat object has neither scaled data nor variable features, as mentioned in #15.
Therefore, I saved the list of highly variable genes used for MNN into the variable features slot of the Seurat object, then scaled the data. After that, I was able to project the loadings. My code is below.
### Continue from above
so.fastmnn <- as.Seurat(sce)
### Store the highly variable genes list in the Seurat object
so.fastmnn@assays$RNA@var.features <- hvg.list
### Scale data & project loadings
so.fastmnn <- ScaleData(so.fastmnn)
ProjectDim(so.fastmnn, reduction = "mnn", dims.print = 1:2, nfeatures.print = 5)
My results are below:
mnn_ 1
Positive: NKG7, GNLY, GZMB, FGFBP2, CST7
Negative: RPL32, RPL13, RPS8, RPS12, RPL39
mnn_ 2
Positive: COTL1, TRBV5-1, NSMCE1, HLA-DRB5, SAT1
Negative: CD7, NKG7, CCL5, FGFBP2, GZMB
An object of class Seurat
15572 features across 93495 samples within 1 assay
Active assay: RNA (15572 features, 13326 variable features)
1 dimensional reduction calculated: mnn
These steps seem logical to me, but I hope someone can confirm whether what I did is correct.
@dagarfield, did you do something similar? Could you share how you solved the issue?
Seurat developers, do my steps seem logical?
Thank you very much!
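For anyone wondering why the scaled data matters here, the following is a rough base-R sketch of the idea. It assumes the projection amounts to multiplying the scaled expression matrix (genes by cells) by the cell embeddings (cells by dimensions). That is a reading of the behaviour described above, not a statement of Seurat's exact implementation, and all data and names below are made up.

```r
# Toy data (hypothetical): 5 genes x 6 cells of expression values, and a
# 6 cells x 2 embedding standing in for the "mnn" reduction.
set.seed(42)
expr <- matrix(rnorm(30), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("cell", 1:6)))
embeddings <- matrix(rnorm(12), nrow = 6,
                     dimnames = list(colnames(expr), c("mnn_1", "mnn_2")))

# Scale each gene across cells (zero mean, unit variance), mimicking what a
# scaling step would store.
scaled <- t(scale(t(expr)))

# Projected loadings: one row per gene, one column per dimension.
projected <- scaled %*% embeddings

# Genes with the largest positive and negative loadings on the first dimension.
ord <- order(projected[, "mnn_1"], decreasing = TRUE)
top.positive <- rownames(projected)[head(ord, 2)]
top.negative <- rownames(projected)[rev(tail(ord, 2))]

# With no scaled genes available there is nothing to project onto.
empty <- scaled[0, , drop = FALSE] %*% embeddings
stopifnot(nrow(empty) == 0)
```

Under this reading, an object without scaled data gives an empty projection, which would be consistent with scaling the data first being enough to make the loadings appear.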
I've been using RunFastMNN to align partially overlapping datasets. It works great in this context, but I run into an issue not in downstream analyses, but in downstream presentations like heat maps and other exploratory plots discussed here.
But integrated objects following this approach seem to work fine:
Any guesses where to look? It is, of course, possible to go directly to fastMNN and construct the appropriate reduced-dimension object. But it would be nice to use RunFastMNN, and I feel like I'm probably missing something obvious about the dimensionality of what's stored in the output object of RunFastMNN.
Thanks