Closed pbschwar closed 4 years ago
As an update, I had initially kept the conditions separate prior to merging and integration to be able to individually evaluate. If I merged and created a list prior to running SCTransform this seemed to have fixed whatever error occurred. Is creation of the list necessary prior to running SCTransform when you intend to integrate the datasets?
Thanks!
Hi,
You are right. In the SCTransform integration vignette, we split merged object into a list, and run SCTransform for each individual object in the list.
When you run SCTransform
, it will store the sct modal in the object. Then, when you merge those SCTransform normalized object, the merged object has a list of models stored. However, next, when you use SplitObject
for the merged object, the sct modals may not be split correctly.
@pbschwar - Did you mean you perform SCTranform taking a list of 2 objects that you each SCTransform as input?
seuratNC <- SCTransform(seuratNC, vars.to.regress = c("percent.mt","CC.Difference"), verbose = T)
This error happended because the some genes can not be found in the scale.data, which are done using "SCTransform". The default setting of the feature number is 3000 in "SCTransform" function as shown below. You should change "variable.features.n" to the total number of gene features.
## Default S3 method: SCTransform( object, cell.attr, reference.SCT.model = NULL, do.correct.umi = TRUE, ncells = 5000, residual.features = NULL, **variable.features.n = 3000,** variable.features.rv.th = 1.3, vars.to.regress = NULL, do.scale = FALSE, do.center = TRUE, clip.range = c(-sqrt(x = ncol(x = umi)/30), sqrt(x = ncol(x = umi)/30)), vst.flavor = "v2", conserve.memory = FALSE, return.only.var.genes = TRUE, seed.use = 1448145, verbose = TRUE, ... )
You can revise your script to the following and try agian. Do the same revision on seuratCC as well
seuratNC <- SCTransform(seuratNC, vars.to.regress = c("percent.mt","CC.Difference"), variable.features.n =seuratNC ,verbose = T)
Hope this will work. However, I tried this is not enough to avoid the error. The following is my script using my data "dtlist" is a list of multiple seurat object. Before the bug being corrected, you can using the following code to get the correct "Anchor.features" that mapping to the gene feature after "GetResidual" function.
`##find integration gene features intfts <- SelectIntegrationFeatures(object.list = dtlist, nfeatures = nrow(sce)) #nrow(sce) 10549 length(c) #10499
getAnchor.features=function(obj){
obj <- GetResidual(object = obj, assay = assay[i], features = anchor.features, replace.value = ifelse(test = is.null(x = sct.clip.range), yes = FALSE, no = TRUE), clip.range = sct.clip.range, verbose = FALSE)
scale.data <- GetAssayData(object = obj, assay = "SCT", slot = "scale.data")
row.names(scale.data) }
c=lapply(dtlist,getAnchor.features) d=Reduce(intersect, c(c,list(intfts=intfts))) length(d) #10449
dtlist <- PrepSCTIntegration(object.list = dtlist, anchor.features = d )`
Hello Seurat Team,
Fairly new to this, so I apologize if this has been addressed before. Also, thank you for such a good program. I had originally integrated two datasets (2 different conditions) using the standard workflow, and there were no issues. I am now attempting to integrate them following SCTransform and I get the following error while running the PrepSCTIntegration function:
"Error in scale.data[anchor.features, ] : subscript out of bounds"
My workflow thus far is:
Code below: seuratNC <- CreateSeuratObject(counts = countsNC, project = "NC KC", assay = "RNA",min.cells = 0, min.features = 200) seuratCD <- CreateSeuratObject(countsCD, project = "CD KC", assay = "RNA", min.cells = 0, min.features = 200) seuratNC[["percent.mt"]] <- PercentageFeatureSet(seuratNC, pattern = "^mt-") seuratCD[["percent.mt"]] <- PercentageFeatureSet(seuratCD, pattern = "^mt-")
Remove doublets
seuratNC <- subset(seuratNC, subset = nFeature_RNA < 6000) seuratCD <- subset(seuratCD, subset = nFeature_RNA < 6000) seuratNC <- NormalizeData(object = seuratNC) seuratNC <- CellCycleScoring(seuratNC , s.features = s.genes, g2m.features = g2m.genes, set.ident = TRUE) seuratNC$CC.Difference <- seuratNC$S.Score - seuratNC$G2M.Score seuratNC <- SCTransform(seuratNC, vars.to.regress = c("percent.mt","CC.Difference"), verbose = T) seuratCD <- NormalizeData(object = seuratCD) seuratCD <- CellCycleScoring(seuratCD, s.features = s.genes, g2m.features = g2m.genes, set.ident = TRUE) seuratCD$CC.Difference <- seuratCD$S.Score - seuratCD$G2M.Score seuratCD <- SCTransform(seuratCD, vars.to.regress = c("percent.mt","CC.Difference"), verbose = T) kc.combined <- merge(seuratNC_umap, y = seuratCD_umap, add.cell.ids = c("NC", "CD"), project = "KC C") DefaultAssay(kc.combined) <- "SCT" kc.list <- SplitObject(kc.combined, split.by = "orig.ident") integ_features <- SelectIntegrationFeatures(kc.list, nfeatures = 3000) kc.list <- PrepSCTIntegration(object.list = kc.list ,anchor.features = integ_features, verbose=T, assay = "SCT")
Thanks for your help