satijalab / seurat-wrappers

Community-provided extensions to Seurat
GNU General Public License v3.0
307 stars 131 forks source link

Fixed bug with running scVI integration over SCTransformed data #184

Open GreenGilad opened 8 months ago

GreenGilad commented 8 months ago

As described at the bottom of this issue , when running IntegrateLaters over an SCT assay with the scVIIntegration method, we get the following error:

Error in UseMethod(generic = "JoinLayers", object = object) : 
  no applicable method for 'JoinLayers' applied to an object of class "c('SCTAssay', 'Assay', 'KeyMixin')"

This error happens when calling JoinLayers (line 93). The purpose of this line is to join the different (count) layers so to create a single counts matrix to pass to AnnData (line 98, calling LayerData). While the call to JoinLayers is indeed needed for a standard assay (and thus scVIIntegration works over RNA assay), it is not needed for an SCTAssay as calling LayerData with such an assay pulls the information from all nested SCTModels. The proposed solution is therefore to call JoinLayers only or standard assays:

if (inherits(x = object, what = 'StdAssay')) {
  object <- JoinLayers(object = object, layers = "counts")
}


One additional change regards the downstream use of the scVI integrated reduction object. When calling FindClusters over an SCTransform + scVIIntegration + FindNeighbors object, it states that there is no graph with the passed name (i.e. by default "SCT_snn"). That is because the newly created dimensionality reduction object has an assay.used of "RNA". As such the saved graph name is then "RNA_snn", while FindClusters is looking for "SCT_snn". To fix that I've:


Code for testing:

library(Seurat)
library(SeuratWrappers)
library(dplyr)

# Testing that scVIIntegration over RNA assay still works
obj <- LoadSeuratRds("seurat.object.rds") # "Fresh" object with only a count matrix
obj[["RNA"]] <- split(obj[["RNA"]], obj$orig.ident)
obj <- NormalizeData(obj, verbose=F) %>% FindVariableFeatures(verbose=F) %>% ScaleData(verbose=F) %>% RunPCA(verbose = F)
obj <- IntegrateLayers(
  object = obj,
  method = scVIIntegration,
  conda_env = "conda/env/path",
  verbose = TRUE
)

# Testing that fix actually fixed the problem
obj <- LoadSeuratRds("seurat.object.rds") # "Fresh" object with only a count matrix
obj[["RNA"]] <- split(obj[["RNA"]], obj$orig.ident)
obj <- SCTransform(obj, verbose=F) %>% RunPCA(verbose = F)
obj <- IntegrateLayers(
  object = obj,
  method = scVIIntegration,
  new.reduction = "integrated.scVI",
  conda_env = "conda/env/path",
  verbose = TRUE
)

# Verify that downstream clustering over integrated embedding works
obj <- FindNeighbors(obj, reduction = "integrated.scVI", dims = 1:30, verbose=F) %>%
    FindClusters(resolution = 3, verbose=F) %>%
    RunUMAP(reduction = "integrated.scVI", dims = 1:30, reduction.name = "umap.scVI", verbose=F)