satijalab / sctransform

R package for modeling single cell UMI expression data using regularized negative binomial regression
GNU General Public License v3.0

Integration & Subset #194

Open galaxyeee opened 5 months ago

Hi, thank you for developing SCTransform.

I have read several related issues, and I want to check that I understand them correctly. I am using a public dataset that combines data from 10 patients. The differences in nCount_RNA among patients seem to affect cell type proportions, so I plan to regress them out.

  1. I understand that if the differences between samples include biological differences, it is better to perform SCTransform after integration. Is this correct? If so, should I use seu_snRNA <- SCTransform(seu_snRNA, vars.to.regress = "nCount_RNA", verbose = FALSE) to regress out the differences, and then perform batch correction using a package like Harmony? Or should I use vars.to.regress = c("orig.ident", "nCount_RNA") instead?

  2. If it is better to perform SCTransform on each sample separately, should I add vars.to.regress = "nCount_RNA" in the following code as shown on https://satijalab.org/seurat/archive/v3.0/integration.html#sctransform?

    for (i in 1:length(pancreas.list)) {
      pancreas.list[[i]] <- SCTransform(pancreas.list[[i]], verbose = FALSE)
    }

  3. Finally, if I subset the data after integration, should I run SCTransform again, or can I proceed directly to RunPCA?
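
For concreteness, the per-sample variant asked about in item 2 might be sketched as below. This only illustrates what the question describes and is not a recommendation. `sct_each` is a hypothetical helper name, and `pancreas.list` is assumed to be the list of per-sample Seurat objects from the linked vignette.

```r
library(Seurat)

# Hypothetical helper (not part of Seurat or sctransform): run SCTransform
# on every Seurat object in a list, optionally passing covariates through
# the vars.to.regress argument. With vars = NULL nothing extra is regressed.
sct_each <- function(obj.list, vars = NULL) {
  lapply(obj.list, function(obj) {
    SCTransform(obj, vars.to.regress = vars, verbose = FALSE)
  })
}

# Usage, assuming pancreas.list already exists as in the vignette:
# pancreas.list <- sct_each(pancreas.list, vars = "nCount_RNA")
```

Using `lapply` instead of an index loop keeps the list names intact and makes it easy to compare the output with and without the extra covariate.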

Thank you for your assistance.

Best, EUNHA