Hi, thank you for developing SCTransform.
I have read several of the related issues, and I want to check that I understand them correctly. I am using a public dataset that combines data from 10 patients. The differences in nCount_RNA among patients seem to affect cell type proportions, so I plan to regress them out.
I understand that if the differences between samples include biological differences, it is better to perform SCTransform after integration. Is this correct? If so, should I use
seu_snRNA <- SCTransform(seu_snRNA, vars.to.regress = "nCount_RNA", verbose = FALSE)
to regress out the differences, and then perform batch correction using a package like Harmony? Or should I use vars.to.regress = c("orig.ident", "nCount_RNA")
instead? If it is better to perform SCTransform on each sample separately, should I add
vars.to.regress = "nCount_RNA"
in the following code as shown on https://satijalab.org/seurat/archive/v3.0/integration.html#sctransform ? Finally, if I subset the data after integration, should I perform SCT again, or can I proceed directly with RunPCA?
Thank you for your assistance.
Best, EUNHA