satijalab / seurat

R toolkit for single cell genomics

How do I run FindMarkers and DEG analysis after integration with SCTransform? #3839

Closed 0717cyj closed 3 years ago

0717cyj commented 3 years ago

Dear Satija Lab, hello. I have a question about DEG analysis (FindMarkers etc.) after integration with SCTransform.

I built a Seurat object from 3 different datasets using the SCTransform integration method, following the Satija Lab vignette, with 3,000 anchor features:

    pancreas.features <- SelectIntegrationFeatures(object.list = pancreas.list, nfeatures = 3000)
    pancreas.list <- PrepSCTIntegration(object.list = pancreas.list, anchor.features = pancreas.features, verbose = FALSE)

In this situation, the integrated object contains [["SCT"]]@counts, [["SCT"]]@data, and [["SCT"]]

For clustering and other downstream DEG analysis, you recommended that, in principle, it would be optimal to perform these calculations directly on the residuals (stored in the slot).

However, after integration with SCTransform using 3,000 anchor features, [["SCT"]] has only 3,000 features. In this situation, which data is optimal for DEG analysis and marker finding: [["SCT"]]@counts, [["SCT"]]@data, or [["SCT"]]?
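One way to see how many features each part of the SCT assay actually holds is to print the dimensions of its slots. The sketch below assumes Seurat v3/v4, where an assay exposes the `counts`, `data`, and `scale.data` slots through `GetAssayData()`; the helper name `sct_slot_dims` is hypothetical and not part of Seurat.

```r
library(Seurat)

# Hypothetical helper (not a Seurat function): report the dimensions
# (features x cells) of each slot in one assay of a Seurat object.
# Assumes the Seurat v3/v4 slot names "counts", "data", and "scale.data".
sct_slot_dims <- function(object, assay = "SCT") {
  slots <- c("counts", "data", "scale.data")
  dims <- lapply(slots, function(s) {
    dim(GetAssayData(object, assay = assay, slot = s))
  })
  names(dims) <- slots
  dims
}
```

Calling `sct_slot_dims(integrated)` on the integrated object (the name `integrated` is an assumption here) should make it visible whether only one slot is restricted to the 3,000 selected features.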

Also, to perform DEG analysis with [["SCT"]], what additional work do I need to do? Should I go back to the integration step and set the number of integration anchor features to the number of all RNA features in my object?

This is a small but important issue that I have run into, so I am asking the Satija Lab.

Sincere regards, Yong Jun, Choi

jaisonj708 commented 3 years ago

Most DE methods use either raw counts or normalized data, not scaled data. (You also should not run DE on integrated data.) I would recommend running DE by specifying assay = "RNA" in FindMarkers, using whichever test.use you prefer. The correct slot will automatically be selected for you.
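The recommendation above could look roughly like the following. This is a minimal sketch assuming Seurat v3/v4; the wrapper name `run_rna_markers` and its arguments are hypothetical. The `NormalizeData()` call is an added assumption, not something stated in this thread: after an SCT-only workflow the RNA assay may not have been log-normalized yet, so its `data` slot would otherwise just hold raw counts.

```r
library(Seurat)

# Hypothetical wrapper (not a Seurat function): run DE on the RNA assay
# of an integrated object instead of the integrated or scaled data.
run_rna_markers <- function(object, ident.1, ident.2 = NULL,
                            test.use = "wilcox") {
  # Assumption: the RNA assay has not been log-normalized yet.
  object <- NormalizeData(object, assay = "RNA", verbose = FALSE)

  # FindMarkers chooses the slot appropriate for the selected test.
  FindMarkers(
    object,
    ident.1  = ident.1,
    ident.2  = ident.2,
    assay    = "RNA",
    test.use = test.use
  )
}
```

For example, `run_rna_markers(integrated, ident.1 = "0", ident.2 = "1")` would compare two clusters, where the object name and cluster labels are placeholders.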

jgamache014 commented 3 years ago

Hello @0717cyj - I ran into the same question. Based on a previous issue here, I'd recommend using the return.only.var.genes = FALSE argument when running the SCTransform() function. This should increase the number of features in object[["SCT"]] beyond 3,000.
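As a rough sketch of this suggestion, assuming Seurat v3/v4 and a list of per-dataset objects like `pancreas.list` from the question, SCTransform can be run on each object with `return.only.var.genes = FALSE` before the integration steps. The helper name `sct_all_genes` is hypothetical.

```r
library(Seurat)

# Hypothetical helper (not a Seurat function): run SCTransform on every
# object in a list and keep residuals for all genes, not only the
# variable ones.
sct_all_genes <- function(object.list) {
  lapply(object.list, function(obj) {
    SCTransform(obj, return.only.var.genes = FALSE, verbose = FALSE)
  })
}
```

Usage would be `pancreas.list <- sct_all_genes(pancreas.list)`, followed by the same SelectIntegrationFeatures and PrepSCTIntegration calls as before. Keeping residuals for every gene makes the scaled data much larger, so memory use is likely to increase.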