satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Differential expression after SCTransform #2646

Closed owenwilkins closed 4 years ago

owenwilkins commented 4 years ago

Hello,

I have been running some differential expression analyses using FindMarkers() after normalizing scRNA-seq data with SCTransform and integrating it using the Seurat v3 approach. I was hoping someone could provide some guidance on the most appropriate DE test to use (specified by the test.use argument) after the data have been normalized with SCTransform.

While I appreciate this may not have been empirically evaluated extensively yet, any thoughts on specific tests that the developers think may be more appropriate for SCT data than others would be appreciated. It seems to me that there could be issues with running DESeq2 on count data from the SCT assay, due to DESeq2's requirement for raw counts as input. Additionally, I was wondering whether the fact that MAST includes cellular detection rate as a covariate in the model would make it less appropriate to apply to the SCT counts, which have already been normalized.

Including an example call of FindMarkers below for how I have been conducting DE analyses in Seurat.

b_cell <- FindMarkers(i.c, assay = "SCT", slot = "data", ident.1 = "b_cell_tx", ident.2 = "b_cell_control", test.use = "MAST", min.pct = 0.1, logfc.threshold = 0.0, verbose = FALSE)

Thanks in advance for your help. Any thoughts at all would be appreciated.

xAZx commented 4 years ago

I have the same issue as above essentially...

I also perform SCTransform and when I use FindMarkers I have noticed that my log-fold changes are extremely high (e.g. log-folds in the 30's!!!). The corresponding adj p-values are extremely small while the pct.1 vs. pct.2 difference is minimal at times, which doesn't make sense to me.

Is it because using the default assay in SCTransform is not ideal as suggested above? Many thanks in advance.

owenwilkins commented 4 years ago

Hi @satijalab, I was just wondering if anyone might be able to provide some thoughts on this. I recognize you are very busy, but I wanted to check in in case it got lost among other issues.

diazdc commented 4 years ago

I'm not part of the dev team, but this issue has been covered a number of times and is now included in the FAQ (it's number 4) on the lab's home page. The short answer is to use assay "RNA" for differential expression and gene expression visualization. You can find more discussions on this topic if you search for "integration" or "sctransform" and "differential expression" in all issues.
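
As a rough illustration of the suggestion above (a sketch, not an official recommendation from the Seurat developers), a FindMarkers() call on the "RNA" assay might look like the following. It uses the pbmc_small example object that ships with Seurat as a stand-in for the original poster's i.c object, and its "groups" metadata column as a stand-in for the treatment/control labels; both substitutions are assumptions made only so the example can run on its own.

```r
library(Seurat)

# Stand-in data: pbmc_small ships with Seurat; it is NOT the poster's object.
obj <- pbmc_small

# Test on log-normalized RNA values rather than on SCT or integrated values.
DefaultAssay(obj) <- "RNA"
obj <- NormalizeData(obj, verbose = FALSE)

# "groups" (values "g1" / "g2") stands in for the condition labels.
markers_rna <- FindMarkers(
  obj,
  assay = "RNA",
  group.by = "groups",
  ident.1 = "g1",
  ident.2 = "g2",
  test.use = "wilcox",
  min.pct = 0.1,
  logfc.threshold = 0,
  verbose = FALSE
)

head(markers_rna)
```

With a real object, the ident.1 and ident.2 values would be the actual condition labels, as in the call from the original post.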

owenwilkins commented 4 years ago

Thanks for the heads up @diazdc, I must have missed that somehow on their FAQ page.

Jb-Gorin commented 4 years ago

Hi, this may be a naïve question, but if the point of integration is to remove batch effects between datasets, wouldn't running FindMarkers() on the "RNA" assay be equivalent to running DE without correcting for batch?

aelyaderani commented 4 years ago

Hi @Jb-Gorin, were you able to find an answer to your question? I have the same question but can't find the answer anywhere. :(

@satijalab

Jb-Gorin commented 4 years ago

> Hi @Jb-Gorin Were you able to find an answer to your question. I have the same question but can't find the answer anywhere. :(
>
> @satijalab

Unfortunately not :/

dlmatera commented 3 years ago

bump - I think it would at least be good to confirm whether our DE analysis will contain batch effects if we use the "RNA" assay after dataset integration
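
One option that may address the batch concern raised in this thread is to test on the "RNA" assay while passing the batch label as a covariate through the latent.vars argument of FindMarkers(), which is documented as being used only by the "LR", "negbinom", "poisson" and "MAST" tests. The sketch below is an illustration under assumptions, not a confirmed answer from the developers: pbmc_small stands in for a real integrated object, and the "batch" column is invented for the example.

```r
library(Seurat)

# Stand-in data: pbmc_small ships with Seurat.
obj <- pbmc_small
DefaultAssay(obj) <- "RNA"

# Hypothetical batch label, alternating across cells purely for illustration.
obj$batch <- rep(c("batch1", "batch2"), length.out = ncol(obj))

# Logistic-regression test with batch included as a latent variable.
markers_lr <- FindMarkers(
  obj,
  assay = "RNA",
  group.by = "groups",
  ident.1 = "g1",
  ident.2 = "g2",
  test.use = "LR",
  latent.vars = "batch",
  min.pct = 0.1,
  logfc.threshold = 0,
  verbose = FALSE
)

head(markers_lr)
```

Whether a covariate of this kind is an adequate substitute for integration depends on the experimental design, so the result should be checked against the uncorrected test rather than taken on trust.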

Volkan-Ergin commented 3 years ago

I was concerned about the same issue regarding FindMarkers() after SCT normalization, so I decided to use the RNA assay with the default Wilcoxon test to retrieve DEGs. It seems reliable, but I couldn't get any result from the RNA assay with MAST, because it consistently fails with an error in sanity_check (shown below) unless I use assay = "integrated". However, DESeq2 works well with the RNA assay.

Error in sanity_check_sca(obj) : Assay in position 1, with name et is unlogged. Set check_sanity = FALSE to override and then proceed with caution.
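
A possible explanation for the "unlogged" message, offered here as an assumption rather than a confirmed diagnosis, is that the data slot of the "RNA" assay still holds raw counts because only SCTransform was run on the object. If that is the case, log-normalizing the "RNA" assay before calling MAST may avoid the error. The sketch below uses pbmc_small and its "groups" column as stand-ins, and only runs the MAST step if the MAST package is installed.

```r
library(Seurat)

# Stand-in data: pbmc_small ships with Seurat.
obj <- pbmc_small

# Make sure the RNA assay holds log-normalized values before running MAST.
DefaultAssay(obj) <- "RNA"
obj <- NormalizeData(obj, verbose = FALSE)

# MAST is an optional Bioconductor dependency, so guard the call.
if (requireNamespace("MAST", quietly = TRUE)) {
  markers_mast <- FindMarkers(
    obj,
    assay = "RNA",
    group.by = "groups",
    ident.1 = "g1",
    ident.2 = "g2",
    test.use = "MAST",
    min.pct = 0.1,
    logfc.threshold = 0,
    verbose = FALSE
  )
  print(head(markers_mast))
}
```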

JiahuaQu commented 2 years ago

I have the same issue. I didn't need integration because the data I used came from a single batch, so no batch effect removal was needed, and I used SCTransform directly. Now I want to find the DEGs between two conditions (treatments). I didn't change any assay or slot when I used FindMarkers(). I found it strange that, whether I used wilcox or MAST, the p-values were always very small, close to 0. From the above discussion, I guess that we should specify the assay as RNA instead of SCT, but why? And which slot of the RNA assay should I use, and why? If I need to find the marker genes in each cluster and generate DimPlot, DoHeatmap, and DotPlot, should I also use the RNA assay? If we need to use the RNA assay and then perform log-normalization and ScaleData, why should we run SCTransform first? It looks as though SCTransform is not useful at all. Thank you.

radee2k commented 1 year ago

Hey,

Here you will find answers to perhaps all your questions: http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html. 😉