satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat
Other
2.31k stars 920 forks source link

Differential expression for one single gene #7456

Closed MarieCoutelier closed 1 year ago

MarieCoutelier commented 1 year ago

Hi, It's probably a really basic question here...

I understand how to use FindMarkers to find the DE genes between two conditions in my data.

However, I'm interested in finding out whether the expression of one given gene is different between two subpopulations of my Seurat object. How can I answer this question? Can I use FindMarkers while specifying features = "my_gene"? It gives me a value, but does it make sense to use it this way? If so, is the default Wilcoxon test appropriate? And what does the p_val_adj corresponds to then, since I'm only comparing one gene and in my understanding, it should be the same as the p_val while using Bonferroni correction?

Here are some examples of results I get; I do get values, but I'm stuck on how to interpret them, and how to interpret the adjusted p value.

> FindMarkers(seurat_list[[1]], features = "Htt", ident.1 = 47, ident.2 = 40, logfc.threshold = 0)
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s  
        p_val avg_log2FC pct.1 pct.2 p_val_adj
Htt 0.2164881 0.05182677 0.538 0.598         1

> FindMarkers(seurat_list[[1]], features = "Cacna1a", ident.1 = 47, ident.2 = 40, logfc.threshold = 0)
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s  
               p_val  avg_log2FC pct.1 pct.2    p_val_adj
Cacna1a 2.166909e-09 -0.09083438   0.8 0.809 4.953554e-05

> FindMarkers(seurat_list[[1]], features = "Notch1", ident.1 = 47, ident.2 = 40, logfc.threshold = 0, min.pct = 0)
  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s  
              p_val  avg_log2FC pct.1 pct.2    p_val_adj
Notch1 8.269798e-19 -0.05789144 0.023 0.047 1.890476e-14

Thanks a lot! Marie

longmanz commented 1 year ago

Hi, Yes, you can definitely do DE for one gene at a time. The p_val_adj, on the other hand, is not what as you mentioned. It will be p_val * (total number of gene in your seurat_obj). We implemented this in this way to make sure the genes are truly "Bonferroni-corrected". Otherwise, the number of genes will be smaller after the filtering the genes with default logfc.threshold, min.pct, etc. .