YuLab-SMU / enrichplot

Visualization of Functional Enrichment Result
https://yulab-smu.top/biomedical-knowledge-mining-book/
234 stars 65 forks source link

dotplot shows only one term #270

Closed lbwfff closed 8 months ago

lbwfff commented 8 months ago

Hi, enrichplot developer.

I'm having an issue with clusterProfiler enrichment analysis and visualising it using enrichplot, the dotplot only shows the most enriched one term and I'm not sure why this is the case. Here is my code:

genelist<-bitr(list, fromType = "SYMBOL",toType = c( "ENTREZID"),OrgDb = org.Mm.eg.db)[,2]

kk.negative <- enrichKEGG(gene  = genelist,
                          organism = 'mmu',
                          pvalueCutoff = 0.05,
                          qvalueCutoff =1)
kk.negative<-setReadable(kk.negative,OrgDb = 'org.Mm.eg.db', keyType="ENTREZID")

dotplot(kk.negative,color = "pvalue", showCategory=10)

Here are the enrichment results:


> kk.negative@result[1:5,]
               ID
mmu05168 mmu05168
mmu04080 mmu04080
mmu04010 mmu04010
mmu04659 mmu04659
mmu04928 mmu04928
                                                                              Description
mmu05168                    Herpes simplex virus 1 infection - Mus musculus (house mouse)
mmu04080             Neuroactive ligand-receptor interaction - Mus musculus (house mouse)
mmu04010                              MAPK signaling pathway - Mus musculus (house mouse)
mmu04659                           Th17 cell differentiation - Mus musculus (house mouse)
mmu04928 Parathyroid hormone synthesis, secretion and action - Mus musculus (house mouse)
          GeneRatio  BgRatio       pvalue    p.adjust      qvalue
mmu05168 0.17567568 456/9467 4.482798e-05 0.007217305 0.007217305
mmu04080 0.10810811 391/9467 1.097983e-02 0.757428202 0.757428202
mmu04010 0.08108108 301/9467 2.991354e-02 0.757428202 0.757428202
mmu04659 0.04054054 105/9467 4.878844e-02 0.757428202 0.757428202
mmu04928 0.04054054 108/9467 5.227160e-02 0.757428202 0.757428202
                                                                                        geneID
mmu05168 Bcl2/Zfp12/Zfp78/Zfp418/Zfp619/Zfp109/Zfp709/Zfp617/Zfp748/Zfp729a/Irf9/Zfp870/Zfp275
mmu04080                                       Grin1/Oprd1/Npy5r/Pth1r/Grik2/Thrb/Rxfp3/Adra2a
mmu04010                                                    Csf1/Flt1/Nfatc3/Myc/Cacng2/Il1rap
mmu04659                                                                     Nfatc3/Ahr/Il1rap
mmu04928                                                                       Bcl2/Pld1/Pth1r
         Count
mmu05168    13
mmu04080     8
mmu04010     6
mmu04659     3
mmu04928     3

GO enrichment similarly shows only one term, Here is the GO enrichment code:

BP <- enrichGO(gene  = genelist,
               OrgDb   = org.Mm.eg.db,
               ont  = 'BP' , pAdjustMethod = "BH",
               pvalueCutoff  = 0.05, qvalueCutoff  = 1,
               readable      = TRUE)

p[[2]]<-
  dotplot(BP,color = "pvalue", showCategory=10)

The GO enrichment results are a bit more significant than the KEGG results (I'm not sure if it's because these term are not significant enough)

> BP@result[1:5,]
                   ID                                         Description
GO:0006338 GO:0006338                                chromatin remodeling
GO:0061138 GO:0061138             morphogenesis of a branching epithelium
GO:0048754 GO:0048754       branching morphogenesis of an epithelial tube
GO:0006346 GO:0006346 DNA methylation-dependent heterochromatin formation
GO:0001763 GO:0001763              morphogenesis of a branching structure
           GeneRatio   BgRatio       pvalue   p.adjust     qvalue
GO:0006338    14/214 489/28891 1.899346e-05 0.04818641 0.04442470
GO:0061138     9/214 229/28891 5.666300e-05 0.05450326 0.05024843
GO:0048754     8/214 191/28891 9.449664e-05 0.05450326 0.05024843
GO:0006346     4/214  33/28891 1.011932e-04 0.05450326 0.05024843
GO:0001763     9/214 249/28891 1.074168e-04 0.05450326 0.05024843
                                                                                             geneID
GO:0006338 Kdm5b/Ino80d/Zfp335/Chd7/Nasp/Bicra/Kcnq1ot1/Morc2a/Msl1/Smarce1/Tasor2/Kat6b/Myc/Morc2b
GO:0061138                                          Sulf1/Kdm5b/Bcl2/Nrarp/Csf1/Flt1/Nfatc3/Ahr/Myc
GO:0048754                                                Kdm5b/Bcl2/Nrarp/Csf1/Flt1/Nfatc3/Ahr/Myc
GO:0006346                                                               Kcnq1ot1/Morc2a/Myc/Morc2b
GO:0001763                                          Sulf1/Kdm5b/Bcl2/Nrarp/Csf1/Flt1/Nfatc3/Ahr/Myc
           Count
GO:0006338    14
GO:0061138     9
GO:0048754     8
GO:0006346     4
GO:0001763     9

This is the dotplot output

Screenshot 2024-03-19 at 8 50 09 PM

Why is this happening and how can I fix it

Thanks, LeeLee

guidohooiveld commented 8 months ago

This is because in the dotplot only significant gene sets are plotted; and the significance in both your analyses are based on the value of p.adjust.

In your first code chunk, in which you called the gseKEGG function, you did NOT specify the argument pAdjustMethod, only pvalueCutoff = 0.05. Since by default pAdjustMethod = "BH", significance is thus based on Benjamini-Hochberg adjusted p-values (i.e. column p.adjust), and thus not the 'raw' p-values (column pvalue). Note that the qvalue is another, related metric!

As you can see, in your enrichKEGG result there is only 1 gene set that has a value of p.adjust that is smaller than 0.05! Namely: 0.007217305.

The same applies to your enrichGO analysis; you set indeed the argument pAdjustMethod = "BH", and as cutoff value pvalueCutoff = 0.05. Only 1 gene set has p.adjust <0.05, namely 0.04818641.

So: either set the argument pAdjustMethod to "none", or increase the value of the argument pvalueCutoff.

From ?enrichKEGG: pAdjustMethod -- one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".

lbwfff commented 8 months ago

Ok I see, I had previously thought that dotplot was only relevant to kk.negative@result results, thank you!