YuLab-SMU / DOSE

Disease Ontology Semantic and Enrichment analysis
https://yulab-smu.top/biomedical-knowledge-mining-book/

Parameter "universe" is not properly handled in "enricher" function #74

Closed soumitrakp closed 11 months ago

soumitrakp commented 1 year ago

Please see the following example where I have two terms and I call enricher on a gene group.

term2gene <- read.csv(text = "term,gene
ta,a
ta,d
ta,f
ta,x
ta,y
tb,b
tb,c
tb,e
tb,g
tb,h")

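library(clusterProfiler)  # provides enricher()

# Define the background first so that length(universe) below resolves
universe <- letters[1:8]

res <- enricher(c('a','c','d','f','x')
                , universe = universe
                , pAdjustMethod = "BH"
                , minGSSize = 1
                , maxGSSize = length(universe)
                , pvalueCutoff = 1.0
                , qvalueCutoff = 1.0
                , TERM2GENE = term2gene
                , TERM2NAME = NA)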

Calling as.data.frame(res) gives the following output:

   ID Description GeneRatio BgRatio    pvalue  p.adjust    qvalue geneID Count
ta ta          ta       3/5     3/8 0.1785714 0.3571429 0.3571429  a/d/f     3
tb tb          tb       1/5     5/8 1.0000000 1.0000000 1.0000000      c     1

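In the output above, enricher correctly discards elements outside universe when computing BgRatio, but not when computing GeneRatio: the query gene x is not in universe, yet it is still counted in the GeneRatio denominator. I expected GeneRatio for ta to be |{a,d,f}|/|{a,c,d,f}| = 3/4.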
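To make the effect of the denominator concrete, the sketch below recomputes the p-value for ta by hand with base R's phyper. It assumes the test is an upper-tail hypergeometric test, which is consistent with the reported value of 0.1785714. The helper name ora_pvalue is hypothetical and not part of DOSE or clusterProfiler.

```r
# Upper-tail hypergeometric probability P(X >= k).
#   k: query genes annotated to the term
#   M: background genes annotated to the term
#   N: background (universe) size
#   n: query size
# ora_pvalue is a hypothetical helper written for this illustration.
ora_pvalue <- function(k, M, N, n) {
  phyper(k - 1, M, N - M, n, lower.tail = FALSE)
}

# As reported for ta: GeneRatio 3/5, BgRatio 3/8
p_reported <- ora_pvalue(k = 3, M = 3, N = 8, n = 5)

# With x dropped from the query: GeneRatio 3/4, BgRatio 3/8
p_expected <- ora_pvalue(k = 3, M = 3, N = 8, n = 4)

print(p_reported)  # 0.1785714, the pvalue shown for ta above
print(p_expected)  # 0.07142857
```

So the choice of denominator changes the result, not just the displayed ratio: under this assumption the p-value for ta would drop from 10/56 to 5/70.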

GuangchuangYu commented 1 year ago

see https://github.com/YuLab-SMU/DOSE/issues/73#issuecomment-1532547982.

markziemann commented 1 year ago

Can you provide an example enricher() command using this option? Thanks

GuangchuangYu commented 11 months ago

Run the following command before you call enricher:

options(enrichment_force_universe = TRUE)
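For readers who want the full example requested earlier in the thread, a minimal sketch follows. It reuses the data from the original report and sets the option before the enricher call, as advised. It assumes clusterProfiler is installed (the guard skips the call otherwise). No output is shown because the result was not verified here.

```r
# Set the option first; it must be in effect when enricher() runs.
options(enrichment_force_universe = TRUE)

# Same toy annotation as in the original report
term2gene <- data.frame(
  term = rep(c("ta", "tb"), each = 5),
  gene = c("a", "d", "f", "x", "y", "b", "c", "e", "g", "h")
)
universe <- letters[1:8]
genes <- c("a", "c", "d", "f", "x")

# Guarded so the script still runs where clusterProfiler is not installed
if (requireNamespace("clusterProfiler", quietly = TRUE)) {
  res <- clusterProfiler::enricher(
    genes,
    universe = universe,
    pAdjustMethod = "BH",
    minGSSize = 1,
    maxGSSize = length(universe),
    pvalueCutoff = 1.0,
    qvalueCutoff = 1.0,
    TERM2GENE = term2gene
  )
  print(as.data.frame(res))
}
```

Options set this way last for the current R session only. Calling options(enrichment_force_universe = NULL) unsets the option again.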