tim-peterson closed 3 years ago
Thanks @tim-peterson, I share the same concern. I was also wondering which background set is used in the Fisher's exact test.
FWIW, the answer can be found in https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-14-128#Sec2 (unless it's changed since then), which begins:
Computing enrichment
Enrichr implements three approaches to compute enrichment. The first one is a standard method implemented within most enrichment analysis tools: the Fisher exact test.
@EidrianGM - Presumably the background is the entire set against which enrichment is being calculated
Presumably yes indeed, @malcook, but in detail, what is the entire set? All the genes from UniProt? Ensembl? HGNC? NCBI? Only protein-coding genes? The whole genome, or only the genes annotated in each database/gene set?
I agree with @EidrianGM. It would be good to include information on the background gene sets for each of the tests. Better yet I’d love to get access to the gene sets so we could adjust them in our own way.
We used 20K as a hard-coded value for the background for the Fisher Exact Test. All the libraries are available for download from here: https://maayanlab.cloud/Enrichr/#stats
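For anyone who wants to reproduce the numbers with an adjusted background, here is a minimal sketch of a Fisher exact test against a fixed background of 20,000 genes. Only the 20K background and the use of the Fisher exact test come from this thread. The function name `fisher_enrichment`, the one-sided (over-representation) alternative, and the plain sample odds ratio are my assumptions and may not match Enrichr's actual code.

```python
from math import comb

BACKGROUND_SIZE = 20_000  # hard-coded background size reported in this thread


def fisher_enrichment(overlap, query_size, term_size, background=BACKGROUND_SIZE):
    """Return (odds_ratio, p_value) for a one-sided Fisher exact test.

    Hypothetical helper: the one-sided alternative and the sample odds
    ratio are assumptions, not confirmed Enrichr behavior.
    """
    # 2x2 contingency table of gene counts
    a = overlap                                        # in query and in term
    b = query_size - overlap                           # in query only
    c = term_size - overlap                            # in term only
    d = background - query_size - term_size + overlap  # in neither
    if min(a, b, c, d) < 0:
        raise ValueError("counts are inconsistent with the background size")

    # Right tail of the hypergeometric distribution: P(X >= overlap)
    total = comb(background, query_size)
    max_overlap = min(query_size, term_size)
    tail = sum(
        comb(term_size, k) * comb(background - term_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    p_value = tail / total  # exact integer ratio, rounded once to float

    # Sample odds ratio; infinite when an off-diagonal cell is empty
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, p_value
```

For example, `fisher_enrichment(overlap=12, query_size=150, term_size=300)` tests a 150-gene query against a 300-gene term. Passing a different `background` shows how sensitive the p-value is to that choice, which is the concern raised above.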
Thanks so much for producing Enrichr. It's an amazing tool.
Every tab provides an odds ratio and a p-value. It would be good for the FAQ to explain which statistics go into those calculations. It would make Enrichr less of a black box.
Thanks again!