benfulcher / GeneCategoryEnrichmentAnalysis

Toolbox for performing gene set enrichment analysis in Matlab (including ensemble enrichment)
GNU General Public License v3.0

Inefficient computation #1

Open benfulcher opened 4 years ago

benfulcher commented 4 years ago

The main outer loop in ComputeAllCategoryNulls runs across GO categories, with an inner loop computing the relevant statistic (e.g., a pairwise correlation) for each gene annotated to that category. For genes annotated to multiple categories, the identical correlation is recomputed once per category. It would be much faster to compute the statistic once for every gene first, and then aggregate the per-gene values into categories.
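
A minimal sketch of the proposed restructuring, written in Python purely for illustration (the toolbox itself is Matlab, and none of this is its actual code). The function names, the toy gene and category labels, and the choice of the mean as the category-level aggregate are all assumptions made for the example; the real aggregation used by ComputeAllCategoryNulls may differ.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def score_all_genes(expression, phenotype, score_fn=pearson):
    """Step 1: compute the per-gene statistic exactly once per gene."""
    return {gene: score_fn(values, phenotype)
            for gene, values in expression.items()}


def aggregate_categories(gene_scores, categories):
    """Step 2: look up precomputed scores and aggregate per category.

    The mean is an assumed aggregate, chosen only for this sketch.
    """
    result = {}
    for category, genes in categories.items():
        matched = [gene_scores[g] for g in genes if g in gene_scores]
        result[category] = mean(matched) if matched else float("nan")
    return result


# Hypothetical toy data: g1 is annotated to both categories,
# but its correlation is still computed only once.
expression = {
    "g1": [1, 2, 3, 4],
    "g2": [4, 3, 2, 1],
    "g3": [1, 3, 2, 4],
}
phenotype = [1, 2, 3, 4]
categories = {
    "category_A": ["g1", "g2"],
    "category_B": ["g1", "g3"],
}

gene_scores = score_all_genes(expression, phenotype)
category_scores = aggregate_categories(gene_scores, categories)
```

With this layout the number of correlation computations scales with the number of genes rather than with the total number of gene-to-category annotations. For an ensemble of null phenotypes, the same two steps would presumably be repeated per null, with step 2 reduced to cheap lookups.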