How to rank gene lists

ftwkoopmans / goat

GOAT: efficient and robust identification of gene set enrichment

Apache License 2.0

You can rank by either p-values or log2fc values. In general, ranking proteins/genes by their effect size (log2fc) yields more significant gene sets than ranking by p-value. This is because p-values carry no information on the direction of regulation, while in practice most pathways are co-regulated, either predominantly up or predominantly down (see Figure 4 in the GOAT paper).
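To make the point about direction concrete, here is a small base R sketch. The gene symbols and values are invented for illustration, and the plain sorting shown is only meant to show what each ranking can and cannot see; it is not GOAT's internal scoring.

```r
# Toy data, invented for illustration only (not from any real experiment)
genes <- data.frame(
  symbol = c("A", "B", "C", "D", "E", "F"),
  log2fc = c(2.1, -1.9, 1.5, -0.2, 0.1, -1.2),
  pvalue = c(0.001, 0.002, 0.01, 0.8, 0.9, 0.03),
  stringsAsFactors = FALSE
)

# Ranking by p-value: strongly up- and down-regulated genes end up side by side
by_pvalue <- genes[order(genes$pvalue), ]

# Ranking by log2fc: up-regulated genes at one end, down-regulated at the other
by_effectsize <- genes[order(genes$log2fc, decreasing = TRUE), ]

print(by_pvalue$symbol)      # "A" "B" "C" "F" "D" "E"
print(by_effectsize$symbol)  # "A" "C" "E" "D" "F" "B"
```

In the p-value ranking, genes A (up) and B (down) sit next to each other at the top, so a gene set that is consistently down-regulated is not separated from one that is consistently up-regulated.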

As described in the documentation (this GitHub repo's main page), you'll need to put the gene effectsizes (in your case, log2fc values) in a column named "effectsize" and their unadjusted p-values in a column named "pvalue". With this input data, you can choose how to rank your genes using the "score_type" parameter of the function test_genesets():

a. rank by p-value: test_genesets( ... , score_type = "pvalue")
b. rank by effectsize: test_genesets( ... , score_type = "effectsize")
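As a minimal sketch of the column setup described above, the following base R snippet renames the columns of a differential-expression table. The table `de_table` and its column names (`gene_symbol`, `log2fc`, `p_unadjusted`) are hypothetical placeholders for your own data. The gene list may need further columns beyond the two discussed here, so check the documentation on the repo's main page before running the test.

```r
# Hypothetical input table; replace with your own differential-expression results
de_table <- data.frame(
  gene_symbol  = c("GENE1", "GENE2", "GENE3"),
  log2fc       = c(1.8, -0.4, -2.3),
  p_unadjusted = c(0.0004, 0.35, 0.0001),
  stringsAsFactors = FALSE
)

# Copy the values into the column names mentioned above:
# "effectsize" for the log2fc values, "pvalue" for the unadjusted p-values
genelist <- data.frame(
  symbol     = de_table$gene_symbol,
  effectsize = de_table$log2fc,
  pvalue     = de_table$p_unadjusted,
  stringsAsFactors = FALSE
)

print(colnames(genelist))

# Then pick the ranking via score_type (remaining arguments elided, as above):
# result <- test_genesets( ... , score_type = "effectsize")
```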

How to rank gene lists #5