ctlab / fgsea

Fast Gene Set Enrichment Analysis

Ranking genes from Sleuth with aggregated p-value #95

Closed StephenRicher closed 3 years ago

StephenRicher commented 3 years ago

Hi,

I have been using Sleuth to compute transcript-level abundance and differential expression. The transcript-level p-values can be aggregated to identify differentially expressed genes, and I would like to feed these genes into fgsea.

The issue is that I only have the p-values to go by, as opposed to a directional test statistic. As I understand it, the reason for not providing gene-level abundances is that transcripts of the same gene may go up and down (or cancel each other out), so direction is not very meaningful after aggregation.

I know something similar has been raised before (#47), and I know that a new scoreType option has been added since then. My idea was to rank genes by -log(pvalue) and set scoreType='pos', so that fgsea can be run on the raw aggregated p-values. Would this be an appropriate method?
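The ranking step proposed above can be sketched as follows. This is only an illustration in Python (fgsea itself is an R package): the function name rank_by_neg_log_p, the floor parameter, and the gene names are hypothetical, and the base-10 logarithm is an assumption, since the post does not specify a base.

```python
import math


def rank_by_neg_log_p(pvalues, floor=1e-300):
    """Convert gene -> p-value into gene -> -log10(p), sorted descending.

    `floor` is a hypothetical safeguard: p-values reported as exactly 0
    are clamped so that the logarithm stays finite.
    """
    scores = {}
    for gene, p in pvalues.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"p-value for {gene} is outside [0, 1]: {p}")
        # Adding 0.0 turns the -0.0 produced by p == 1 into a plain 0.0.
        scores[gene] = -math.log10(max(p, floor)) + 0.0
    # Most significant genes (largest score) come first.
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


# Made-up aggregated gene-level p-values, for illustration only.
example = {"GENE_A": 0.001, "GENE_B": 0.5, "GENE_C": 1.0, "GENE_D": 0.0}
ranked = rank_by_neg_log_p(example)
```

Every score produced this way is non-negative, which is why a one-sided score type is the natural fit. In R, the resulting named vector would be passed as the stats argument of fgsea together with scoreType='pos'.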

Thanks very much, and thank you for developing such a great tool, Stephen

assaron commented 3 years ago

Hi Stephen,

Technically this sounds OK, but it could be a bit hard to interpret. Normal GSEA gives you a direction of change (up- or down-regulation), but in your case the result would indicate de-regulation, with no direction attached. Still, some benchmarks show that a de-regulation statistic actually gives a better ranking and better control of false positives than directional statistics, and maybe it should even be the preferred option.
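To make the "de-regulation" reading concrete, here is a minimal Python sketch of a one-sided (positive-only) running-sum enrichment statistic. It is not fgsea's implementation, just the general GSEA-style idea under simplifying assumptions; the function name positive_enrichment_score, the weight parameter, and the toy gene names are all hypothetical.

```python
def positive_enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Largest positive deviation of a GSEA-style running sum.

    ranked_genes: genes ordered from highest to lowest score.
    scores: gene -> non-negative statistic, e.g. -log10(p).
    gene_set: genes in the pathway being tested.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    hit_total = sum(
        abs(scores[gene]) ** weight
        for gene, hit in zip(ranked_genes, in_set)
        if hit
    )
    miss_count = len(ranked_genes) - sum(in_set)
    if hit_total == 0 or miss_count == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for gene, hit in zip(ranked_genes, in_set):
        if hit:
            # Step up in proportion to the gene's share of the set's total weight.
            running += abs(scores[gene]) ** weight / hit_total
        else:
            # Step down by a constant amount for every gene outside the set.
            running -= 1.0 / miss_count
        best = max(best, running)
    return best


# Toy ranking: g1 is the most significant gene, g4 the least.
genes = ["g1", "g2", "g3", "g4"]
stat = {"g1": 4.0, "g2": 3.0, "g3": 2.0, "g4": 1.0}
top_heavy = positive_enrichment_score(genes, stat, {"g1", "g2"})
bottom_heavy = positive_enrichment_score(genes, stat, {"g3", "g4"})
```

A set concentrated among the most significant genes gets a high score, while a set concentrated at the bottom scores near zero. With p-value-derived ranks, a high score therefore means "enriched for significant genes" and says nothing about the direction of change.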

StephenRicher commented 3 years ago

Hi @assaron,

Ok thanks very much, I will proceed with caution!