lcolladotor / derfinder

Annotation-agnostic differential expression analysis of RNA-seq data via expressed regions-level or single base-level approaches
http://lcolladotor.github.io/derfinder

How Fstats are generated in analyzeChr() #33

Closed MonicaVara closed 9 years ago

MonicaVara commented 9 years ago

Hi,

I am using derfinder to compare two groups of samples (31 vs 17), and I want to optimize (and understand) my statistics. However, it is not clear to me how the statistics are computed in analyzeChr().

As it is based on F-modelling, is cutoffFstat the alpha or the Fstat threshold itself?

In the example I also see that when you switch cutoffType from theoretical to empirical, cutoffFstat changes from 1e-08 to 0.99, and I don't understand why.

Additionally, I would like to know which statistical method is used to calculate FWER, as I cannot find it anywhere.

Could you clarify all these aspects to me, please?

Thank you in advance.

Mónica

lcolladotor commented 9 years ago

Hi Mónica,

When comparing two groups of samples you will end up calculating t-statistics instead of F-statistics. With the default cutoffType = 'theoretical', the argument cutoffFstat is the alpha. If you set cutoffType = 'manual', then cutoffFstat is the actual threshold.
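To make the three interpretations concrete, here is a rough sketch in Python. It is not derfinder's code: `resolve_cutoff` and its `upper_tail_quantile` parameter are hypothetical names, and the sketch assumes the documented behaviour that 'empirical' takes a quantile of the observed F-statistics while 'theoretical' converts a tail probability into a threshold through the F distribution (in R that would be a call to qf, which the caller has to supply here).

```python
def resolve_cutoff(fstats, cutoff_fstat, cutoff_type, upper_tail_quantile=None):
    """Turn cutoffFstat into a threshold on the F-statistic scale (illustration only)."""
    if cutoff_type == "manual":
        # The value already is the threshold.
        return cutoff_fstat
    if cutoff_type == "empirical":
        # The value is a quantile level (e.g. 0.99) of the observed F-statistics.
        if not 0 < cutoff_fstat < 1:
            raise ValueError("empirical cutoff must be strictly between 0 and 1")
        ordered = sorted(fstats)
        # Linear interpolation between neighbouring order statistics.
        pos = cutoff_fstat * (len(ordered) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])
    if cutoff_type == "theoretical":
        # The value is an upper-tail probability (e.g. 1e-08). The caller
        # supplies the F quantile function (assumption: a stand-in for R's qf).
        if upper_tail_quantile is None:
            raise ValueError("theoretical cutoff needs an F quantile function")
        return upper_tail_quantile(cutoff_fstat)
    raise ValueError("unknown cutoff_type: " + repr(cutoff_type))
```

So the same argument can be a probability, a quantile level, or a raw threshold, which is why its typical values differ so much between cutoff types.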

The example uses a very small subset of data without a strong signal, which is why I use 0.99 there: with cutoffType = 'empirical', cutoffFstat is the quantile of the observed F-statistics that is used as the cutoff. It's just for showing how to use the functions, not a real application. The actual applications are at http://leekgroup.github.io/derSoftware/ which is the supplementary website for the derfinder paper. See bioRxiv for the pre-print.

For the FWER method, check this paper as well as the bioRxiv pre-print. Basically, we compare the observed areas against the maximum area per permutation to control the FWER.
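The max-area comparison can be sketched as follows. This is a minimal Python illustration, not derfinder's implementation: the function name is hypothetical, the inputs are assumed to be plain lists of region areas, and counting ties with `>=` is an assumption.

```python
def fwer_adjusted_pvalues(observed_areas, permuted_areas):
    """FWER-adjusted p-value per observed region via max-area permutations.

    observed_areas: areas of the candidate regions found in the real data.
    permuted_areas: one list of region areas per permutation (may be empty).
    """
    if not permuted_areas:
        raise ValueError("at least one permutation is required")
    # Largest null area seen in each permutation (0 if it produced no region).
    max_per_perm = [max(areas, default=0.0) for areas in permuted_areas]
    n_perm = len(max_per_perm)
    # Fraction of permutations whose largest area reaches the observed area.
    return [
        sum(m >= area for m in max_per_perm) / n_perm
        for area in observed_areas
    ]
```

Because each observed area is compared with the largest null area of a whole permutation, a region only gets a small adjusted p-value if it beats the best region that chance produced in almost every permutation.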

Best, Leonardo