mlr-archive / mlr-tutorial

The mlr package online tutorial
http://mlr-org.github.io/mlr/

Wrong formula for FDR #120

Closed alvinthai closed 6 years ago

alvinthai commented 7 years ago

https://github.com/mlr-org/mlr-tutorial/blob/76157380287fca598a4e564c3e700386e7ff734a/devel/html/measures/index.html#L416

FDR isn't a straightforward formula based on FP, TN, and FN. It's based on the expected number of FP divided by the actual number of FP. See this article for more calculation details: http://www.d.umn.edu/~rregal/documents/5411_2010/False_Discovery_Rates_Example.pdf

FDR is a metric for conceptualizing multiple comparisons; the note should be updated to reflect this understanding.

jakob-r commented 7 years ago

As you say, the FDR is mainly discussed as a metric to quantify the rate of Type I errors in multiple-comparison testing.

So the definition says:

Q = V/(V+S)
FDR = E(Q)

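But when we calculate the fdr measure on a set of predicted outcomes and the true outcomes, V (the number of false positives) and S (the number of true positives) are no longer random variables but fixed, observed counts. The expectation then drops out and the measure is simply V/(V+S), i.e. FP/(FP+TP).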
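To make the point concrete, here is a minimal Python sketch of the empirical quantity for one fixed set of predictions. The function name `false_discovery_rate`, the `positive` argument, and the toy labels are made up for illustration and are not taken from mlr. Returning 0 when nothing is predicted positive is an assumption that follows the usual convention Q = 0 when V + S = 0.

```python
def false_discovery_rate(y_true, y_pred, positive=1):
    """Empirical FDR of a fixed set of predictions: FP / (FP + TP)."""
    pairs = list(zip(y_true, y_pred))
    # V: predicted positive but actually negative (false discoveries)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    # S: predicted positive and actually positive (true discoveries)
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    discoveries = fp + tp
    # Assumed convention: Q = 0 when there are no discoveries at all
    if discoveries == 0:
        return 0.0
    return fp / discoveries


# Hypothetical labels: 3 true positives and 2 false positives
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0]
fdr = false_discovery_rate(y_true, y_pred)
print(fdr)  # 0.4
```

Because the labels and predictions are given, there is nothing random left to average over, so the value is just the observed ratio V/(V+S) = 2/5.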

pat-s commented 6 years ago

Can we close this @jakob-r ?

jakob-r commented 6 years ago

yes