Metrics now have an optional `minimal_effect` argument that is used to compute the sample size required to reach 80% power.

REST API example:
Python API example:
Btw, in the last PR (https://github.com/avast/ep-stats/pull/40) we talked about Bonferroni vs. Holm-Bonferroni correction. Holm-Bonferroni can be applied here because we already have the $p$-values. However, it would give each variant a very different `required_sample_size`, because the corrected $\alpha$ depends on where the variant's $p$-value ranks among the others. I think it's better to just stick with the classic Bonferroni and use the most conservative $\alpha$ for all variants so that the required sizes are equal.

Consider an example with 4 variants and $p$-values $p_B = 0.001, p_C = 0.005, p_D = 0.01$.
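
A rough sketch of how the two corrections play out for these $p$-values. This is not the ep-stats implementation: `per_variant_alpha` and `sample_size` are hypothetical helpers, the sample size uses the textbook two-sample z-test formula, and the family-wise $\alpha = 0.05$, the absolute `minimal_effect = 0.1`, and `std = 1.0` are assumed values chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist

ALPHA = 0.05  # assumed family-wise significance level
# p-values of the three treatment variants, each compared against control A
p_values = {"B": 0.001, "C": 0.005, "D": 0.01}


def per_variant_alpha(p_values, alpha, method):
    """Significance level each variant is tested against (hypothetical helper)."""
    m = len(p_values)
    if method == "bonferroni":
        # Same, most conservative level for every variant.
        return {name: alpha / m for name in p_values}
    if method == "holm":
        # Holm step-down thresholds: the smallest p-value gets alpha / m,
        # the next one alpha / (m - 1), and so on. (The early stopping of
        # the full procedure is ignored; every hypothesis is rejected here.)
        ranked = sorted(p_values, key=p_values.get)
        return {name: alpha / (m - rank) for rank, name in enumerate(ranked)}
    raise ValueError(f"unknown method: {method}")


def sample_size(alpha, minimal_effect=0.1, std=1.0, power=0.8):
    """Per-variant size for a two-sided two-sample z-test (assumed formula)."""
    z = NormalDist().inv_cdf
    return ceil(2 * ((z(1 - alpha / 2) + z(power)) * std / minimal_effect) ** 2)


bonferroni_sizes = {
    name: sample_size(a)
    for name, a in per_variant_alpha(p_values, ALPHA, "bonferroni").items()
}
holm_sizes = {
    name: sample_size(a)
    for name, a in per_variant_alpha(p_values, ALPHA, "holm").items()
}

print("Bonferroni:", bonferroni_sizes)
print("Holm:      ", holm_sizes)
```

Under these assumptions, Bonferroni yields one shared size for B, C, and D, while Holm yields a different size per variant, with the largest for the variant with the smallest $p$-value (B), where it coincides with the Bonferroni size.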