zalando / expan

Open-source Python library for statistical analysis of randomised control trials (A/B tests)
MIT License

fixing sample size estimation #188

Closed gbordyugov closed 6 years ago

gbordyugov commented 6 years ago

We still don't know how to handle multiple variants.

The old version had both n and r as parameters.

gbordyugov commented 6 years ago

Ok, we can run the sample size estimation against control for every variant (except the control itself). That gives us one sample size estimate per variant, and we then take the largest of them.
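A minimal sketch of that idea, not ExpAn's actual API: it assumes a two-sided two-sample z-test with a known common standard deviation, and the function names `sample_size_vs_control` and `sample_size_multi` are hypothetical. It applies no multiple-comparison correction to `alpha`.

```python
import math
from statistics import NormalDist


def sample_size_vs_control(delta, sigma, alpha=0.05, power=0.8):
    """Per-group sample size for one variant vs. control (hypothetical helper).

    Assumes a two-sided two-sample z-test with equal group sizes and a
    common, known standard deviation `sigma`; `delta` is the minimum
    absolute difference in means we want to detect.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


def sample_size_multi(min_effects, sigma, alpha=0.05, power=0.8):
    """Estimate against control for each variant, then take the largest.

    `min_effects` maps a variant name to its minimum detectable
    difference vs. control (the control itself is not included).
    """
    per_variant = {
        name: sample_size_vs_control(delta, sigma, alpha, power)
        for name, delta in min_effects.items()
    }
    return max(per_variant.values()), per_variant


# Two treatment variants compared against a single control.
largest, per_variant = sample_size_multi({"B": 0.1, "C": 0.2}, sigma=1.0)
print(largest, per_variant)
```

The variant with the smallest detectable effect drives the result, so the returned maximum is the per-group size that satisfies every comparison at once.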

What do you think?

gbordyugov commented 6 years ago

@shansfolder this method is called from our ExpanWeb service, and the data is currently saved alongside the other experiment results.

shansfolder commented 6 years ago

@gbordyugov I see. Then I would suggest discussing the signature (e.g. n is not used) and properly documenting this method.

One more question: if there are more variants, do we run this method multiple times and take the sum?

gbordyugov commented 6 years ago

@shansfolder will do

As to your second question, see my comment above, the one ending with "What do you think?"

jbao commented 6 years ago

@gbordyugov I like your idea of running the sample size estimation iteratively for multivariate tests. It might also be possible to do the power calculation in terms of the chi-squared statistic (instead of the t- or z-statistic), though I'm not quite sure.

PS: really impressed by the quick hot-fix;-)