zalando / expan

Open-source Python library for statistical analysis of randomised control trials (A/B tests)
MIT License

Group Sequential - Percentile Issue #176

Closed: louisryan closed this issue 6 years ago

louisryan commented 6 years ago

Hi there,

I have upgraded from ExpAn 0.6.2 to 0.6.5, and upon re-running the group sequential method, the percentiles show incorrect values:

[Screenshot taken 2017-12-20 at 16:58:01; image not available]

Upon downgrading, the issue has been resolved.

shansfolder commented 6 years ago

Hi @louisryan, by "incorrect" do you mean that the percentiles become 0 and 100? This can happen when the current data size (2065 in your example) is much smaller than the estimated sample size you provided.

gbordyugov commented 6 years ago

@shansfolder I'm wondering why this problem appeared only after the version upgrade.

shansfolder commented 6 years ago

@louisryan this might actually be a bug fix rather than a problem :) Could you tell us what estimated sample size you are using, so that I can confirm it for you?

louisryan commented 6 years ago

@shansfolder, in my case the estimated sample size is 50000. After lowering it to 10000, the percentiles do change to 2.5/97.5. So is this expected behaviour? And is this the fix you put in place previously?

shansfolder commented 6 years ago

Hi @louisryan, yes, this is a fix. There was a bug fix on this line in 0.6.2, and it changed to this in 0.6.5.

shansfolder commented 6 years ago

@louisryan let me try to explain intuitively: when the current sample size is small (compared to the estimated sample size), we make the confidence interval very wide to be conservative. A very wide interval almost always covers 0, so it is almost impossible to conclude significance while the sample size is still small.
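To make the intuition concrete, here is a minimal sketch of how interval percentiles can collapse towards 0 and 100 when the sample is small. It assumes an O'Brien-Fleming-type alpha spending function and defines the information fraction as current size divided by estimated size. The thread confirms neither choice, and the function names `obrien_fleming_alpha` and `interval_percentiles` are hypothetical. This is not ExpAn's actual code, so the numbers need not match what the library reports.

```python
from math import erfc, sqrt
from statistics import NormalDist


def obrien_fleming_alpha(information_fraction, alpha=0.05):
    """Alpha spent so far under an O'Brien-Fleming-type spending function.

    Assumption: this spending function is used for illustration only.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)); erfc stays accurate in the far tail.
    return erfc(z_crit / sqrt(information_fraction) / sqrt(2))


def interval_percentiles(current_size, estimated_size, alpha=0.05):
    """Lower and upper percentiles of the adjusted confidence interval."""
    # Information fraction: how much of the planned sample has been seen (capped at 1).
    fraction = min(1.0, current_size / estimated_size)
    spent = obrien_fleming_alpha(fraction, alpha)
    return 100 * spent / 2, 100 - 100 * spent / 2


# Same current sample size, different estimated sample sizes.
for estimated in (50000, 10000, 2065):
    low, high = interval_percentiles(2065, estimated)
    print(estimated, low, high)
```

Under these assumptions, the percentiles equal 2.5/97.5 only once the information fraction reaches 1, and they move towards 0/100 as the estimated sample size grows relative to the data collected so far.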

louisryan commented 6 years ago

@shansfolder that makes sense, but do you not think that the confidence interval should be displayed regardless? With stop set to False?

I say this based on experience running experiments with one core success metric plus other metrics of interest that we want to monitor, which might only reach a fraction of the sample size. The fixed horizon approach would still yield confidence intervals for those metrics, even when they are not statistically significant (the interval still includes zero). With the fix in place, we lose the confidence interval representation.

Would love to know your thoughts

shansfolder commented 6 years ago

Hi @louisryan, my understanding is that if the group sequential method gives you stop equal to False, the results are not statistically valid (because only a fraction of the sample size has been reached). Therefore, we shouldn't use the confidence interval values in this case anyway.

But I admit that our result format is confusing. Let me know if you have any suggestions. :)

louisryan commented 6 years ago

Good point! :)

louisryan commented 6 years ago

Thanks for the clarification