zalando / expan

Open-source Python library for statistical analysis of randomised control trials (A/B tests)
MIT License
335 stars 50 forks source link

Percentiles value is lost during computing group_sequential_delta #108

Closed shansfolder closed 7 years ago

shansfolder commented 7 years ago

The parameter percentiles is lost during the following stack of calls: -> experiment.group_sequential_delta (has input para percentiles) -> early_stopping.group_sequential (lost input para percentiles) -> statistics.normal_difference (has para percentiles again, but now we can only use the default value since the percentiles value user passed will be lost in previous step)

gbordyugov commented 7 years ago

@shansfolder I don't see Experiment.group_sequential_delta() having percentiles arguments: https://github.com/zalando/expan/blob/dev/expan/core/experiment.py#L492

shansfolder commented 7 years ago

Sorry I meant delta(). I assumed the argument list should be the same.

As a user, if delta() takes percentiles as para, (because fixed_horizon_delta need it), then I expect it is used in group_sequential_delta too. And after a few calls in the stack, group_sequential_delta does need this information in statistics.normal_difference.

gbordyugov commented 7 years ago

@shansfolder please check out the latest pull request — I tried to get rid of all intermediate parameter passing by capturing them in a closure deltaWorker (which, in turn, is generated by make_delta(...) in statistics.py. This approach however didn't work for the Bayesian methods, for they have a different argument signature and use a different statistics worker. We could try to think how to overcome this.

compare: https://github.com/zalando/expan/pull/110