Question about percentiles in Bayesian early stopping

zalando / expan

Open-source Python library for statistical analysis of randomised control trials (A/B tests)

MIT License


Question about percentiles in Bayesian early stopping #109

Closed shansfolder closed 7 years ago

shansfolder commented 7 years ago

As I understand from the code, the percentile values for the regular delta and the group sequential delta are based on t statistics. E.g. in the result object, "pctile" could be (2.5, 50, 97.5) and "value" would be the corresponding t statistics (0.1, 0.8, 0.1).

On the other hand, the percentile values for the two Bayesian deltas use a fixed 0.95 credible interval. In the result object, "pctile" is always ("lower", "upper") and "value" is the corresponding index of the posterior distribution.

I think these are two completely different concepts. If I understood correctly, should we put them into different columns in the result object?

shansfolder commented 7 years ago

OK, on second thought, they do have something in common at a higher level of abstraction. ;)

In both cases one can find the "value" of the random variable being tested, given a defined metric "pctile":

In frequentist approaches, "pctile" is a percentile of the test distribution,
whereas in Bayesian approaches, "pctile" is the lower/upper bound of a 0.95 credible interval of the test distribution.
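
To make the shared abstraction concrete, here is a minimal sketch that is not taken from the expan code base. The function names `frequentist_rows` and `bayesian_rows` are hypothetical, the frequentist part uses a normal approximation rather than the t distribution (to stay within the standard library), and the Bayesian part assumes an equal-tailed interval computed from posterior draws, which may differ from how the library derives its bounds.

```python
import random
from statistics import NormalDist, mean, stdev


def frequentist_rows(treatment, control, pctiles=(2.5, 50, 97.5)):
    """Percentiles of the sampling distribution of the delta.

    Assumption: normal approximation instead of the t distribution.
    """
    delta = mean(treatment) - mean(control)
    # Standard error of the difference of two independent sample means.
    se = (stdev(treatment) ** 2 / len(treatment)
          + stdev(control) ** 2 / len(control)) ** 0.5
    dist = NormalDist(delta, se)
    return [{"pctile": p, "value": dist.inv_cdf(p / 100)} for p in pctiles]


def bayesian_rows(posterior_delta_draws, mass=0.95):
    """Lower/upper bounds of an equal-tailed credible interval.

    Assumption: the posterior of the delta is available as a list of draws.
    """
    draws = sorted(posterior_delta_draws)
    tail = (1 - mass) / 2
    lower = draws[int(tail * (len(draws) - 1))]
    upper = draws[int((1 - tail) * (len(draws) - 1))]
    return [{"pctile": "lower", "value": lower},
            {"pctile": "upper", "value": upper}]


# Synthetic data, for illustration only.
random.seed(0)
control = [random.gauss(10.0, 2.0) for _ in range(500)]
treatment = [random.gauss(10.5, 2.0) for _ in range(500)]
freq = frequentist_rows(treatment, control)

# Stand-in for real posterior draws of the delta.
posterior = [random.gauss(0.5, 0.13) for _ in range(10_000)]
bayes = bayesian_rows(posterior)
```

Both functions return rows with the same two keys, "pctile" and "value", which is the commonality described above; only the meaning of "pctile" differs.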

mkolarek commented 7 years ago

Exactly. Sorry, I haven't had time to look at this sooner. @gbordyugov has worked on a new concept for the Results structure, so these issues will be taken into account, but for now I would not mess with it much. For testing the early stopping methods, I believe the current implementation should suffice.

shansfolder commented 7 years ago

Closed, since the structure of the result object is being planned.