zalando / expan

Open-source Python library for statistical analysis of randomised control trials (A/B tests)
MIT License

Infinitely large confidence intervals produced by group_sequential_delta() #172

Closed gbordyugov closed 7 years ago

gbordyugov commented 7 years ago

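Sometimes the confidence interval comes back with percentiles of 0% and 100% and bounds of ±1.7976931348623157e+308 (the largest finite double), i.e. it is effectively infinite: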

{
  "kpis": [
    {
      "variants": [
        {
          "delta_statistics": {
            "treatment_mean": 0.06499121465525656,
            "stop": false,
            "treatment_sample_size": 528934,
            "control_sample_size": 543608,
            "delta": -0.0014944855291734171,
            "confidence_interval": [
              {
                "value": -1.7976931348623157e+308,
                "percentile": 0
              },
              {
                "value": 1.7976931348623157e+308,
                "percentile": 100
              }
            ],
            "statistical_power": 0.9631769752638585,
            "control_mean": 0.06648570018442998
          },
          "name": "YYY"
        }
      ]
    }
  ],
  "errors": [],
  "control_variant": "XXX",
  "expan_version": "0.6.3",
  "warnings": []
}
shansfolder commented 7 years ago

@gbordyugov This can happen in the very early days of an experiment, when the information fraction is too low. It is simply impossible to stop in that case, which leads to 0% and 100% percentiles and an infinitely large interval.

gbordyugov commented 7 years ago

@shansfolder I see, but in this case we have a sample size of about 5x10^5 and an estimated sample size of around 10^7, which puts the current sample size at around 5% of the estimate. Do you think that's too small?

shansfolder commented 7 years ago

Hmm, I see. The alpha spending function considers it too small, then. What's your use case? We can discuss in person. I think I can come tomorrow. ;)

shansfolder commented 7 years ago

The investigation shows that the infinitely large confidence interval is produced when alpha is almost zero. With the O'Brien-Fleming alpha spending function, alpha is almost zero whenever the information fraction is below 40%.

Since we have chosen the most conservative spending function, this makes sense.
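
To make the effect concrete, here is a minimal sketch of a two-sided O'Brien-Fleming-type (Lan-DeMets) spending function, alpha(t) = 2 * (1 - Phi(z_{alpha/2} / sqrt(t))), evaluated at a few information fractions t. It assumes an overall alpha of 0.05 and uses only the Python standard library. The function names are hypothetical and this is not necessarily how expan implements it. The naive variant is included only to illustrate how floating-point rounding can turn a tiny alpha into exactly zero, which may be how the 0% and 100% percentiles arise; the thread does not confirm that detail.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sigma 1


def obrien_fleming_alpha(information_fraction, alpha=0.05):
    """Alpha spent at information fraction t (hypothetical helper).

    Two-sided O'Brien-Fleming-type spending:
        alpha(t) = 2 * (1 - Phi(z_{alpha/2} / sqrt(t)))
    """
    if not 0.0 < information_fraction <= 1.0:
        raise ValueError("information_fraction must be in (0, 1]")
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    x = z / math.sqrt(information_fraction)
    # 2 * (1 - Phi(x)) == erfc(x / sqrt(2)); erfc stays accurate in the far tail.
    return math.erfc(x / math.sqrt(2.0))


def naive_alpha(information_fraction, alpha=0.05):
    """Same formula via 1 - cdf, which loses the far tail to rounding."""
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * (1.0 - _STD_NORMAL.cdf(z / math.sqrt(information_fraction)))


def interval_percentiles(information_fraction, alpha=0.05):
    """Lower and upper percentiles of the interval implied by the spent alpha."""
    spent = obrien_fleming_alpha(information_fraction, alpha)
    return 100.0 * spent / 2.0, 100.0 - 100.0 * spent / 2.0


if __name__ == "__main__":
    # 5e5 / 1e7 = 0.05 is the information fraction quoted earlier in the thread.
    for t in (5e5 / 1e7, 0.2, 0.4, 0.6, 1.0):
        low, high = interval_percentiles(t)
        print(f"t={t:.2f}  alpha={obrien_fleming_alpha(t):.3e}  "
              f"naive={naive_alpha(t):.3e}  percentiles=({low:.3e}, {high!r})")
```

Under these assumptions, the alpha spent at an information fraction of 0.05 is far below double-precision machine epsilon (about 2.2e-16), so the naive `1 - cdf` form rounds to exactly zero and the upper percentile rounds to 100. The spent alpha only reaches the full 0.05 at an information fraction of 1.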

Notebook for investigation: https://github.bus.zalan.do/axolotl/experimentation-library/blob/master/alpha-spending-function-too-small-alpha/investigate_very_large_confidence_intervals.ipynb