zalando / expan

Open-source Python library for statistical analysis of randomised control trials (A/B tests)
MIT License

Re-wrote chi-square, removed dropping buckets. #234

Closed daryadedik closed 6 years ago

daryadedik commented 6 years ago

As discussed, I removed category dropping (we no longer drop any categories). min_counts is now an input parameter (I set the default to 5), but we will decide afterwards in ADS which threshold to use. I also added the number of filtered entities per variant to exp.metadata.

coveralls commented 6 years ago

Coverage increased (+0.02%) to 92.473% when pulling c95697db139eb78bc204344bf1a831c170db200b on improve-chi-square into 2b7bc27c5a2736ffbd7dcde1f442cf7bcd731464 on master.

shansfolder commented 6 years ago

@daryadedik I think I need a meeting in person to understand this PR ;)

daryadedik commented 6 years ago

OK, as discussed, I restored the previous chi-square implementation (nothing changed there) apart from some small code beautification, and removed category dropping (we no longer drop any categories). min_counts is now an input parameter (I set the default to 5), but we will decide afterwards in ADS which threshold to use.

daryadedik commented 6 years ago

Sorry for mixing up the PR. I also added the number of filtered entities per variant to exp.metadata. We will need that information instead of computing the filtered number on the ADS side.
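
For readers following the thread, here is a minimal sketch of what a chi-square homogeneity test that keeps every category, with min_counts as an input parameter, might look like. It is not expan's implementation: the function name chi_square_sketch, its return values, and the choice to only report (rather than drop) categories below min_counts are assumptions made for illustration, and both samples are assumed to be non-empty.

```python
from collections import Counter


def chi_square_sketch(control, treatment, min_counts=5):
    """Chi-square homogeneity statistic for two categorical samples.

    Hypothetical helper, not expan's API. No category is dropped;
    categories whose observed count is below min_counts in either
    sample are only reported, so the caller can decide what to do.
    Assumes both samples are non-empty.
    """
    counts_c, counts_t = Counter(control), Counter(treatment)
    categories = sorted(set(counts_c) | set(counts_t))
    n_c, n_t = len(control), len(treatment)
    total = n_c + n_t

    statistic = 0.0
    low_count = []
    for cat in categories:
        obs_c, obs_t = counts_c[cat], counts_t[cat]
        if min(obs_c, obs_t) < min_counts:
            low_count.append(cat)  # flagged, but still part of the test
        # Expected counts under the null hypothesis of equal distributions.
        pooled = obs_c + obs_t
        exp_c = n_c * pooled / total
        exp_t = n_t * pooled / total
        statistic += (obs_c - exp_c) ** 2 / exp_c
        statistic += (obs_t - exp_t) ** 2 / exp_t

    dof = len(categories) - 1
    return statistic, dof, low_count


stat, dof, low = chi_square_sketch(
    ["a"] * 20 + ["b"] * 2,
    ["a"] * 10 + ["b"] * 12,
)
print(stat, dof, low)
```

Returning the low-count categories instead of filtering them leaves the threshold decision to the caller, which is the direction described above. If SciPy is available, a p-value can be obtained from the returned values with scipy.stats.chi2.sf(stat, dof).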