wayfair / pylift

Uplift modeling package.
BSD 2-Clause "Simplified" License

Balancing for small control group #40

Open shaddyab opened 4 years ago

shaddyab commented 4 years ago

If my control group is ~2% of the overall data and the treatment group is ~98% (i.e., p = 0.98), should the training data be balanced so that it has a 50/50 split between control and treatment? Otherwise, the negative inverse probability weight multiplier for the control group (-1/(1-p) = -50) will be much larger in magnitude than the positive inverse probability weight multiplier for the treatment group (1/p ≈ 1.02).
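To make the arithmetic concrete, here is a minimal sketch of the two multipliers at p = 0.98 and at a balanced p = 0.5. It assumes the standard transformed-outcome form, where the outcome is scaled by 1/p for treated rows and by -1/(1-p) for control rows. The function names `ipw_multipliers` and `transformed_outcome` are hypothetical and are not part of pylift's API.

```python
def ipw_multipliers(p):
    """Return (treatment, control) multipliers for treatment probability p.

    Assumes the standard transformed-outcome weighting:
    1/p for treated rows and -1/(1-p) for control rows.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return 1.0 / p, -1.0 / (1.0 - p)


def transformed_outcome(y, w, p):
    """Scale outcome y by the multiplier matching treatment flag w (1 or 0)."""
    treat_mult, control_mult = ipw_multipliers(p)
    return y * (treat_mult if w == 1 else control_mult)


# The split described in the question versus a balanced 50/50 split.
imbalanced = ipw_multipliers(0.98)
balanced = ipw_multipliers(0.5)

for label, (treat_mult, control_mult) in [("p=0.98", imbalanced), ("p=0.50", balanced)]:
    ratio = abs(control_mult) / treat_mult
    print(f"{label}: treatment={treat_mult:.2f}, control={control_mult:.2f}, "
          f"|control|/treatment={ratio:.1f}")
```

With a 50/50 split, both multipliers have magnitude 2, so the ratio between them is 1. At p = 0.98, each control row carries roughly 49 times the weight of a treatment row.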