Bayer-Group / pybalance

A library for minimizing the effects of confounding covariates
BSD 3-Clause "New" or "Revised" License

Approximation for Standardized Difference #26

Open sprivite opened 1 week ago

sprivite commented 1 week ago

The linear optimizer cannot optimize the standardized mean difference (SMD) directly, but there is a good approximation it can use, namely:

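$$
\mathrm{SMD} = \frac{\mathrm{mean}(\mathrm{pool}) - \mathrm{mean}(\mathrm{target})}{\sqrt{\mathrm{var}(\mathrm{pool}) + \mathrm{var}(\mathrm{target})}} \approx \frac{\mathrm{mean}(\mathrm{pool}) - \mathrm{mean}(\mathrm{target})}{\sqrt{2\,\mathrm{var}(\mathrm{target})}} = \frac{\text{absolute mean difference}}{\sqrt{2\,\mathrm{var}(\mathrm{target})}}
$$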

When distance(pool, target) >> 1, this is a poor approximation. It becomes increasingly good as pool --> target, because var(pool) then approaches var(target), and that is the direction the optimizer is moving in anyway.
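To get a feel for how the error behaves, here is a minimal numerical sketch. It assumes the SMD definition written above (denominator sqrt(var(pool) + var(target))), uses synthetic normal data, and the helper names `smd_exact` and `smd_approx` are made up for illustration; they are not part of pybalance.

```python
import numpy as np


def smd_exact(pool, target):
    """SMD as written in this issue: mean difference over sqrt(var(pool) + var(target))."""
    return (pool.mean() - target.mean()) / np.sqrt(pool.var(ddof=1) + target.var(ddof=1))


def smd_approx(pool, target):
    """Approximation whose denominator depends on the target only."""
    return (pool.mean() - target.mean()) / np.sqrt(2.0 * target.var(ddof=1))


# Synthetic data (illustrative only): one pool far from the target, one close to it.
rng = np.random.default_rng(0)
target = rng.normal(loc=0.0, scale=1.0, size=5000)
far_pool = rng.normal(loc=1.0, scale=3.0, size=5000)
near_pool = rng.normal(loc=0.05, scale=1.05, size=5000)

# Absolute gap between the exact value and the approximation in each case.
err_far = abs(smd_exact(far_pool, target) - smd_approx(far_pool, target))
err_near = abs(smd_exact(near_pool, target) - smd_approx(near_pool, target))

print(f"error, pool far from target:  {err_far:.4f}")
print(f"error, pool near target:      {err_near:.4f}")
```

The gap shrinks as the pool's mean and variance approach the target's, which is the regime the optimizer ends up in.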

Implement this as an option to ConstraintSatisfactionMatcher.

sprivite commented 1 week ago

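This only works when the target size is fixed to the input target size. In that case var(target) is a constant that can be computed up front; if the target is also subsampled, the denominator depends on the selection.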
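A rough sketch of why a fixed target helps: once mean(target) and var(target) are constants, the approximate SMD of a selected pool subset is a linear function of the 0/1 selection indicators. This sketch additionally assumes the number of selected pool rows is fixed, and the function `linear_smd_coefficients` is hypothetical; it does not reflect how ConstraintSatisfactionMatcher is actually implemented.

```python
import numpy as np


def linear_smd_coefficients(pool_values, target_values, n_selected):
    """Coefficients c and offset b such that approx SMD = c @ x - b.

    x is a 0/1 vector selecting exactly n_selected pool rows. The target is
    treated as fixed, so its mean and variance are plain constants.
    """
    scale = np.sqrt(2.0 * np.var(target_values, ddof=1))
    coeffs = np.asarray(pool_values, dtype=float) / (n_selected * scale)
    offset = np.mean(target_values) / scale
    return coeffs, offset


# Toy data (illustrative only).
pool_values = np.array([1.0, 2.0, 4.0, 7.0, 8.0])
target_values = np.array([2.0, 3.0, 5.0])
selection = np.array([1, 0, 0, 1, 1])  # hypothetical solver output

coeffs, offset = linear_smd_coefficients(pool_values, target_values, selection.sum())
linear_value = coeffs @ selection - offset

# Same quantity computed directly from the selected rows.
chosen = pool_values[selection == 1]
direct_value = (chosen.mean() - target_values.mean()) / np.sqrt(
    2.0 * target_values.var(ddof=1)
)

print(linear_value, direct_value)
```

If the target rows were also decision variables, `scale` and `offset` would no longer be constants and the expression would stop being linear in the selection.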