facebookresearch / balance

The balance Python package offers a simple workflow and methods for dealing with biased data samples when inferring from them to a target population of interest.
https://import-balance.org
GNU General Public License v2.0
681 stars 40 forks

[FEATURE] Add a warning message to Balance when trying to run very large/imbalanced weights #84

Open talgalili opened 2 months ago

talgalili commented 2 months ago

Add a warning message to balance when trying to fit weights on very large/imbalanced data (say, more than about 100k cases and a population frame that's at least 10x the sample). The thinking here is that, if the population is >10x the sample, then nearly all of the standard error in comparing sample vs. population comes from the sample rather than the population. With equal variances in both groups, the variance of the difference in means is proportional to 1/n_sample + 1/n_population, so at a 10x ratio the population contributes only about 1/11 (roughly 9%) of it. It’s comparable to a 1-sample t-test rather than a 2-sample t-test.

Idea from: Ben Mainwaring
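
A minimal sketch of what such a check could look like, using only the standard library. Everything here is an assumption for discussion: `warn_if_large_imbalance` and `target_variance_share` are hypothetical helpers, not part of balance's API, and the thresholds are just the 100k / 10x numbers floated above. The sketch reads "100k cases" as the size of the population frame, and the variance share assumes equal variance and simple random sampling in both groups.

```python
import warnings

# Thresholds taken from the numbers floated in this issue; both are assumptions.
LARGE_TARGET_ROWS = 100_000
IMBALANCE_RATIO = 10


def target_variance_share(n_sample: int, n_target: int) -> float:
    """Share of Var(mean_sample - mean_target) that comes from the target,
    assuming equal variance and simple random sampling in both groups."""
    return (1 / n_target) / (1 / n_sample + 1 / n_target)


def warn_if_large_imbalance(n_sample: int, n_target: int) -> bool:
    """Hypothetical helper (not part of balance): warn when the target is
    both large and much bigger than the sample. Returns True if it warned."""
    if n_sample <= 0 or n_target <= 0:
        raise ValueError("n_sample and n_target must be positive")
    if n_target > LARGE_TARGET_ROWS and n_target >= IMBALANCE_RATIO * n_sample:
        share = target_variance_share(n_sample, n_target)
        warnings.warn(
            f"Target ({n_target:,} rows) is {n_target / n_sample:.1f}x the "
            f"sample ({n_sample:,} rows); it contributes only about "
            f"{share:.0%} of the variance of the sample-vs-target comparison.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

For example, `warn_if_large_imbalance(20_000, 200_000)` would emit the warning, while `warn_if_large_imbalance(20_000, 50_000)` would stay silent. Where such a check would hook into balance, and what the warning should recommend, is left open.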

talgalili commented 2 months ago

TODO (thoughts):