datadesk / census-data-aggregator

Combine U.S. census data responsibly
MIT License

Add methods from ACS General Handbook Ch 8 #7

Closed nkrishnaswami closed 5 years ago

nkrishnaswami commented 5 years ago

Add methods for estimating MOE for

palewire commented 5 years ago

Thank you. I think we will ultimately accept all of these additions to the library. I just want to brush up on my own understanding of the math, expand the documentation a little and strengthen the unit tests.

I've pulled down your PR to a branch and pushed some incremental updates to the repo. They don't show up here in the PR but maybe we can use this space to walk through them together.

The proportion documentation in the ACS handbook uses the single female population in suburban Virginia as its example. I'd like to add those values to your unittest. I've done so here, but I suspect I'm misunderstanding what the expected result should be. What am I missing?

palewire commented 5 years ago

I'm also curious about the reason for multiplying by 1.0 here. Doesn't the result come out the same regardless?

nkrishnaswami commented 5 years ago

The ratio and proportion code was missing a squaring of the denominator. I've fixed that and updated the test cases to check the examples in the Handbook.

palewire commented 5 years ago

Thanks for the updates.

How would you feel about adding `from __future__ import division` at the top of the script to provide backward compatibility for division of integers?
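
For readers unfamiliar with the integer-division concern, here is a minimal sketch, assuming Python 3, where `//` reproduces what Python 2 did for `int / int`. The variable names and values are made up for illustration and are not taken from the PR.

```python
# Illustrative values only (not from the PR or the Handbook).
numerator_estimate = 7
denominator_estimate = 2

# Python 2 evaluated int / int as floor division; Python 3's // does the same.
py2_style_quotient = numerator_estimate // denominator_estimate

# Multiplying by 1.0 first forces a float, so the division is exact on either version.
forced_float_quotient = 1.0 * numerator_estimate / denominator_estimate

# On Python 3 (or Python 2 with the __future__ import), / is already true division.
true_division_quotient = numerator_estimate / denominator_estimate

print(py2_style_quotient, forced_float_quotient, true_division_quotient)
```

With true division in effect, the `1.0 *` factor no longer changes the result, which is why the import could replace it.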

palewire commented 5 years ago

[Image: Screenshot from 2019-07-01 08-22-22]

As for the proportion formula, am I correct that an equivalent implementation would be something like:

squared_proportion_moe = numerator_moe**2 - (proportion_estimate**2 * denominator_moe**2)
proportion_moe = (1.0 / denominator_estimate) * math.sqrt(squared_proportion_moe)
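
A quick way to check that two ways of writing the formula agree is to run both on the same inputs. The sketch below assumes Python 3 and uses made-up input values (not the Handbook's example). The second form, which takes the square root of the whole expression divided by the squared denominator, is a hypothetical alternative for comparison, not necessarily what the PR contains.

```python
import math

# Illustrative inputs only (not the Handbook example).
numerator_estimate, numerator_moe = 200, 30
denominator_estimate, denominator_moe = 1000, 50

proportion_estimate = numerator_estimate / denominator_estimate

# Form proposed in the comment above: divide by the denominator outside the root.
squared_proportion_moe = numerator_moe**2 - (proportion_estimate**2 * denominator_moe**2)
proportion_moe = (1.0 / denominator_estimate) * math.sqrt(squared_proportion_moe)

# Hypothetical alternative: divide by the squared denominator inside the root.
proportion_moe_alt = math.sqrt(squared_proportion_moe / denominator_estimate**2)

print(proportion_moe, proportion_moe_alt)
```

Both forms are algebraically the same because the square root of the squared denominator is the denominator itself, provided the denominator is positive.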

palewire commented 5 years ago

Educate me. What is the functional difference between the proportion and the ratio code? The plus instead of the minus?
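
To make the comparison concrete, here is a minimal side-by-side sketch of the two approximations as they are discussed in this thread. The function names are hypothetical and are not the library's API. The fallback to the ratio formula when the value under the root is negative follows my reading of the Census guidance and should be treated as an assumption.

```python
import math


def approximate_ratio_moe(num, num_moe, den, den_moe):
    """Hypothetical helper: ratio MOE, with a plus sign under the root."""
    ratio = num / den
    return math.sqrt(num_moe**2 + ratio**2 * den_moe**2) / den


def approximate_proportion_moe(num, num_moe, den, den_moe):
    """Hypothetical helper: proportion MOE, with a minus sign under the root."""
    proportion = num / den
    radicand = num_moe**2 - proportion**2 * den_moe**2
    if radicand < 0:
        # Assumption: fall back to the ratio formula when the radicand is negative.
        return approximate_ratio_moe(num, num_moe, den, den_moe)
    return math.sqrt(radicand) / den


# Illustrative inputs only (not the Handbook example).
ratio_moe = approximate_ratio_moe(200, 30, 1000, 50)
proportion_moe = approximate_proportion_moe(200, 30, 1000, 50)
fallback_moe = approximate_proportion_moe(200, 10, 1000, 100)
print(ratio_moe, proportion_moe, fallback_moe)
```

In this sketch the only difference between the two helpers is the sign in front of the denominator term, plus the fallback branch.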

palewire commented 5 years ago

I've also noticed a percent change formula in the handbook. Do you think we should consider adding that too, or not?

nkrishnaswami commented 5 years ago

Whoops, missed the percent change section. I'd be happy to make another PR for that.
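
As a rough idea of what such an addition might look like, here is a sketch that assumes the percent change MOE is 100 times the ratio MOE of the later estimate over the earlier one. That is my understanding of the Handbook's approach and should be checked against it. The function name and inputs are hypothetical.

```python
import math


def approximate_percent_change(old, old_moe, new, new_moe):
    """Hypothetical helper returning (percent change, approximate MOE)."""
    percent_change = 100 * (new - old) / old
    # Assumption: treat new / old as a ratio and scale its MOE by 100.
    ratio = new / old
    ratio_moe = math.sqrt(new_moe**2 + ratio**2 * old_moe**2) / old
    return percent_change, 100 * ratio_moe


# Illustrative inputs only (not from the Handbook).
percent_change, percent_change_moe = approximate_percent_change(1000, 50, 1200, 60)
print(percent_change, percent_change_moe)
```

Since the percent change is just 100 times the ratio minus a constant, the constant drops out of the margin of error under this assumption.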

nkrishnaswami commented 5 years ago

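Re proportion vs ratio, the first-order Taylor approximation for the variance of a function of independent random variables X and Y (so we can ignore the covariance term) is

$$\operatorname{Var}\big(f(X,Y)\big) \approx \left(\frac{\partial f}{\partial X}\right)^2 \operatorname{Var}(X) + \left(\frac{\partial f}{\partial Y}\right)^2 \operatorname{Var}(Y)$$

I haven't chugged through the algebra, but I expect the sign difference between ratio and proportion comes from whether f(X,Y) is [equation image: ratio var approx] or [equation image: proportion var approx].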
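
For the ratio case the algebra is short enough to write out. The sketch below assumes $f(X,Y) = X/Y$ with independent $X$ and $Y$, and writes $R = X/Y$. It covers only the ratio and does not derive the proportion case.

```latex
\begin{aligned}
\frac{\partial f}{\partial X} &= \frac{1}{Y}, \qquad
\frac{\partial f}{\partial Y} = -\frac{X}{Y^{2}} \\
\operatorname{Var}(R) &\approx \frac{1}{Y^{2}}\operatorname{Var}(X)
  + \frac{X^{2}}{Y^{4}}\operatorname{Var}(Y)
  = \frac{1}{Y^{2}}\left[\operatorname{Var}(X) + R^{2}\operatorname{Var}(Y)\right] \\
\operatorname{MOE}(R) &\approx \frac{1}{Y}\sqrt{\operatorname{MOE}(X)^{2} + R^{2}\,\operatorname{MOE}(Y)^{2}}
\end{aligned}
```

The last line follows because each MOE is the same constant multiple of the corresponding standard error, so the constant factors out of the square root. This reproduces the plus sign in the ratio code.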