UDST / synthpop

Synthetic populations from census data
BSD 3-Clause "New" or "Revised" License

Rounding problem for Census BG marginals #51

Open semcogli opened 4 years ago

semcogli commented 4 years ago

Right now, "Census_helper" downloads Tract controls and splits them into Block Group controls when only Tract summaries are available. However, the "_scale_and_merge" function uses astype(int) to convert the results of the final division. Because astype(int) truncates the fractional part instead of rounding, the Block Group marginal totals can end up not matching across categories. I noticed there is a comment saying "round?" (line 47), but the rounding was never implemented. Is there a reason for that?
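The effect can be reproduced in isolation. The sketch below is not synthpop's code: the helper name `scale_to_block_groups`, its parameters, and the tract-level category counts are made up for illustration. It only assumes that the scaling step divides each tract-level count by the tract total and multiplies by the Block Group total, as described above.

```python
import pandas as pd


def scale_to_block_groups(tract_counts, tract_total, bg_totals, do_round):
    """Split tract-level category counts across block groups.

    Hypothetical helper for illustration only, not part of synthpop.
    """
    columns = {}
    for bg, bg_total in bg_totals.items():
        scaled = tract_counts / tract_total * bg_total
        if do_round:
            scaled = scaled.round()
        # astype(int) drops the fractional part of whatever it receives.
        columns[bg] = scaled.astype(int)
    return pd.DataFrame(columns)


# Made-up tract-level counts for a single category.
tract_counts = pd.Series({"none": 100, "one": 700, "two_plus": 944})
tract_total = tract_counts.sum()  # 1744
# Block group household totals, taken from the figures in this issue.
bg_totals = {"bg1": 869, "bg2": 598, "bg3": 277}

truncated = scale_to_block_groups(tract_counts, tract_total, bg_totals, do_round=False)
rounded = scale_to_block_groups(tract_counts, tract_total, bg_totals, do_round=True)

print(truncated.sum())  # bg1 867, bg2 597, bg3 275
print(rounded.sum())    # bg1 869, bg2 598, bg3 277
```

With these inputs, truncation loses one or two households per Block Group, while rounding first recovers the totals. Rounding each cell independently does not guarantee matching totals in general, so an occasional off-by-one can remain.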


Test case: State 26, County 125, Tract 165100, BG (1,2,3), where hh cars and hh workers only have Tract summaries. Summed to the county level, those 2 categories come out thousands of households short.

Current method:

category          BG 1  BG 2  BG 3
hh_age_of_head     869   598   277
hh_cars            866   596   275
hh_children        869   598   277
hh_income          869   598   277
hh_race_of_head    869   598   277
hh_size            869   598   277
hh_workers         867   597   275
hispanic_head      869   598   277

Round first, then astype(int) (much better):

category          BG 1  BG 2  BG 3
hh_age_of_head     869   598   277
hh_cars            869   598   277
hh_children        869   598   277
hh_income          869   598   277
hh_race_of_head    869   598   277
hh_size            869   598   277
hh_workers         869   597   277
hispanic_head      869   598   277
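A quick way to spot such mismatches is to compare every category's total against the most common total in each Block Group. The sketch below uses the figures reported above for the current method. The labels "bg1" to "bg3" and the choice of the per-column mode as the reference value are assumptions made for illustration.

```python
import pandas as pd

# Totals per category for the current method, as reported in this issue.
# Column labels bg1..bg3 are assumed to correspond to BG 1, 2 and 3.
current = pd.DataFrame(
    {
        "hh_age_of_head": [869, 598, 277],
        "hh_cars": [866, 596, 275],
        "hh_children": [869, 598, 277],
        "hh_income": [869, 598, 277],
        "hh_race_of_head": [869, 598, 277],
        "hh_size": [869, 598, 277],
        "hh_workers": [867, 597, 275],
        "hispanic_head": [869, 598, 277],
    },
    index=["bg1", "bg2", "bg3"],
).T  # rows: categories, columns: block groups

# Use the most common total in each block group as the reference.
reference = current.mode().iloc[0]

# Keep only categories that differ from the reference in any block group.
mismatched = current[(current != reference).any(axis=1)]
print(mismatched)  # lists hh_cars and hh_workers
```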