UDST / synthpop

Synthetic populations from census data
BSD 3-Clause "New" or "Revised" License

Mismatch between `hh` and `p` tables in the number of persons for 2018 results #60

Closed: PyMap closed this issue 4 years ago

PyMap commented 4 years ago

While allocating households and persons from block groups to blocks, we found differences between the synthetic tables.


To sum up, the household table often reports more persons than the persons table contains. The household table also has more `household_idx` values than the persons table.

We already checked consistency between the ACS tracts and the pums2018 files, and every value has a matching puma10 file by state. At the moment, our strongest hypothesis is that we are missing serial numbers during the synthesis.

Of the cases checked, only state 05, county 001 was synthesized correctly.
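
A minimal sketch of how the mismatch described above could be measured with pandas. The column names `household_id` (shared key) and `persons` (household size) are assumptions made for illustration, not necessarily the names synthpop writes, and the two small frames are made-up data.

```python
import pandas as pd


def compare_tables(hh, p, hh_key="household_id", hh_size_col="persons"):
    """Compare household-reported sizes with row counts in the persons table.

    Column names are hypothetical; adjust them to the actual output tables.
    Assumes hh_key is unique in the household table.
    """
    # Persons each household claims to contain.
    expected = hh.set_index(hh_key)[hh_size_col]
    # Persons actually present per household (0 if the household is absent).
    actual = p.groupby(hh_key).size().reindex(expected.index, fill_value=0)
    diff = expected - actual
    return {
        "persons_in_hh": int(expected.sum()),
        "persons_in_p": int(len(p)),
        "households_missing_from_p": int((actual == 0).sum()),
        "mismatched_households": diff[diff != 0],
    }


# Made-up example: household 2 has no rows in the persons table.
hh = pd.DataFrame({"household_id": [0, 1, 2], "persons": [2, 1, 3]})
p = pd.DataFrame({"household_id": [0, 0, 1]})
report = compare_tables(hh, p)
print(report)
```

Households that appear in `mismatched_households` with a positive difference are the ones whose persons were lost, which is the pattern a dropped or mangled serial number would produce.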

PyMap commented 4 years ago

Solved by casting `SERIALNO` and `serialno` in the `_read_csv` function from the census helpers. The new serial numbers include group quarters, so their dtype is no longer integer.
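
A hedged sketch of the kind of cast described above, not the actual synthpop code. It assumes the cast is to string, and `read_pums_csv` and the sample serial values are hypothetical. Forcing the dtype at read time keeps alphanumeric serials and purely numeric ones in the same representation, so household and person records can still be joined on the serial number.

```python
import io

import pandas as pd

# Both spellings of the serial number column mentioned in the fix.
SERIAL_COLS = ("SERIALNO", "serialno")


def read_pums_csv(text):
    """Read PUMS-like CSV text, keeping serial number columns as strings.

    Hypothetical helper for illustration only.
    """
    # Read only the header to see which serial columns are present.
    header = pd.read_csv(io.StringIO(text), nrows=0).columns
    dtypes = {col: str for col in SERIAL_COLS if col in header}
    return pd.read_csv(io.StringIO(text), dtype=dtypes)


# Illustrative values: one alphanumeric serial, one all-digit serial.
sample = "SERIALNO,NP\n2018GQ0000012,1\n0000031,3\n"
df = read_pums_csv(sample)
print(df.dtypes)
print(df["SERIALNO"].tolist())
```

Without the explicit dtype, a file containing only digit serials would be parsed as integers and lose its leading zeros, so its keys would no longer match a file read as strings.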