Closed kaelgreco closed 7 years ago
When running the example notebook, Doppelganger.ipynb, the population output in step 03 seems incorrect.
Pandas DataFrame for first 5 people:
         tract  serial_number  repeat_index    age  sex  individual_income
138842  422209           4431             0    65+    M                <=0
138843  422209           4431             1    65+    M                <=0
138897  422209           4431             0  35-64    F                <=0
138898  422209           4431             1  35-64    F            100000+
 54123  422209          12930             0  35-64    M        40000-80000
Pandas DataFrame for first 5 households:
         tract  serial_number  repeat_index  num_people  household_income  num_vehicles
     0  422209           4431             0           2           <=40000           1.0
     1  422209           4431             1           2            40000+           1.0
 90603  422209           4431             0           2            40000+           2.0
 90604  422209           4431             1           2            40000+           2.0
181206  422209           4431             0           2            40000+           2.0
I expected to see sequential, non-duplicated repeat indices for each (tract, serial_number) pair in households, e.g. the repeat_index column above would be 0, 1, 2, 3, 4.
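A minimal sketch of how this could be checked with pandas. The `households` frame here is rebuilt by hand from the five household rows reported in this issue, and the variable names are my own, not part of the notebook. It assumes that repeat_index is meant to be unique within each (tract, serial_number) pair, which is the expectation described here rather than confirmed behavior.

```python
import pandas as pd

# The five household rows from the notebook output, re-entered by hand.
households = pd.DataFrame(
    {
        "tract": [422209] * 5,
        "serial_number": [4431] * 5,
        "repeat_index": [0, 1, 0, 1, 0],
    },
    index=[0, 1, 90603, 90604, 181206],
)

# Rows whose (tract, serial_number, repeat_index) combination occurs more than once.
key = ["tract", "serial_number", "repeat_index"]
duplicated = households[households.duplicated(subset=key, keep=False)]

# The sequential index expected under the assumption above: 0, 1, 2, ... per pair.
expected = households.groupby(["tract", "serial_number"]).cumcount()
mismatched = households[households["repeat_index"] != expected]

print(duplicated)
print(mismatched)
```

On these rows every household shares its repeat_index with at least one other row, and the last three rows differ from the sequential numbering.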