djsutherland/pummeler
Utilities to analyze ACS PUMS files, especially for distribution regression / ecological inference
MIT License
Featurization issues #6
Open: djsutherland opened 8 years ago
djsutherland commented 8 years ago
MIGPUMA has joint meaning with MIGSP; same for POWPUMA / POWSP.
Why does RELP come up so much in the ridge models? What does it mean in practice?
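One way to act on the MIGPUMA / MIGSP point is to treat the pair as a single categorical key, since a PUMA code on its own is only unique within a state. The following is a minimal sketch, not pummeler's actual featurization: `joint_place_key` is a hypothetical helper, and the code values are made up for illustration.

```python
def joint_place_key(state, puma, sep="_"):
    """Return one categorical key for a (state, PUMA) pair.

    Hypothetical helper: if either half is blank (None), the joint key
    is blank too, so the pair is never split into two unrelated features.
    """
    if state is None or puma is None:
        return None
    return f"{state}{sep}{puma}"


# Made-up example rows; real PUMS codes will differ.
rows = [
    {"MIGSP": "006", "MIGPUMA": "00100"},
    {"MIGSP": "036", "MIGPUMA": "00100"},  # same PUMA code, different state
    {"MIGSP": None, "MIGPUMA": None},      # blank: did not move
]
keys = [joint_place_key(r["MIGSP"], r["MIGPUMA"]) for r in rows]
```

The same helper would apply to POWSP / POWPUMA. The two rows that share a PUMA code end up with different keys, which is the point of keying on the pair.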
djsutherland commented 7 years ago
CITWP, YOEP, JWMNP: mean-coding blanks might not be the right thing, since blank means the person was born in the US / doesn't work
MLP* (when served in military) could probably be simplified; VPS does that
NWAB, NWAV, NWLA, NWLK, NWRE are recoded into ESR
RELP (relationship to reference person) is kind of a weird feature
hierarchical featurization for ANC_P / FOD_P / INDP / NAICSP / OCCP / SOCP?
for NAICS/SOC see https://www.census.gov/people/io/methodology/indexes.html
merge ANC1P/ANC2P, RAC1P/RAC2P/...?
re-featurize JWAP/JWDP to be circular?
Things that refer to specific in-US places: MIGPUMA, MIGSP, POBP, POWPUMA, POWSP
POVPIP: pretty sharp featurization difference between 500 and 501
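Three of the notes above (blank coding for CITWP / YOEP / JWMNP, circular JWAP / JWDP, and the POVPIP jump at 501) could be sketched roughly as below. This is an illustration under assumptions, not what pummeler does: all function names are hypothetical, JWAP / JWDP are assumed to have already been converted from their category codes to minutes after midnight, and 501 is assumed to be a top code meaning "501 percent or more".

```python
import math

MINUTES_PER_DAY = 24 * 60


def indicator_coded(value, fill=0.0):
    """Return (value_or_fill, is_blank).

    Instead of mean-coding a blank, keep a separate indicator so the model
    can learn that 'not applicable' (born in the US, doesn't work) is its
    own state rather than an average person.
    """
    if value is None:
        return fill, 1.0
    return float(value), 0.0


def circular_time_features(minutes_after_midnight):
    """Map a time of day onto the unit circle so 23:59 and 00:00 are adjacent."""
    wrapped = minutes_after_midnight % MINUTES_PER_DAY
    angle = 2 * math.pi * wrapped / MINUTES_PER_DAY
    return math.sin(angle), math.cos(angle)


def topcoded(value, top=501):
    """Return (capped_value, is_topcoded) for a variable whose largest code
    means 'this much or more' (assumed here for POVPIP at 501)."""
    return float(min(value, top)), 1.0 if value >= top else 0.0


blank_commute = indicator_coded(None)      # someone with no commute
late_night = circular_time_features(1439)  # 23:59
midnight = circular_time_features(0)       # 00:00
rich = topcoded(501)
```

With the sine/cosine pair, the distance between `late_night` and `midnight` is tiny, whereas the raw values 1439 and 0 are as far apart as possible. The top-code indicator gives a linear model a separate coefficient for the "or more" group instead of forcing it onto the same slope as 0 to 500.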