Open markbauby opened 2 days ago
Interesting. We don't do any post-processing of the PUMS data like that in tidycensus, so whatever you're seeing is what's coming through the Census API.
You can take a look here: https://api.census.gov/data/2022/acs/acs5/pums?get=SERIALNO%2CSPORDER%2CWGTP%2CPWGTP%2CAGEP%2CESR%2CWAGP%2CPERNP%2CSCHL&ucgid=0400000US26
Perhaps the IPUMS team does some post-processing of the data?
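To inspect the raw values the API returns, without tidycensus in between, something like the following sketch may help. It assumes the `jsonlite` package is installed and that the endpoint returns its usual array-of-arrays JSON with the header in the first row; `census_json_to_df` is a hypothetical helper name, not part of tidycensus.

```r
library(jsonlite)

# Hypothetical helper: turn the Census API's JSON (an array of arrays,
# header row first) into a data frame of character columns.
# `json` may be a JSON string or a URL; jsonlite::fromJSON accepts both.
census_json_to_df <- function(json) {
  raw <- fromJSON(json)                      # character matrix
  df <- as.data.frame(raw[-1, , drop = FALSE], stringsAsFactors = FALSE)
  names(df) <- raw[1, ]
  df
}

# Usage (needs network access; the URL is the one linked above):
# url <- "https://api.census.gov/data/2022/acs/acs5/pums?get=SERIALNO%2CSPORDER%2CWGTP%2CPWGTP%2CAGEP%2CESR%2CWAGP%2CPERNP%2CSCHL&ucgid=0400000US26"
# pums_raw <- census_json_to_df(url)
# head(sort(unique(as.numeric(pums_raw$WAGP))), 20)
```

If the rounded values already show up in `pums_raw`, the rounding is happening upstream of tidycensus.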
I've noticed a possible issue with the wage data in microdata pulled through tidycensus.
Long story short, I have been working with microdata using the IPUMSR and tidycensus packages. While I can get close to similar results, it looks like some of the variables in tidycensus are rounded, specifically "WAGP" and "PERNP". Their equivalents in IPUMSR ("INCWAGE" and "INCEARN") are not.
Is this a bug/error in tidycensus or is it from user error on my part?
My code is below.
IPUMSR segment
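library(ipumsr)
library(dplyr)  # for %>% and filter()

# Define an IPUMS USA extract for the us2022c sample
ipums_extract_test <- define_extract_micro(
  collection = "usa",
  description = "USA extract for API vignette",
  samples = c("us2022c"),
  variables = c("AGE", "STATEFIP", "EMPSTAT", "INCWAGE", "INCEARN", "us2022c_schl")
)

# Submit the extract, wait for it to finish, then download and read it
ipums_data <- ipums_extract_test %>%
  submit_extract() %>%
  wait_for_extract() %>%
  download_extract() %>%
  read_ipums_micro()

# Michigan (FIPS 26), age 16+, employed, SCHL codes 1 through 21
ipums_test <- ipums_data %>%
  filter(STATEFIP == 26 & AGE >= 16 & EMPSTAT == 1 & US2022C_SCHL %in% 1:21)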
tidycensus segment
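library(tidycensus)
library(dplyr)  # for %>% and filter()

# Michigan, 2022 5-year ACS PUMS
tidy_test <- get_pums(
  year = 2022,
  survey = "acs5",
  state = "MI",
  variables = c("AGEP", "ESR", "WAGP", "PERNP", "SCHL")
) %>%
  # ESR 1, 2, 4, 5 are the employed categories (civilian and armed forces)
  filter(AGEP >= 16 & ESR %in% c(1, 2, 4, 5) & SCHL %in% 1:21)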
Thank you.
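One way to put a number on the rounding described above is to compare what share of non-zero values are exact multiples of 10 (or 100) in each source. The sketch below uses base R only; `share_multiple_of` is a hypothetical helper name, and it assumes the wage columns can be coerced with `as.numeric()`.

```r
# Hypothetical helper: share of non-missing, non-zero values in `x`
# that are exact multiples of `base`. Returns NA if nothing is left
# after dropping zeros and missing values.
share_multiple_of <- function(x, base = 10) {
  x <- as.numeric(x)
  x <- x[!is.na(x) & x != 0]
  if (length(x) == 0) return(NA_real_)
  mean(x %% base == 0)
}

# Usage with the objects from the code above:
# share_multiple_of(tidy_test$WAGP, 10)
# share_multiple_of(ipums_test$INCWAGE, 10)
```

A share near 1 for `WAGP` alongside a much lower share for `INCWAGE` would be consistent with the difference described, though it would not by itself show where the difference is introduced.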