Closed: AlexLi-design closed this issue 3 years ago
Hi -- I'm not totally sure why this is failing, but my guess is that the API is having trouble pulling the 15M+ records for all states. If you need PUMS data for all states, you might consider looping through the states one at a time and building in some error handling with tryCatch() or purrr::possibly().
Here is an example with your variables from just one (small!) state:
library(tidycensus)
var_housing <- c(
  "AGEP", "SEX", "RAC1P", "HISP", "HINCP", "SCHL",
  "DDRS", "DEAR", "DEYE", "DOUT", "DPHY", "MAR",
  "NPF", "FES", "MV", "WORKSTAT", "HHT", "OCPIP", "GRPIP",
  "DIVISION", "BLD", "TEN", "VEH", "YBL"
)
pums_data <- get_pums(
  variables = var_housing,
  state = "VT",
  year = 2018,
  survey = "acs5"
)
#> Getting data from the 2014-2018 5-year ACS Public Use Microdata Sample
pums_data
#> # A tibble: 31,883 x 29
#> SERIALNO SPORDER WGTP PWGTP AGEP HINCP NPF OCPIP GRPIP DIVISION ST
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 2014000~ 1 5 6 55 39000 1 26 0 1 50
#> 2 2014000~ 2 5 7 56 39000 1 26 0 1 50
#> 3 2014000~ 1 11 12 57 100000 2 7 0 1 50
#> 4 2014000~ 2 11 13 61 100000 2 7 0 1 50
#> 5 2014000~ 1 8 8 71 23200 1 25 0 1 50
#> 6 2014000~ 1 12 12 56 57600 2 11 0 1 50
#> 7 2014000~ 2 12 13 53 57600 2 11 0 1 50
#> 8 2014000~ 1 5 5 67 25410 2 18 0 1 50
#> 9 2014000~ 2 5 5 67 25410 2 18 0 1 50
#> 10 2014000~ 1 7 7 55 94700 2 0 10 1 50
#> # ... with 31,873 more rows, and 18 more variables: BLD <chr>, TEN <chr>,
#> # VEH <chr>, YBL <chr>, FES <chr>, HHT <chr>, MV <chr>, WORKSTAT <chr>,
#> # DDRS <chr>, DEAR <chr>, DEYE <chr>, DOUT <chr>, DPHY <chr>, MAR <chr>,
#> # SCHL <chr>, SEX <chr>, HISP <chr>, RAC1P <chr>
Created on 2020-09-08 by the reprex package (v0.3.0)
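For the all-states case, the state-by-state loop with error handling could look roughly like the sketch below. It uses base R's tryCatch() so that one failed request does not stop the whole run. The helper names fetch_by_state() and fetch_pums_state() are hypothetical (they are not part of tidycensus), and the sketch assumes var_housing is the vector defined in the example above and that a Census API key is already installed.

```r
# Hypothetical helper (not part of tidycensus): call `fetch_one` once per
# state and keep going when an individual request fails.
fetch_by_state <- function(states, fetch_one) {
  results <- lapply(states, function(st) {
    tryCatch(
      fetch_one(st),
      error = function(e) {
        # Report the failure and return NULL so the loop can continue
        message("Request failed for ", st, ": ", conditionMessage(e))
        NULL
      }
    )
  })
  names(results) <- states

  # A NULL result marks a state whose request failed
  ok <- !vapply(results, is.null, logical(1))
  list(
    data   = results[ok],   # named list of per-state results
    failed = states[!ok]    # states to retry later
  )
}

# Hypothetical wrapper around get_pums() for a single state.
# Assumes `var_housing` exists (see the example above).
fetch_pums_state <- function(st, variables = var_housing) {
  tidycensus::get_pums(
    variables = variables,
    state = st,
    year = 2018,
    survey = "acs5"
  )
}

# Usage (not run here: needs an API key and makes one request per state):
# res <- fetch_by_state(c(state.abb, "DC"), fetch_pums_state)
# all_states <- dplyr::bind_rows(res$data)
# res$failed  # states that errored and may need another attempt
```

Keeping the failed states in a separate vector makes it easy to rerun only those, rather than starting the whole pull over.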
Alternatively, you could download the entire PUMS file from the Census FTP:
https://www2.census.gov/programs-surveys/acs/data/pums/2018/5-Year/
It would be the files that have the us suffix, like csv_hus.zip.
Wow, it works Matt! You are right. If I pull single states out, that should work. Thanks!
@mfherman's solution is the right one. My general advice to tidycensus users has always been to look to other sources if you need bulk Census data pulls, as large requests are going to inevitably put pressure on the API. I'd recommend IPUMS and its companion R package {ipumsr} for these sorts of bulk requests.
Thanks for the terrific package for getting the microdata! I am hoping someone can help me solve my problem. I tried to pull data from the PUMS using the development version of "tidycensus", but something goes wrong. The setup functions run fine, but when I run the code to fetch the data, I get this message:
Error: Your API call has errors. The API message returned is HTTP Status 500 - .
I do not know what is happening. I double-checked the variable names and my code, and everything seems correct. My code is below:
rm(list = ls())
install.packages("devtools")
remotes::install_github("walkerke/tidycensus", force = TRUE)
library(tidycensus)
library(tidyverse)
library(dplyr)
census_api_key("my key", install = TRUE, overwrite = TRUE)
readRenviron("~/.Renviron")
pums_vars_2018 <- pums_variables %>%
  filter(year == 2018, survey == "acs5")
View(pums_vars_2018)
var_housing <- c(
  "AGEP", "SEX", "RAC1P", "HISP", "HINCP", "SCHL",
  "DDRS", "DEAR", "DEYE", "DOUT", "DPHY", "MAR",
  "NPF", "FES", "MV", "WORKSTAT", "HHT", "OCPIP", "GRPIP",
  "DIVISION", "BLD", "TEN", "VEH", "YBL"
)
ACS <- get_pums(
  variables = c(
    "AGEP", "SEX", "RAC1P", "HISP", "HINCP", "SCHL",
    "DDRS", "DEAR", "DEYE", "DOUT", "DPHY", "MAR",
    "NPF", "FES", "MV", "WORKSTAT", "HHT", "OCPIP", "GRPIP",
    "DIVISION", "BLD", "TEN", "VEH", "YBL"
  ),
  state = "all",
  year = 2018,
  survey = "acs5",
  show_call = TRUE
)
Thanks!