walkerke / tidycensus

Load US Census boundary and attribute data as 'tidyverse' and 'sf'-ready data frames in R
https://walker-data.com/tidycensus

Is it possible to modify the rate_sleep() parameters for get_acs()? #583

Closed djn34 closed 5 days ago

djn34 commented 1 week ago

Hello,

I wrote a function to download 18 ACS variables for a specified combination of geography level, year, and state.

I tested the code for all census tracts in a single state, and it worked seamlessly. However, I run into an issue with "rate_sleep()" when I attempt to run the code for the entire US.

Is it possible to change this parameter to allow get_acs() to run for a longer time while handling this lengthy data request, or will I have to rewrite the code to instead make multiple get_acs() requests?

Here is the function:

library(tidycensus)  # get_acs()
library(tidyverse)   # %>%, mutate(), select(), spread(), relocate()

demo_acs <- function(year, state, geo_type){

  demo_data <- get_acs(state = state,
                       year = year,
                       geography = geo_type,
                       variables = c(population             = "DP05_0001",
                                     pct_below_poverty      = "S1701_C03_001",
                                     median_income          = "DP03_0062",
                                     pct_high_school        = "DP02_0067P",
                                     pct_some_college       = "DP02_0063",
                                     pct_college            = "DP02_0068P",
                                     pct_65_pl              = "DP05_0024P",
                                     pct_female             = "DP05_0003P",
                                     pct_female_65_pl       = "DP05_0031P",
                                     pct_one                = "DP05_0034P",
                                     pct_non_hispanic_white = "DP05_0079P",
                                     pct_non_hispanic_black = "DP05_0080P",
                                     pct_hispanic           = "DP05_0073P",
                                     pct_asian              = "DP05_0082P",
                                     pct_hawaiian_pi        = "DP05_0083P",
                                     pct_native             = "DP05_0081P",
                                     pct_other              = "DP05_0084P",
                                     pct_two_pl             = "DP05_0035P")) %>%
    mutate(year = year) %>%
    rename_all(tolower) %>%
    select(year, geoid, name, variable, estimate) %>%
    spread(variable, estimate) %>%
    relocate(year, geoid, name, population, pct_below_poverty, median_income,
             pct_high_school, pct_some_college, pct_college, pct_65_pl, pct_female, pct_female_65_pl,
             pct_one, pct_non_hispanic_white, pct_non_hispanic_black, pct_hispanic, pct_asian,
             pct_hawaiian_pi, pct_native, pct_other, pct_two_pl)

}

It works perfectly when I run the function for a single state, like this:

# Download PA Census Tract demographic data
pa_demo_tracts <- demo_acs(year = 2022, state = "PA", geo_type = "tract")

But I receive the following error when I try to run it for all US census tracts:

Error in `map()`:
ℹ In index: 1.
Caused by error in `rate_sleep()`:
! Request failed after 3 attempts.
Run `rlang::last_trace()` to see where the error occurred.

Thank you for any help you can provide!

walkerke commented 1 week ago

It took a few minutes, but I was able to run the following successfully:

> x <- demo_acs(2022, c(state.abb, "DC"), "tract")
Getting data from the 2018-2022 5-year ACS
Fetching data by table type ("B/C", "S", "DP") and combining the result.
> x
# A tibble: 84,415 × 21
    year geoid   name  population pct_below_poverty median_income
   <dbl> <chr>   <chr>      <dbl>             <dbl>         <dbl>
 1  2022 010010… Cens…       1865              15.3         60563
 2  2022 010010… Cens…       1861               6.3         57460
 3  2022 010010… Cens…       3492              10.1         77371
 4  2022 010010… Cens…       3987              10.2         73191
 5  2022 010010… Cens…       4121               7.8         79953
 6  2022 010010… Cens…       3256               5.6         68575
 7  2022 010010… Cens…       3513               8.9         86959
 8  2022 010010… Cens…       3839              15.1         64904
 9  2022 010010… Cens…       3369              21.6         58155
10  2022 010010… Cens…       3166               6           87237
# ℹ 84,405 more rows
# ℹ 15 more variables: pct_high_school <dbl>,
#   pct_some_college <dbl>, pct_college <dbl>, pct_65_pl <dbl>,
#   pct_female <dbl>, pct_female_65_pl <dbl>, pct_one <dbl>,
#   pct_non_hispanic_white <dbl>, pct_non_hispanic_black <dbl>,
#   pct_hispanic <dbl>, pct_asian <dbl>, pct_hawaiian_pi <dbl>,
#   pct_native <dbl>, pct_other <dbl>, pct_two_pl <dbl>
# ℹ Use `print(n = ...)` to see more rows

We don't call rate_sleep() directly, so I think that's coming through our call to purrr::insistently(). I wonder if you're getting a slowdown or hiccup on the API side.
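For readers wondering how an error attributed to rate_sleep() can surface from a retry wrapper, here is a minimal sketch using only purrr. It does not reproduce tidycensus internals: flaky_request() is a hypothetical stand-in for an API call that always fails, and the rate settings are chosen only to keep the demo fast. For reference, the default max_times of purrr::rate_backoff() is 3, which is consistent with the "3 attempts" in the error above.

```r
library(purrr)

# Hypothetical stand-in for an API request that always fails
flaky_request <- function() stop("simulated API hiccup")

# Allow 3 attempts with a short fixed pause between them
# (the values are assumptions picked for a quick demo)
rate <- rate_delay(pause = 0.1, max_times = 3)
insistent_request <- insistently(flaky_request, rate = rate, quiet = TRUE)

# Once the attempts are used up, purrr raises the error from rate_sleep()
result <- tryCatch(insistent_request(), error = function(e) conditionMessage(e))
print(result)
```

The message reports the attempt count rather than the underlying failure, which is why the original error does not say what went wrong on the API side.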

djn34 commented 4 days ago

Thank you Kyle! I appreciate that you were able to test out my code.

I think it probably is coming from the API. I can rewrite the code to go through one state at a time using the map_df() function from purrr, but that makes the code a little more complicated.
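A rough sketch of the one-state-at-a-time approach, for anyone who lands here with the same error. fetch_by_state() and stub_fetch() are hypothetical names introduced for illustration; the stub stands in for demo_acs() so the example runs without a Census API key. It uses map() with list_rbind() (purrr 1.0.0 or later) rather than map_df(), which purrr now marks as superseded.

```r
library(purrr)

# Hypothetical helper: call `fetch_one` once per state and row-bind the results.
# States whose request errors are skipped and recorded in the "failed" attribute.
fetch_by_state <- function(states, fetch_one) {
  safe_fetch <- possibly(fetch_one, otherwise = NULL)
  results <- map(set_names(states), safe_fetch)
  failed <- names(keep(results, is.null))
  out <- list_rbind(compact(results))
  attr(out, "failed") <- failed
  out
}

# Stub fetcher standing in for demo_acs(); "ZZ" simulates a failed request
stub_fetch <- function(state) {
  if (state == "ZZ") stop("simulated API failure")
  data.frame(state = state, value = 1)
}

res <- fetch_by_state(c("PA", "NJ", "ZZ"), stub_fetch)
print(res)
print(attr(res, "failed"))
```

With the function from the original post, the call might look like fetch_by_state(c(state.abb, "DC"), function(s) demo_acs(2022, s, "tract")). Any states listed in the "failed" attribute can then be retried on their own instead of rerunning the whole country.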

Also, thanks for sharing the c(state.abb, "DC") code above. I didn't know about that, and it will save me some future steps!
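For context on why "DC" is appended: state.abb is a built-in vector in base R's datasets package that holds the 50 state abbreviations only, so the District of Columbia has to be added by hand. A quick check (all_states is just an illustrative name):

```r
# state.abb ships with base R and covers the 50 states, not DC
all_states <- c(state.abb, "DC")

print(length(state.abb))
print(length(all_states))
print("DC" %in% state.abb)
```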