walkerke / tidycensus

Load US Census boundary and attribute data as 'tidyverse' and 'sf'-ready data frames in R
https://walker-data.com/tidycensus

get_acs() - Is there a limit of ZCTAs one can pass for the zcta argument given geography="zcta"? #550

Closed gah-bo closed 9 months ago

gah-bo commented 9 months ago

I am trying to pull data for all ZCTAs in the state of NY. Since geography="zcta" has no option to filter by state, I instead pass the list of ZCTAs themselves to the zcta argument. I build the vector of ZCTA IDs by downloading the Census ZCTA-to-county reference file and keeping all rows in which the county is in NY.

Code (note: I am well aware that a ZCTA can be linked to more than one county; I use unique() to address this):

library(dplyr)
library(openxlsx)
requireNamespace("tidycensus")

library(haven) # exports Stata

includedStates <- c("36") # NY FIPS as per https://www.mercercountypa.gov/dps/state_fips_code_listing.htm; kept as character so codes with a leading zero (e.g. "06") would also match
includedZCTAs <- subset(read.table("https://www2.census.gov/geo/docs/maps-data/data/rel2020/zcta520/tab20_zcta520_county20_natl.txt", 
                                   header=TRUE, 
                                   sep="|",
                                   colClasses=c(GEOID_COUNTY_20="character", GEOID_ZCTA5_20="character")),
                        substr(GEOID_COUNTY_20, 1, 2) %in% includedStates,
                        select=c(GEOID_COUNTY_20, NAMELSAD_COUNTY_20, GEOID_ZCTA5_20, NAMELSAD_ZCTA5_20)
                        )

TEST_CALL <- tidycensus::get_acs(geography="zcta",
                                 table="B19013",
                                 year=2021,
                                 zcta=unique(includedZCTAs$GEOID_ZCTA5_20)[1:925], # anything over 925 causes some error!
                                 output="wide",
                                 show_call=TRUE)

The issue is that I want zcta=unique(includedZCTAs$GEOID_ZCTA5_20) (i.e., without the [1:925]). However, if I change 925 to anything larger, I get

Error: Your API call has errors. The API message returned is There was an error while running your query. We've logged the error and we'll correct it ASAP. Sorry for the inconvenience..

...and if I set zcta=unique(includedZCTAs$GEOID_ZCTA5_20) then I get

Error in curl::curl_fetch_memory(url, handle = handle) : Send failure: Connection was reset

walkerke commented 9 months ago

Possibly! That would be a Census API issue (I believe) and not a tidycensus issue. Perhaps there is a max length of the URL that can be sent to the API.
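To put a rough number on the URL-length idea, the sketch below counts the characters in a comma-separated list of five-digit ZCTA IDs. It assumes the IDs are sent as one comma-separated list inside the request URL, which is an assumption about how the request is built. The function name `zcta_list_nchar` and the placeholder IDs are made up for illustration. This only shows how the request grows with the number of ZCTAs; it does not confirm where, or whether, the API enforces a limit.

```r
# Character count of n five-digit IDs joined by commas.
# Assumption: the IDs travel as one comma-separated list in the request URL.
zcta_list_nchar <- function(n) {
  ids <- sprintf("%05d", seq_len(n))  # placeholder IDs, not real ZCTAs
  nchar(paste(ids, collapse = ","))
}

# Size of the ID list alone at the reported threshold of 925 ZCTAs
zcta_list_nchar(925)
```

Each extra ZCTA adds six characters (five digits plus a comma), on top of the rest of the URL.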

It sounds like your best bet is to chunk your ZCTAs on the R side and re-assemble them. You might also consider the filter_by argument which allows you to retrieve data within a given overlay (e.g. a state shape).
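A minimal sketch of the chunking approach, for illustration only: split the ZCTA vector into pieces, call get_acs() once per piece, and row-bind the results. The helpers `chunk_vector()` and `get_acs_chunked()` are hypothetical and not part of tidycensus. The default chunk size of 500 is an arbitrary choice that stays under the 925 reported above, not a documented limit.

```r
# Split a vector into consecutive chunks of at most `size` elements.
chunk_vector <- function(x, size) {
  stopifnot(size >= 1)
  if (length(x) == 0) return(list())
  unname(split(x, ceiling(seq_along(x) / size)))
}

# Hypothetical helper (not part of tidycensus): request each chunk
# separately and row-bind the pieces into one data frame.
# Extra arguments in `...` are passed through to tidycensus::get_acs().
get_acs_chunked <- function(zctas, chunk_size = 500, ...) {
  chunks <- chunk_vector(unique(zctas), chunk_size)
  pieces <- lapply(chunks, function(ids) {
    tidycensus::get_acs(geography = "zcta", zcta = ids, ...)
  })
  do.call(rbind, pieces)
}

# Usage with the objects from the original post (requires a Census API key):
# ny_income <- get_acs_chunked(includedZCTAs$GEOID_ZCTA5_20,
#                              table = "B19013", year = 2021, output = "wide")
```

Because every chunk is requested with the same table, year, and output arguments, the pieces share the same columns and can be bound directly.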