walkerke / tidycensus

Load US Census boundary and attribute data as 'tidyverse' and 'sf'-ready data frames in R
https://walker-data.com/tidycensus

Error in get_decennial: API returning unknown variable error for variable returned from call to load_variables #575

Closed sebastianrowan closed 3 months ago

sebastianrowan commented 3 months ago

I am trying to get race and ethnicity data at the block level from the 2020 decennial census using table P5 (https://data.census.gov/table?q=P5&d=DEC%20Demographic%20and%20Housing%20Characteristics).

There are 17 variables included in this table, breaking down the population by race for Hispanic and non-Hispanic ethnicity. The image below shows the variables listed for this dataset from the following call:

x <- load_variables(2020, "dhc")
y <- x[substr(x$name, 1, 2) == "P5",]
View(y)

[Image: table of the P5 variables returned by the call above]
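A side note on the filter: `substr(x$name, 1, 2) == "P5"` keeps every name whose first two characters are `P5`, so it would also pick up any other table whose ID merely begins with those characters. A slightly stricter sketch anchors the match on the underscore that follows the table ID. The `vars` data frame below is a made-up stand-in for the output of `load_variables()` so that it runs offline, and the names `P5A_001N` and `P50_001N` are hypothetical, included only to show the difference.

```r
# Toy stand-in for load_variables(2020, "dhc").
# P5A_001N and P50_001N are hypothetical names used for illustration only.
vars <- data.frame(
  name = c("P5_001N", "P5_002N", "P5A_001N", "P50_001N", "P4_001N"),
  stringsAsFactors = FALSE
)

# Original approach: compares the first two characters only
loose <- vars$name[substr(vars$name, 1, 2) == "P5"]

# Stricter approach: the table ID must be followed directly by an underscore
strict <- vars$name[grepl("^P5_", vars$name)]

print(loose)   # includes the hypothetical P5A_ and P50_ names
print(strict)  # only names of the form P5_...
```

If only table P5 is wanted, the anchored pattern avoids silently pulling in variables from similarly named tables.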

I used this list of variables as an input to my call to get_decennial as follows:

race_eth_vars <- y$name
nh_blocks <- get_decennial(
  geography = "block",
  variables = race_eth_vars,
  state = 33,
  year = 2020,
  output = 'wide',
  geometry = TRUE
)

Running this code returns:

Getting data from the 2020 decennial Census
Using the PL 94-171 Redistricting Data Summary File
Error in `get_decennial()`:
! Error : Your API call has errors.  The API message returned is error: error: unknown variable 'P5_011N'.

The error persists when using "tract", "county", or "state" for geography.

I will investigate this error further as soon as I have more time, but posting this now in case the answer is obvious and I am just overlooking something simple!

walkerke commented 3 months ago

You need the argument sumfile = "dhc" to get the data you want. The default sumfile for 2020 is the PL file, which came out first; I didn't want to change that as I didn't want to break users' existing code.

Of course, it's a huge hassle that the same variable names mean different things across different decennial datasets...

library(tidycensus)
options(tigris_use_cache = TRUE)

x <- load_variables(2020, "dhc")
y <- x[substr(x$name, 1, 2) == "P5",]

race_eth_vars <- y$name

get_decennial(
  geography = "county",
  variables = race_eth_vars,
  state = 33,
  year = 2020,
  sumfile = "dhc",
  output = 'wide',
  geometry = TRUE
)
#> Getting data from the 2020 decennial Census
#> Using the Demographic and Housing Characteristics File
#> Note: 2020 decennial Census data use differential privacy, a technique that
#> introduces errors into data to preserve respondent confidentiality.
#> ℹ Small counts should be interpreted with caution.
#> ℹ See https://www.census.gov/library/fact-sheets/2021/protecting-the-confidentiality-of-the-2020-census-redistricting-data.html for additional guidance.
#> This message is displayed once per session.
#> Simple feature collection with 10 features and 19 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -72.55725 ymin: 42.69699 xmax: -70.61062 ymax: 45.30548
#> Geodetic CRS:  NAD83
#> # A tibble: 10 × 20
#>    GEOID NAME    P5_001N P5_002N P5_003N P5_004N P5_005N P5_006N P5_007N P5_008N
#>    <chr> <chr>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#>  1 33017 Straff…  130889  126975  114599    1434     251    4473      42     403
#>  2 33015 Rockin…  314176  303919  283099    2076     342    6247      79    1236
#>  3 33007 Coos C…   31268   30596   28629     494      84     193       3      71
#>  4 33009 Grafto…   91118   88539   80370     783     180    2855      24     368
#>  5 33001 Belkna…   63705   62444   58714     324     122     582      11     188
#>  6 33003 Carrol…   50107   49338   46947     131     100     358       6     206
#>  7 33011 Hillsb…  422937  389439  342652   10044     630   16413     112    2383
#>  8 33005 Cheshi…   76458   74656   69264     648     146    1067      46     305
#>  9 33013 Merrim…  153808  149928  137252    2536     310    3013      59     575
#> 10 33019 Sulliv…   43063   42241   39123     185     134     403       6     181
#> # ℹ 10 more variables: P5_009N <dbl>, P5_010N <dbl>, P5_011N <dbl>,
#> #   P5_012N <dbl>, P5_013N <dbl>, P5_014N <dbl>, P5_015N <dbl>, P5_016N <dbl>,
#> #   P5_017N <dbl>, geometry <MULTIPOLYGON [°]>

Created on 2024-06-26 with reprex v2.1.0
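One way to catch this kind of mismatch before calling the API is to compare the requested variables against the variable list for the summary file that will actually be queried. The sketch below uses a hypothetical helper (`check_vars()` is not part of tidycensus) and small hard-coded vectors in place of the real variable lists. It assumes the PL file's P5 table stops at `P5_010N`, which is consistent with the error on `P5_011N` above but should be confirmed with `load_variables(2020, "pl")`.

```r
# Hypothetical helper, not part of tidycensus: returns the requested
# variables that are absent from the vector of available names.
check_vars <- function(requested, available) {
  setdiff(requested, available)
}

# Illustrative stand-ins (assumptions, not copied from the API); in practice
# these could come from load_variables(2020, "pl")$name and
# load_variables(2020, "dhc")$name.
pl_names  <- sprintf("P5_%03dN", 1:10)
dhc_names <- sprintf("P5_%03dN", 1:17)

# The 17 P5 variables requested in the original call
requested <- dhc_names

missing_in_pl  <- check_vars(requested, pl_names)
missing_in_dhc <- check_vars(requested, dhc_names)

print(missing_in_pl)   # names the PL file would not recognize
print(missing_in_dhc)  # character(0): every name is present
```

A non-empty result for the default summary file is a hint that `sumfile` needs to be set explicitly in `get_decennial()`.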