Closed by sebastianrowan 3 months ago
You need the argument sumfile = "dhc" to get the data you want. The default sumfile for 2020 is the PL file, which came out first; I didn't want to change that default because doing so would break users' existing code.
Of course, it's a huge hassle that the same variable names mean different things across different decennial datasets...
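To see the naming clash concretely, one option is to compare what load_variables() returns for table P5 in each summary file. This is only a sketch: table_vars() and compare_p5() are hypothetical helpers, not part of tidycensus, and the code assumes the returned data frames have name and concept columns.

```r
# Hypothetical helper (not part of tidycensus): keep the rows of a
# load_variables() result that belong to one table, e.g. "P5".
# Matching on "P5_" avoids also picking up a name like "P50_001N".
table_vars <- function(vars, table) {
  vars[startsWith(vars$name, paste0(table, "_")), , drop = FALSE]
}

# Hypothetical wrapper: the lookups download variable metadata from the
# Census API, so they only run when this function is called.
compare_p5 <- function() {
  pl  <- table_vars(tidycensus::load_variables(2020, "pl"), "P5")
  dhc <- table_vars(tidycensus::load_variables(2020, "dhc"), "P5")

  # Same table ID, but possibly a different subject and variable count
  list(
    pl_concept  = unique(pl$concept),
    dhc_concept = unique(dhc$concept),
    pl_n        = nrow(pl),
    dhc_n       = nrow(dhc)
  )
}
```

Calling compare_p5() should show that the two files use the ID P5 for different subjects. As far as I know, in the 2020 PL file P5 is the group quarters table and has fewer cells than the 17 in the DHC table, so requesting P5_011N through P5_017N without sumfile = "dhc" asks the default file for variables it does not contain.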
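library(tidycensus)
options(tigris_use_cache = TRUE)

# Look up every variable in the 2020 DHC file, then keep table P5
x <- load_variables(2020, "dhc")
y <- x[substr(x$name, 1, 2) == "P5", ]
race_eth_vars <- y$name

# Request those variables from the DHC file rather than the default PL file
get_decennial(
  geography = "county",
  variables = race_eth_vars,
  state = 33,  # FIPS code for New Hampshire
  year = 2020,
  sumfile = "dhc",
  output = "wide",
  geometry = TRUE
)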
#> Getting data from the 2020 decennial Census
#> Using the Demographic and Housing Characteristics File
#> Note: 2020 decennial Census data use differential privacy, a technique that
#> introduces errors into data to preserve respondent confidentiality.
#> ℹ Small counts should be interpreted with caution.
#> ℹ See https://www.census.gov/library/fact-sheets/2021/protecting-the-confidentiality-of-the-2020-census-redistricting-data.html for additional guidance.
#> This message is displayed once per session.
#> Simple feature collection with 10 features and 19 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: -72.55725 ymin: 42.69699 xmax: -70.61062 ymax: 45.30548
#> Geodetic CRS: NAD83
#> # A tibble: 10 × 20
#> GEOID NAME P5_001N P5_002N P5_003N P5_004N P5_005N P5_006N P5_007N P5_008N
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 33017 Straff… 130889 126975 114599 1434 251 4473 42 403
#> 2 33015 Rockin… 314176 303919 283099 2076 342 6247 79 1236
#> 3 33007 Coos C… 31268 30596 28629 494 84 193 3 71
#> 4 33009 Grafto… 91118 88539 80370 783 180 2855 24 368
#> 5 33001 Belkna… 63705 62444 58714 324 122 582 11 188
#> 6 33003 Carrol… 50107 49338 46947 131 100 358 6 206
#> 7 33011 Hillsb… 422937 389439 342652 10044 630 16413 112 2383
#> 8 33005 Cheshi… 76458 74656 69264 648 146 1067 46 305
#> 9 33013 Merrim… 153808 149928 137252 2536 310 3013 59 575
#> 10 33019 Sulliv… 43063 42241 39123 185 134 403 6 181
#> # ℹ 10 more variables: P5_009N <dbl>, P5_010N <dbl>, P5_011N <dbl>,
#> # P5_012N <dbl>, P5_013N <dbl>, P5_014N <dbl>, P5_015N <dbl>, P5_016N <dbl>,
#> # P5_017N <dbl>, geometry <MULTIPOLYGON [°]>
Created on 2024-06-26 with reprex v2.1.0
I am trying to get race and ethnicity data at the block level from the 2020 decennial census using table P5 (https://data.census.gov/table?q=P5&d=DEC%20Demographic%20and%20Housing%20Characteristics).
There are 17 variables included in this table, breaking down the population by race for Hispanic and non-Hispanic ethnicity. The image below shows the variables listed for this dataset from the following call:
I used this list of variables as an input to my call to get_decennial as follows:
Running this code returns:
The error persists when using "tract", "county", or "state" for geography.
I will investigate this error further as soon as I have more time, but posting this now in case the answer is obvious and I am just overlooking something simple!