walkerke / tigris

Download and use Census TIGER/Line shapefiles in R

tigris::blocks does not retrieve POP20 and HU20 consistently #159

Closed monkeywithacupcake closed 1 year ago

monkeywithacupcake commented 1 year ago

In the book, section 7.3.2 says that blocks() from tigris returns POP20 and HU20 columns, which make great weights:

2020 Census blocks acquired with the tigris package have the added benefit of POP20 and HU20 columns in the dataset that represent population and housing unit counts, respectively, either one of which could be used to weight each block.

When I test this for Maricopa County, AZ, I get the columns as described. However, for a local county (King County, WA), I get neither POP20 nor HU20. The "failure," as it were, is silent.

I don't know that it is a failure, but these two calls both run without any message or warning. One result has POP20 and HU20 and the other does not:

library(tigris)

# King County includes Seattle
king_blocks <- blocks(
  state = "WA",
  county = "King",
  year = 2020
)
# no POP20 and no HU20

maricopa_blocks <- blocks(
  state = "AZ",
  county = "Maricopa",
  year = 2020
)
# has both
monkeywithacupcake commented 1 year ago

A workaround is to fetch the block populations separately with get_decennial() and then join them onto the block geometries, like so:

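library(tidycensus)
library(dplyr)

# Total population (P1_001N) for every 2020 block in King County, no geometry
king_block_pop <- get_decennial(
  geography = "block",
  variables = "P1_001N",
  year = 2020,
  state = "WA",
  county = "King",
  geometry = FALSE
)

# Rename to match the shapefile's columns, then join on the block ID
king_blocks <- king_blocks %>%
  left_join(
    select(king_block_pop, GEOID20 = GEOID, POP20 = value),
    by = "GEOID20"
  )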
walkerke commented 1 year ago

Are you using shapefile caching? Census has released a few new versions of these shapefiles. The original 2020 shapefiles didn't have population and housing unit columns; a later version added POP20 and HU20; the newest version renamed HU20 to HOUSING20 after I wrote the book.
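Since the housing-unit column has gone by two names across shapefile versions, a quick check of the column names can tell you which version you ended up with. This is only a minimal sketch: `block_weight_columns()` is a hypothetical helper, not part of tigris, and it assumes the only names in play are the ones mentioned above (POP20, HU20, HOUSING20).

```r
# Hypothetical helper (not a tigris function): report which population and
# housing-unit columns a blocks() result carries, or NA if a column is absent.
block_weight_columns <- function(x) {
  cols <- names(x)

  population <- if ("POP20" %in% cols) "POP20" else NA_character_

  # Newer shapefiles use HOUSING20; an earlier version used HU20.
  housing <- if ("HOUSING20" %in% cols) {
    "HOUSING20"
  } else if ("HU20" %in% cols) {
    "HU20"
  } else {
    NA_character_
  }

  c(population = population, housing = housing)
}
```

Calling `block_weight_columns(king_blocks)` and getting two NA values back would suggest the object came from the original shapefile version that had neither column.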

When I run it without shapefile caching, I see:

library(tigris)
#> To enable caching of data, set `options(tigris_use_cache = TRUE)`
#> in your R script or .Rprofile.

king_blocks <- blocks(  
  state = "WA",
  county = "King",
  year = 2020,
  progress_bar = FALSE
)

king_blocks
#> Simple feature collection with 27686 features and 17 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -122.5417 ymin: 47.08435 xmax: -121.0659 ymax: 47.78058
#> Geodetic CRS:  NAD83
#> First 10 features:
#>    STATEFP20 COUNTYFP20 TRACTCE20 BLOCKCE20         GEOID20     NAME20 MTFCC20
#> 5         53        033    008800      3006 530330088003006 Block 3006   G5040
#> 6         53        033    008800      2002 530330088002002 Block 2002   G5040
#> 7         53        033    007800      3025 530330078003025 Block 3025   G5040
#> 11        53        033    001800      2008 530330018002008 Block 2008   G5040
#> 12        53        033    011401      2006 530330114012006 Block 2006   G5040
#> 13        53        033    006702      1010 530330067021010 Block 1010   G5040
#> 14        53        033    004800      3016 530330048003016 Block 3016   G5040
#> 15        53        033    006000      4010 530330060004010 Block 4010   G5040
#> 16        53        033    022604      2014 530330226042014 Block 2014   G5040
#> 25        53        033    027300      1001 530330273001001 Block 1001   G5040
#>    UR20 UACE20 UATYPE20 FUNCSTAT20 ALAND20 AWATER20  INTPTLAT20   INTPTLON20
#> 5     U  80389        U          S   14537        0 +47.6034838 -122.2994388
#> 6     U  80389        U          S   11226        0 +47.6084963 -122.2955872
#> 7     U  80389        U          S   11763        0 +47.6110379 -122.2920486
#> 11    U  80389        U          S   15919        0 +47.6916534 -122.3459026
#> 12    U  80389        U          S   20362        0 +47.5255704 -122.3705838
#> 13    U  80389        U          S    9575        0 +47.6267255 -122.3495140
#> 14    U  80389        U          S   11410        0 +47.6559547 -122.3599361
#> 15    U  80389        U          S    3069        0 +47.6505560 -122.3602843
#> 16    U  80389        U          S   41197        0 +47.6708895 -122.1788156
#> 25    U  80389        U          S   10381        0 +47.4847634 -122.2885307
#>    HOUSING20 POP20                       geometry
#> 5         29    62 MULTIPOLYGON (((-122.3 47.6...
#> 6         12    25 MULTIPOLYGON (((-122.2962 4...
#> 7         21    50 MULTIPOLYGON (((-122.2926 4...
#> 11        26    68 MULTIPOLYGON (((-122.3472 4...
#> 12        21    67 MULTIPOLYGON (((-122.3713 4...
#> 13       148   142 MULTIPOLYGON (((-122.3502 4...
#> 14        22    54 MULTIPOLYGON (((-122.3608 4...
#> 15         0     0 MULTIPOLYGON (((-122.3609 4...
#> 16        32    99 MULTIPOLYGON (((-122.1802 4...
#> 25         5    24 MULTIPOLYGON (((-122.2892 4...

Created on 2023-02-10 by the reprex package (v2.0.1)

If you are using shapefile caching, you can add the optional argument refresh = TRUE to trigger a re-download of the shapefile to your cache. This should have the updated columns.
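For reference, a refreshed request might look like the sketch below. The wrapper name `fetch_fresh_blocks()` is hypothetical and only there for illustration; the `refresh = TRUE` argument and the `tigris_use_cache` option are the ones described above. Which columns come back depends on the shapefile version Census is currently serving, so the sketch inspects them rather than assuming them.

```r
# Hypothetical wrapper: re-download 2020 block shapefiles into the cache.
# Requires the tigris package and a network connection when called.
fetch_fresh_blocks <- function(state, county) {
  # Keep caching enabled so later calls reuse the refreshed file.
  options(tigris_use_cache = TRUE)

  tigris::blocks(
    state = state,
    county = county,
    year = 2020,
    refresh = TRUE  # force a new download even if a cached copy exists
  )
}

# Example usage (commented out because it downloads data):
# king_blocks <- fetch_fresh_blocks("WA", "King")
# intersect(c("POP20", "HU20", "HOUSING20"), names(king_blocks))
```

After one refreshed download, later calls can drop `refresh = TRUE` and read from the updated cache.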