DOI-USGS / dataRetrieval

This R package is designed to obtain USGS or EPA water quality sample data, streamflow data, and metadata directly from web services.
https://doi-usgs.github.io/dataRetrieval/

`countyCd` does not reflect changes in Census handling of CT boundaries #711

Open steeleb opened 3 months ago

steeleb commented 3 months ago

Hi {dataRetrieval} team! I came across a bug (but maybe an enhancement?) today for the state of CT due to some US Census changes in 2022.

Describe the bug

In June 2022, county FIPS codes were retired for the state of CT. The US Census Bureau has created new FIPS codes for CT that reflect 'planning regions'. Currently, the new FIPS codes are not represented in the dataRetrieval::countyCd list, which creates issues when using these new planning regions in {dataRetrieval}. Neither the planning region names nor the new FIPS codes assigned to CT work in {dataRetrieval}.
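For anyone who wants to check the mismatch without hitting the web services, here is a minimal sketch in base R. It assumes the legacy CT county codes and the planning-region codes as published by the Census Bureau, and it uses a small stand-in data frame (`lookup_ct`, a hypothetical name) in place of the CT rows of `countyCd`, so it only illustrates the comparison and does not reflect the package's actual contents.

```r
# Legacy CT county FIPS codes (retired in 2022)
legacy_ct <- c("001", "003", "005", "007", "009", "011", "013", "015")

# County-equivalent FIPS codes for the CT planning regions
planning_ct <- c("110", "120", "130", "140", "150", "160", "170", "180", "190")

# Hypothetical stand-in for the CT rows of a lookup table such as countyCd;
# it is assumed here to hold only the legacy codes
lookup_ct <- data.frame(
  STATE = "09",
  COUNTY = legacy_ct,
  stringsAsFactors = FALSE
)

# Codes that exist in the newer vintage but are absent from the lookup
missing_codes <- setdiff(planning_ct, lookup_ct$COUNTY)
missing_codes
```

With the real table, replacing `lookup_ct` with the CT rows of `countyCd` should show whether any planning-region codes are present.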

To Reproduce

Using State and County names for the pull.

library(dataRetrieval)
library(tigris)
library(tidyverse)

# list county names in CT according to {dataRetrieval}
ex1 <- countyCd %>% filter(STATE == '09') %>% pull(COUNTY_NAME)

# list county names in CT according to {tigris} 2022
ex2 <- counties(year = 2022) %>% filter(STATEFP == '09') %>% pull(NAMELSAD)

# list county names in CT according to {tigris} 2020
ex3 <- counties(year = 2020) %>% filter(STATEFP == '09') %>% pull(NAMELSAD)

# get site list for example 1
map(.x = ex1, 
    .f = ~ whatWQPsites(statecode = "CT", countycode = .x)) %>% 
  bind_rows()

# same for example 2
map(.x = ex2,
    .f = ~ whatWQPsites(statecode = "CT", countycode = .x)) %>% 
  bind_rows()

# same for example 3
map(.x = ex3,
    .f = ~ whatWQPsites(statecode = "CT", countycode = .x)) %>% 
  bind_rows()

Here, ex2 fails in whatWQPsites() because the new "County" names (in this case, Planning Regions) are not listed in the countyCd table:

[Screenshot 2024-06-24 at 1:36:15 PM]

If I provide the FIPS numbers, this fails for the new CT names as well:

# list county FIPS in CT according to {dataRetrieval}
ex4 <- countyCd %>% filter(STATE == '09') %>% pull(COUNTY)

# list county FIPS in CT according to {tigris} 2022
ex5 <- counties(year = 2022) %>% filter(STATEFP == '09') %>% pull(COUNTYFP)

# list county FIPS in CT according to {tigris} 2020
ex6 <- counties(year = 2020) %>% filter(STATEFP == '09') %>% pull(COUNTYFP)

# get site list for example 4 using the county FIPS
map(.x = ex4, 
    .f = ~ whatWQPsites(statecode = "09", countycode = .x)) %>% 
  bind_rows()

# same for example 5
map(.x = ex5,
    .f = ~ whatWQPsites(statecode = "09", countycode = .x)) %>% 
  bind_rows()

# same for example 6
map(.x = ex6,
    .f = ~ whatWQPsites(statecode = "09", countycode = .x)) %>% 
  bind_rows()

Again, example 5 fails, but the error is not clear, since the input I provided is in the format the error message describes (I'm guessing it again relies on the countyCd table and, since the county FIPS codes don't match, it assigns the state as NA).

[Screenshot 2024-06-24 at 1:35:37 PM]
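In case it helps with triage, one way to see exactly which codes fail (rather than having the whole map() call abort) is to wrap each query in tryCatch(). The sketch below is only an illustration: `query_sites()` and `known_codes` are hypothetical stand-ins I made up so the example runs offline, not dataRetrieval functions or data.

```r
# Hypothetical stand-in for a lookup-backed query; NOT a dataRetrieval function
known_codes <- c("001", "003", "005")
query_sites <- function(countycode) {
  if (!countycode %in% known_codes) {
    stop("unknown county code: ", countycode)
  }
  data.frame(countycode = countycode, stringsAsFactors = FALSE)
}

codes <- c("001", "110", "003")

# Run each query, capturing the error message instead of aborting the loop
results <- lapply(codes, function(code) {
  tryCatch(
    list(code = code, ok = TRUE, value = query_sites(code), error = NA_character_),
    error = function(e) {
      list(code = code, ok = FALSE, value = NULL, error = conditionMessage(e))
    }
  )
})

# Codes whose query raised an error
failed <- vapply(results, function(r) !r$ok, logical(1))
codes[failed]
```

Swapping the stand-in for the real whatWQPsites() call should give a per-code list of failures along with their messages.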

Thanks all, please let me know if you need further information!

Session Info

# session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.0 (2024-04-24)
#>  os       macOS Sonoma 14.5
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Denver
#>  date     2024-06-24
#>  pandoc   3.1.11 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/aarch64/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cachem        1.0.8   2023-05-01 [1] CRAN (R 4.4.0)
#>  cli           3.6.2   2023-12-11 [1] CRAN (R 4.4.0)
#>  devtools      2.4.5   2022-10-11 [1] CRAN (R 4.4.0)
#>  digest        0.6.35  2024-03-11 [1] CRAN (R 4.4.0)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.4.0)
#>  evaluate      0.23    2023-11-01 [1] CRAN (R 4.4.0)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.4.0)
#>  fs            1.6.4   2024-04-25 [1] CRAN (R 4.4.0)
#>  glue          1.7.0   2024-01-09 [1] CRAN (R 4.4.0)
#>  htmltools     0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
#>  htmlwidgets   1.6.4   2023-12-06 [1] CRAN (R 4.4.0)
#>  httpuv        1.6.15  2024-03-26 [1] CRAN (R 4.4.0)
#>  knitr         1.46    2024-04-06 [1] CRAN (R 4.4.0)
#>  later         1.3.2   2023-12-06 [1] CRAN (R 4.4.0)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.4.0)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.4.0)
#>  memoise       2.0.1   2021-11-26 [1] CRAN (R 4.4.0)
#>  mime          0.12    2021-09-28 [1] CRAN (R 4.4.0)
#>  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.4.0)
#>  pkgbuild      1.4.4   2024-03-17 [1] CRAN (R 4.4.0)
#>  pkgload       1.3.4   2024-01-16 [1] CRAN (R 4.4.0)
#>  profvis       0.3.8   2023-05-02 [1] CRAN (R 4.4.0)
#>  promises      1.3.0   2024-04-05 [1] CRAN (R 4.4.0)
#>  purrr         1.0.2   2023-08-10 [1] CRAN (R 4.4.0)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.4.0)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.4.0)
#>  R.oo          1.26.0  2024-01-24 [1] CRAN (R 4.4.0)
#>  R.utils       2.12.3  2023-11-18 [1] CRAN (R 4.4.0)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.4.0)
#>  Rcpp          1.0.12  2024-01-09 [1] CRAN (R 4.4.0)
#>  remotes       2.5.0   2024-03-17 [1] CRAN (R 4.4.0)
#>  reprex        2.1.0   2024-01-11 [1] CRAN (R 4.4.0)
#>  rlang         1.1.3   2024-01-10 [1] CRAN (R 4.4.0)
#>  rmarkdown     2.26    2024-03-05 [1] CRAN (R 4.4.0)
#>  rstudioapi    0.16.0  2024-03-24 [1] CRAN (R 4.4.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.4.0)
#>  shiny         1.8.1.1 2024-04-02 [1] CRAN (R 4.4.0)
#>  stringi       1.8.4   2024-05-06 [1] CRAN (R 4.4.0)
#>  stringr       1.5.1   2023-11-14 [1] CRAN (R 4.4.0)
#>  styler        1.10.3  2024-04-07 [1] CRAN (R 4.4.0)
#>  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.4.0)
#>  usethis       2.2.3   2024-02-19 [1] CRAN (R 4.4.0)
#>  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.4.0)
#>  withr         3.0.0   2024-01-16 [1] CRAN (R 4.4.0)
#>  xfun          0.43    2024-03-25 [1] CRAN (R 4.4.0)
#>  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.4.0)
#>  yaml          2.3.8   2023-12-11 [1] CRAN (R 4.4.0)
#> 
#>  [1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Created on 2024-06-24 with reprex v2.1.0

lstanish-usgs commented 2 months ago

Hi @steeleb, and thank you for bringing this to our attention! We're investigating the best course of action for resolution and will update this issue when we have more information.