DOI-USGS / dataRetrieval

This R package is designed to obtain USGS or EPA water quality sample data, streamflow data, and metadata directly from web services.
https://doi-usgs.github.io/dataRetrieval/

readNWISdata returning more parameter codes than requested #679

Closed fbiles closed 6 months ago

fbiles commented 9 months ago

What is your question?

I am trying to retrieve all data matching specific parameter codes for all sites within a bBox:

cations <- readNWISdata(bBox=c(-140.0, 54.6, -126.7, 61.7), service="qw", parameterCd=cation.codes)

cation.codes contains 16 parameter codes as class character:

 [1] "00410" "00419" "00440" "00450" "00453" "00915" "00925" "00930" "00935" "00940" "29801" "39036" "39086" "90410"
[15] "99220" "99440"

The query runs and returns the expected results, EXCEPT, the output dataframe includes about 677 parameter code columns and is not restricted to the 16 parameter codes specified. This has been stumping me all day. Any hints to what I am doing wrong are greatly appreciated.

Expected behavior

A resulting dataframe containing all surface water sites within the bBox that have any data for only the 16 specified parameter codes.

Session Info

devtools::session_info()

─ Session info
 setting  value
 version  R version 4.3.1 (2023-06-16 ucrt)
 os       Windows 10 x64 (build 19045)
 system   x86_64, mingw32
 ui       RStudio
 language (EN)
 collate  English_United States.utf8
 ctype    English_United States.utf8
 tz       America/Anchorage
 date     2023-10-03
 rstudio  2023.06.1+524 Mountain Hydrangea (desktop)
 pandoc   NA

─ Packages
 package       * version  date (UTC) lib source
 bit             4.0.5    2022-11-15 [1] CRAN (R 4.3.1)
 bit64           4.0.5    2020-08-30 [1] CRAN (R 4.3.1)
 cachem          1.0.8    2023-05-01 [1] CRAN (R 4.3.1)
 callr           3.7.3    2022-11-02 [1] CRAN (R 4.3.1)
 cellranger      1.1.0    2016-07-27 [1] CRAN (R 4.3.1)
 cli             3.6.1    2023-03-23 [1] CRAN (R 4.3.1)
 crayon          1.5.2    2022-09-29 [1] CRAN (R 4.3.1)
 curl            5.0.1    2023-06-07 [1] CRAN (R 4.3.1)
 dataRetrieval   2.7.12   2023-08-10 [1] Github (DOI-USGS/dataRetrieval@b920321)
 devtools        2.4.5    2022-10-11 [1] CRAN (R 4.3.1)
 digest          0.6.33   2023-07-07 [1] CRAN (R 4.3.1)
 dplyr           1.1.2    2023-04-20 [1] CRAN (R 4.3.1)
 ellipsis        0.3.2    2021-04-29 [1] CRAN (R 4.3.1)
 fansi           1.0.4    2023-01-22 [1] CRAN (R 4.3.1)
 fastmap         1.1.1    2023-02-24 [1] CRAN (R 4.3.1)
 fs              1.6.3    2023-07-20 [1] CRAN (R 4.3.1)
 generics        0.1.3    2022-07-05 [1] CRAN (R 4.3.1)
 glue            1.6.2    2022-02-24 [1] CRAN (R 4.3.1)
 hms             1.1.3    2023-03-21 [1] CRAN (R 4.3.1)
 htmltools       0.5.5    2023-03-23 [1] CRAN (R 4.3.1)
 htmlwidgets     1.6.2    2023-03-17 [1] CRAN (R 4.3.1)
 httpuv          1.6.11   2023-05-11 [1] CRAN (R 4.3.1)
 httr            1.4.6    2023-05-08 [1] CRAN (R 4.3.1)
 later           1.3.1    2023-05-02 [1] CRAN (R 4.3.1)
 lattice         0.21-8   2023-04-05 [1] CRAN (R 4.3.1)
 lifecycle       1.0.3    2022-10-07 [1] CRAN (R 4.3.1)
 lubridate       1.9.2    2023-02-10 [1] CRAN (R 4.3.1)
 magrittr        2.0.3    2022-03-30 [1] CRAN (R 4.3.1)
 memoise         2.0.1    2021-11-26 [1] CRAN (R 4.3.1)
 mime            0.12     2021-09-28 [1] CRAN (R 4.3.0)
 miniUI          0.1.1.1  2018-05-18 [1] CRAN (R 4.3.1)
 pillar          1.9.0    2023-03-22 [1] CRAN (R 4.3.1)
 pkgbuild        1.4.2    2023-06-26 [1] CRAN (R 4.3.1)
 pkgconfig       2.0.3    2019-09-22 [1] CRAN (R 4.3.1)
 pkgload         1.3.2.1  2023-07-08 [1] CRAN (R 4.3.1)
 prettyunits     1.1.1    2020-01-24 [1] CRAN (R 4.3.1)
 processx        3.8.2    2023-06-30 [1] CRAN (R 4.3.1)
 profvis         0.3.8    2023-05-02 [1] CRAN (R 4.3.1)
 promises        1.2.0.1  2021-02-11 [1] CRAN (R 4.3.1)
 ps              1.7.5    2023-04-18 [1] CRAN (R 4.3.1)
 purrr           1.0.1    2023-01-10 [1] CRAN (R 4.3.1)
 R6              2.5.1    2021-08-19 [1] CRAN (R 4.3.1)
 Rcpp            1.0.11   2023-07-06 [1] CRAN (R 4.3.1)
 readr           2.1.4    2023-02-10 [1] CRAN (R 4.3.1)
 readxl          1.4.3    2023-07-06 [1] CRAN (R 4.3.1)
 remotes         2.4.2.1  2023-07-18 [1] CRAN (R 4.3.1)
 rlang           1.1.1    2023-04-28 [1] CRAN (R 4.3.1)
 rstudioapi      0.15.0   2023-07-07 [1] CRAN (R 4.3.1)
 sessioninfo     1.2.2    2021-12-06 [1] CRAN (R 4.3.1)
 shiny           1.7.4.1  2023-07-06 [1] CRAN (R 4.3.1)
 stringi         1.7.12   2023-01-11 [1] CRAN (R 4.3.0)
 stringr         1.5.0    2022-12-02 [1] CRAN (R 4.3.1)
 tibble          3.2.1    2023-03-20 [1] CRAN (R 4.3.1)
 tidyselect      1.2.0    2022-10-10 [1] CRAN (R 4.3.1)
 timechange      0.2.0    2023-01-11 [1] CRAN (R 4.3.1)
 tzdb            0.4.0    2023-05-12 [1] CRAN (R 4.3.1)
 urlchecker      1.0.1    2021-11-30 [1] CRAN (R 4.3.1)
 usethis         2.2.2    2023-07-06 [1] CRAN (R 4.3.1)
 utf8            1.2.3    2023-01-31 [1] CRAN (R 4.3.1)
 vctrs           0.6.3    2023-06-14 [1] CRAN (R 4.3.1)
 vroom           1.6.3    2023-04-28 [1] CRAN (R 4.3.1)
 withr           2.5.0    2022-03-03 [1] CRAN (R 4.3.1)
 writexl         1.4.2    2023-08-11 [1] Github (ropensci/writexl@b138c22)
 xml2            1.3.5    2023-07-06 [1] CRAN (R 4.3.1)
 xtable          1.8-4    2019-04-21 [1] CRAN (R 4.3.1)
 xts             0.13.1   2023-04-16 [1] CRAN (R 4.3.1)
 zoo           * 1.8-12   2023-04-13 [1] CRAN (R 4.3.1)

[1] C:/Program Files/R/R-4.3.1/library

ldecicco-USGS commented 9 months ago

2 things...

  1. the "readNWISdata" function is not designed for the "qw" service. It's subtle, but the "qw" service is not mentioned in the allowed services.
  2. the NWIS qw service is going to be shut down in the near future (I've been hearing Feb/March?). So you'll need to start getting used to the WQP service for qw data. Here's how to request that data:
cation.codes <- c("00410", "00419", "00440", "00450", "00453", "00915", "00925", "00930",
                  "00935", "00940", "29801", "39036", "39086", "90410", "99220", "99440")

cations_WQP <- readWQPdata(bBox=c(-140.0, 54.6, -126.7, 61.7),
                       pCode=cation.codes)

You can read more here: https://doi-usgs.github.io/dataRetrieval/articles/qwdata_changes.html
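Once the WQP request returns, it can be worth confirming that only the requested parameter codes came back. The following is a minimal base R sketch of that check, not part of dataRetrieval. It assumes the result is a long-format data frame with a `USGSPCode` column holding the 5-digit code for each row; the small data frame, its site identifiers, and its values are made up purely for illustration, and the code list is shortened.

```r
# Hypothetical stand-in for a few rows of a readWQPdata() result.
# A real result has many more columns; `USGSPCode` is assumed to be present.
cations_WQP <- data.frame(
  MonitoringLocationIdentifier = c("USGS-EXAMPLE1", "USGS-EXAMPLE1", "USGS-EXAMPLE2"),
  USGSPCode = c("00915", "00925", "00915"),
  ResultMeasureValue = c(12.1, 3.4, 8.7),
  stringsAsFactors = FALSE
)

# Shortened list of requested codes, for illustration only
cation.codes <- c("00410", "00915", "00925")

# Codes that actually appear in the result
returned_codes <- unique(cations_WQP$USGSPCode)

# Codes in the result that were never requested (should be empty)
unexpected <- setdiff(returned_codes, cation.codes)

# Requested codes with no data inside the bounding box
not_returned <- setdiff(cation.codes, returned_codes)

# Number of result rows per parameter code
counts <- table(cations_WQP$USGSPCode)

print(unexpected)
print(not_returned)
print(counts)
```

With a real result, a non-empty `unexpected` would suggest the filter was not applied as intended, while `not_returned` simply lists codes that have no samples in the area.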

If you really wanted to use the readNWISdata with the "qw" service, you'd need to figure out how to get the correct arguments from here: https://nwis.waterdata.usgs.gov/nwis/qwdata?search_criteria=lat_long_bounding_box&submitted_form=introduction

I think it would be like this, which should give you a clue why we don't recommend using readNWISdata with the "qw" service (even before the "qw" service was set to be turned off):

x <- readNWISdata(service = "qw",
                  format = "rdb",
                  "nw_longitude_va" = -140,
                  "nw_latitude_va" = 61.7,
                  "se_longitude_va" = -126.7,
                  "se_latitude_va" = 54.6,
                  "coordinate_format" = "decimal_degrees",
                  "format" = "station_list",
                  "group_key" = "NONE",
                  "inventory_output" = 0,
                  "TZoutput" = 0,
                  "radio_parm_cds" = "parm_cd_list",
                  "radio_multiple_parm_cds" = cation.codes,
                  "qw_attributes" = 0,
                  "qw_sample_wide" = "wide",
                  "rdb_qw_attributes" = 0,
                  "date_format" = "YYYY-MM-DD",
                  "list_of_search_criteria" = "lat_long_bounding_box")

But again, that service is going away!

fbiles commented 9 months ago

Thank you Laura! That did the trick. I updated my code to the WQP functions.

For what I am doing, what is the difference between retrieving the desired wq data using readWQPdata vs readWQPqw? Is it only that readWQPqw doesn't allow use of a bBox and readWQPdata has a lot more querying options? Otherwise it seems it fetches the same data?

ldecicco-USGS commented 9 months ago

Exactly, both readWQPdata and readWQPqw fetch data from the same place. The readWQPqw function only takes site IDs as the input, which makes it a little easier to describe (assuming people already know the site ID). The readWQPdata function takes any inputs that the WQP allows, so it's much more flexible, but also a bit harder to use.
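As a rough illustration of that difference, here is a sketch of the same site-based request written both ways. The site ID and the two parameter codes are arbitrary choices for the example, and `run_queries` is a hypothetical switch added here so the snippet does nothing over the network unless you turn it on. It assumes dataRetrieval is installed and that `siteid` and `pCode` are the WQP argument names passed through readWQPdata.

```r
# Illustrative inputs only: one site ID and two of the cation codes
site_ids <- "USGS-01594440"
codes <- c("00915", "00925")

# readWQPdata() takes WQP query parameters by name, so the same request
# is expressed as named arguments (assumed here: siteid and pCode)
wqp_args <- list(siteid = site_ids, pCode = codes)

# Hypothetical switch: set to TRUE to run the queries
# (needs the dataRetrieval package and network access)
run_queries <- FALSE

if (run_queries) {
  # Site-based helper: site IDs and parameter codes are the main inputs
  by_site <- dataRetrieval::readWQPqw(siteNumbers = site_ids,
                                      parameterCd = codes)

  # General function: any WQP-supported arguments, here the same two
  general <- do.call(dataRetrieval::readWQPdata, wqp_args)

  # Both should describe the same samples; compare the shapes to check
  print(dim(by_site))
  print(dim(general))
}
```

Swapping `siteid` for `bBox` in `wqp_args` is what turns this into the bounding-box query from earlier in the thread, which readWQPqw cannot express.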

Starting here: https://rconnect.usgs.gov/NMC_dataRetrieval_1/dataRetrieval_1.html#/wqp-queries-dane-county-wi-example and moving forward (use the right arrow key), you can see how to use the readWQPdata function a little more in depth. Also the examples in the help file (?readWQPdata) can be helpful.