Closed raruggie closed 1 year ago
The NWIS "qwdata" service is a very unique web service. The readNWISdata
was originally only designed to work with services that used the https://waterservices.usgs.gov service calls (qwdata uses https://nwis.waterdata.usgs.gov/nwis/qwdata).
If you've got a somewhat recent version of dataRetrieval you should be getting a Warning when you run the readNWISdata
function with the argument service = "qwdata". The warning is:
Warning message:
NWIS qw web services are being retired.
Please see vignette('qwdata_changes', package = 'dataRetrieval')
for more information.
https://cran.r-project.org/web/packages/dataRetrieval/vignettes/qwdata_changes.html
Sometime in the not so distant future we won't be able to ask for discrete water quality data from the NWIS web services. Instead, we'll all need to switch to using the Water Quality Portal. So for your query above, you'd want to use:
preferred_query <- readWQPdata(siteid = "USGS-01200000",
startDateLo = as.Date("2019-10-01"),
startDateHi = as.Date("2019-10-05"))
There didn't seem to be any data returned for that date. I checked the available data like this:
whats_available <- whatNWISdata(siteNumber = "01200000")
what_qw <- whats_available |>
filter(data_type_cd == "qw")
range(what_qw$begin_date)
[1] "1992-10-19" "1992-10-20"
range(what_qw$end_date)
[1] "1992-10-19" "1992-10-20"
So there doesn't seem to be data in Oct. 2019, only Oct. 1992. For the sake of additional examples, I'll ask for data in October 1992. If you want to use the NWIS function readNWISqw
....keep in mind it will literally stop working sometime in 2023 (that's the latest I've heard). So I wouldn't recommend this, but if you need it for a quick fix:
limited_use <- readNWISqw("01200000",
parameterCd = "all",
startDate = as.Date("1992-10-01"),
endDate = as.Date("1992-10-31"))
If you are looking for a set of sites for specific dates, you can use this simplified function:
preferred_query_2 <- readWQPqw(siteNumbers = "USGS-01200000",
parameterCd = "",
startDate = as.Date("1992-10-01"),
endDate = as.Date("1992-10-31"))
If you are curious why we never optimized the readNWISdata
function for the water quality data, here is the input you would need to get the water quality data from that function.
questionable_query <- readNWISdata(multiple_site_no = "01200000", service = "qwdata",
sort_key = "site_no",
group_key = "NONE",
inventory_output = 0,
begin_date = as.Date("1992-10-01"),
end_date = as.Date("1992-10-31"),
radio_parm_cds = "all_parm_cds",
date_format = "YYYY-MM-DD",
list_of_search_criteria = "multiple_site_no",
rdb_compression = "value",
qw_attributes = 0,
qw_sample_wide = 0,
rdb_qw_attributes = "expanded")
Laura, thank you very much for this. For some reason if I use readNWIS data with service = qwdata, or I use readNWISqw, I don't get an error (I only downloaded dataRetrival package a week or two ago, I believe I used install.packages). Just to clarify, in 2023 readNWISdata and readNWISqw will not work but readWQPqw will?
Sometime in 2023, we'll be losing the qwdata services from NWIS. So, readNWISqw
will stop working entirely, and readNWISdata
will not be able to use service="qwdata"
(which as this issue illustrates...is not a great use for readNWISdata
anyway). You will still be able to use readNWISdata
for all other data types (dv, uv, gwlevels, etc...)
You should be seeing a "Warning" (not an error) when using either readNWISqw
or if you use service="qwdata" in readNWISdata
.
Going forward, for discrete water quality data from the USGS, you'll want to use the "WQP" functions: https://rconnect.usgs.gov/dataRetrieval/reference/#water-quality-portal-wqp-
See: https://cran.r-project.org/web/packages/dataRetrieval/vignettes/qwdata_changes.html
I see that my dataRetrival version 2.7.6 and the help file you provided (https://cran.r-project.org/web/packages/dataRetrieval/vignettes/qwdata_changes.html) says the warning comes starting with version 2.7.9., makes sense I don't get the warning.
Should I also migrate whatNWISdata function over to the whatWQPdata function? They return different dataframes, but can the same info from the NWIS version of this function be extracted from the 'WQP' version of this function? The NWIS version returned multiple rows for each site inputted where each row was a parameter for a site, while WQP returns a single row for every site (I couldn't find the metadata for the columns activity count and result count, but I assume those correspond to the number of unique parameters for those sites?)
here is my code:
library(tidyverse)
library(dataRetrieval)
# NWIS functions
NY <- whatWQPsites(statecode="US:36",characteristicName = "Phosphorus", siteType="Stream")%>%
filter(ProviderName == "NWIS")%>% # need to filter NWIS site as whatNWISdata only works with NWIS sites?
mutate(site_cd = substring(MonitoringLocationIdentifier, 6)) # create new column with just the number part of the site
site_data_NWIS <- whatNWISdata(siteNumber = NY$site_cd)
# WQP functions
NY <- whatWQPsites(statecode="US:36",characteristicName = "Phosphorus", siteType="Stream")
site_data_WQP<-whatWQPdata(siteNumber = NY$MonitoringLocationIdentifier)
My understanding is that there will always be some service that will be available from NWIS to get that same info that is currently being supplied in the whatNWISdata
function. There's a good chance that the guts of the function will need to be re-written someday, but I confirmed that there is a priority to make sure we can ask we'll be able to find out all the data available for a USGS site from a single NWIS query even after the qwdata service is shut down.
So...you can continue to use the whatNWISdata
function (especially if you are absolutely only concerned with USGS sites....if you want to get water quality data from non-USGS data, you'll need to use the whatWQPdata
function)
The function in WQP that is most similar to whatNWISdata
is a newer function readWQPsummary
. There's no equivalent output, but in the summary function, you can find out what data is available each year. This article shows an example for how to use it:
https://rconnect.usgs.gov/dataRetrieval/articles/wqp_large_pull_script.html
I am having trouble with the readWQPsummary
function, it is returning an empty dataframe:
library(dataRetrieval)
state_cd_cont <- stateCd[c(2,4:12,14:52),]
rownames(state_cd_cont) <- seq(length=nrow(state_cd_cont))
i<-30 # this is New York in state_cd_cont dataframe
state_cd <- state_cd_cont$STATE[i]
state_nm <- state_cd_cont$STUSAB[i]
ny <- readWQPsummary(statecode = state_cd,CharacteristicName = "Nitrogen",siteType = "Stream") # returns empty dataframe
ny <- readWQPsummary(statecode="US:36",siteType = "Stream") # still returns empty dataframe
ny <- whatWQPdata(statecode="US:36",siteType = "Stream") # this works but not the same as whatNWISdata dataframe
I've also tried this for different states (i.e. changing 'i') but still returns an empty dataframe
I suspect the issue could be I am using dataRetrieval
version 2.7.6 and the link you sent says 2.7.11.1... I tried to update the package by install.packages('dataRetrieval')
in a new R studio session (https://stackoverflow.com/questions/21461649/how-to-update-a-package-in-r), which ran successfully, but when I check sessionInfo()
it still says I have 2.7.6
My R version is 4.0.2, and I see that it does not recognize the new R pipe |>
implemented in version 4.1 (https://www.r-bloggers.com/2021/05/the-new-r-pipe/), could this be the issue?
Try updating to the latest github version. I expect to push out some updates to CRAN soon, but not sure exactly when that will happen. So, first you'll need the "remotes" package. Then, you can install like this:
#install.packages("remotes") #run this if you don't have the remotes package already installed
remotes::install_github("USGS-R/dataRetrieval")
when I run the code you posted above, I do get a data frame, so try updating and let me know if that works. I don't see any modern pipes in the code, so I don't think that should matter (unless you are trying to run the full "scripting" article I posted above... you could change those pipes to %>% and it should work).
readWQPsummary
works when updating to 2.7.11.1 thank you!
When I run readWQPsummary
and whatNWISdata
with the same search criteria (New York, Phosphorus, streams), I am getting unexpected results. I expect that the WQP function will return a more comprehensive dataset than the NWIS function, in terms of earliest sample date on record, but that is not the case:
library(dataRetrieval)
library(dplyr)
# WQP functions
ny_WQP_read <- readWQPsummary(statecode = "36",CharacteristicName = "Phosphorus",siteType = "Stream")
min(ny_WQP_read$YearSummarized) # 1951
# NWIS functions
ny_NWIS_sites <- whatWQPsites(statecode="US:36",characteristicName = "Phosphorus", siteType="Stream")%>%
filter(ProviderName == "NWIS")%>% # need to filter NWIS site for whatNWISdata to execute
mutate(site_cd = substring(MonitoringLocationIdentifier, 6)) # create new column with just the number code of the NWIS site to use in whatNWISdata
ny_NWIS_what <- whatNWISdata(siteNumber = ny_NWIS_sites$site_cd)
min(na.omit(ny_NWIS_what$begin_date)) # 1785...? this year should be equal to or greater than the WQP since this is only querying NWIS sites, where WQP querys all water quality databases?
length(unique(ny_NWIS_what$site_no))
length(unique(ny_WQP_read$MonitoringLocationIdentifier)) # at least the WQP function gives more sites
This may seem trivial but I am just curious, I really do appreciate all of your help, thank you
The function whatNWISdata
will give you more information about ALL the USGS data, not just discrete water-quality. So even though you are inputting the distinct USGS sites that came back from the whatWQPsites
, you are getting information about all sorts of other data: daily (discharge or other), instantaneous, peak, etc.... So to make a good comparison of the 2 outputs, filter down to just the data_type_cd == "qw" and then the pcode associated with phosphorus:
# NWIS functions
ny_NWIS_sites <- whatWQPsites(statecode="US:36",characteristicName = "Phosphorus", siteType="Stream")%>%
filter(ProviderName == "NWIS")%>% # need to filter NWIS site for whatNWISdata to execute
mutate(site_cd = substring(MonitoringLocationIdentifier, 6)) # create new column with just the number code of the NWIS site to use in whatNWISdata
ny_NWIS_what <- whatNWISdata(siteNumber = ny_NWIS_sites$site_cd)
# Check out all the different kind of returned data!
pcodes <- readNWISpCode(unique(ny_NWIS_what$parm_cd))
# narrow down to phos:
phos_pcodes <- pcodes |>
filter(grepl("Phosphorus", parameter_nm))
# probably you want 00665, that's the most common pcode for phosphorus
#the only data that would match in NWIS and WQP is data_type_cd == "qw":
ny_NWIS_what_qw <- ny_NWIS_what |>
filter(data_type_cd == "qw",
parm_cd == "00665")
old_site <- ny_NWIS_what_qw$site_no[which.min(ny_NWIS_what_qw$begin_date)]
old_site <- ny_NWIS_what_qw$site_no[which.min(ny_NWIS_what_qw$begin_date)]
# NWIS:
old_data_qw_NWIS <- readNWISqw(old_site,
parameterCd = "all")
range(old_data_qw_NWIS$sample_dt)
[1] "1969-08-19" "1971-05-08"
# WQP summary:
ny_WQP_read1 <- readWQPsummary(siteid = paste0("USGS-", old_site))
min(ny_WQP_read1$YearSummarized)
[1] 1969
# WQP data:
old_data_qw_WQP <- readWQPqw(paste0("USGS-", old_site),
"Phosphorus")
range(old_data_qw_WQP$ActivityStartDate)
[1] "1969-08-19" "1971-05-08"
I think everything is consistent....let me know if that doesn't make sense.
What is your question? 'readNWISdata' funciton returns a dataframe with multiple sites when one is given in the 'sites' argument.
To Reproduce
Expected behavior I expect this to return a dataframe of all the water quality data for this site for the date range inputted into the function.
Screenshots instead it returns a dataframe with multiple sites listed:
Session Info Please include your session info:
Additional context Add any other context about the problem here.