DOI-USGS / dataRetrieval

This R package is designed to obtain USGS or EPA water quality sample data, streamflow data, and metadata directly from web services.
https://doi-usgs.github.io/dataRetrieval/

readWQPqw retrieves a mixed table of environmental and QA data-- how to retrieve only environmental sample data #712

Open jmusgs opened 1 week ago

jmusgs commented 1 week ago

I am in the process of converting some web pages that use readNWISqw to the WQP. Am I correct in understanding that a basic retrieval returns all sample types, including QA data? Is there a command or option to retrieve only sample data, without QA data? I think it's problematic that they are all mixed in the same file, because many users won't check.

library(dataRetrieval)
Cl_sites <- c("01193500", "01208500", "01196500", "01192500")
parameterCd <- "00940"

qwdata <- readWQPqw(paste0("USGS-", Cl_sites), parameterCd)

Expected behavior
Retrieve only routine sample data (environmental samples only, with no QA data).

Screenshots
Note the sample types retrieved below:

unique(qwdata$Activity_TypeCode)
[1] "Sample - Routine, regular"
[2] "Quality Control Sample - Other QC, blind"
[3] "Sample - Routine, with an associated replicate"
[4] "Quality Control Sample - Other QC, reference material"

lstanish-usgs commented 1 week ago

Hi @jmusgs, and thanks for the feedback. At this time we do not have a simple way to filter out QA/QC data before downloading, because the web service does not currently offer a query argument for it. Users who have relied on the WQP services are likely familiar with the existence of QA/QC data, but this is a good reminder that new users of the WQP services may not be aware of it. For now, we can update our documentation with examples of filtering QA/QC data. We'll also investigate the feasibility of adding an argument to filter out QC data after downloading. In the interim, you can filter the QC data out with some additional code. Here's one approach:

qwdata_qc_clean <- qwdata[grep("(?i)routine", qwdata$Activity_TypeCode), ]
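
As a complement to matching on "routine", the filter can instead exclude anything labeled as quality control. The sketch below is only an illustration: the data frame is a small stand-in built from the four Activity_TypeCode values reported in this issue (no web service call is made), and drop_qc_samples is a hypothetical helper, not a dataRetrieval function. It assumes QC activity types start with "Quality Control", as they do in the values shown above; other WQP activity types may need additional patterns.

```r
# Stand-in for the data frame returned by readWQPqw(); only the
# Activity_TypeCode values are taken from the issue above.
qwdata <- data.frame(
  Activity_TypeCode = c(
    "Sample - Routine, regular",
    "Quality Control Sample - Other QC, blind",
    "Sample - Routine, with an associated replicate",
    "Quality Control Sample - Other QC, reference material"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of dataRetrieval): drop rows whose activity
# type starts with "Quality Control". The prefix is an assumption based on
# the values reported in this issue.
drop_qc_samples <- function(df, type_col = "Activity_TypeCode") {
  is_qc <- grepl("^Quality Control", df[[type_col]], ignore.case = TRUE)
  df[!is_qc, , drop = FALSE]
}

qwdata_env <- drop_qc_samples(qwdata)

# Inspect which activity types remain after filtering.
unique(qwdata_env$Activity_TypeCode)
```

Excluding by prefix keeps any non-QC activity type that does not contain the word "routine", whereas the grep approach keeps only routine samples; which behavior is appropriate depends on the analysis.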