DOI-USGS / dataRetrieval

This R package is designed to obtain USGS or EPA water quality sample data, streamflow data, and metadata directly from web services.
https://doi-usgs.github.io/dataRetrieval/

Random states getting download fails #492

Closed matthewross07 closed 5 years ago

matthewross07 commented 5 years ago

Random states failing to download WQP data

Hi! First, I love this package and am grateful for your maintenance. I have been working with WQP for a few years now, with only the occasional 404 error on a readWQPdata call. However, some code that used to always work is now failing in ways that I don't understand. A reprex (hopefully!) is below:

The main issue

dat <- readWQPdata(statecode='Montana',
                   characteristicName = 'pH',
                   startDateLo = '1994-01-01',
                   startDateHi = '1996-01-01')

Error in getWebServiceData(obs_url, httr::write_disk(temp), httr::accept("application/zip")) : 
  Not Found (HTTP 404).

The oddest part of this is that it is consistently giving this kind of error for Colorado and Montana across a variety of characteristic names (like Copper, Cadmium, etc...). It works fine for other states (like West Virginia, Indiana, Ohio, etc...)

Things I have checked:

# Same query, earlier date range (1980 to 1992)
dat <- readWQPdata(statecode='Montana',
                   characteristicName = 'pH',
                   startDateLo = '1980-01-01',
                   startDateHi = '1992-01-01')

# Original date range, restricted to streams
dat <- readWQPdata(statecode='Montana',
                   characteristicName = 'pH',
                   startDateLo = '1994-01-01',
                   startDateHi = '1996-01-01',
                   siteType = 'Stream')

# Original date range, restricted to lakes, reservoirs, and impoundments
dat <- readWQPdata(statecode='Montana',
                   characteristicName = 'pH',
                   startDateLo = '1994-01-01',
                   startDateHi = '1996-01-01',
                   siteType = "Lake, Reservoir, Impoundment")

My question(s)

I appreciate any advice.

limnoliver commented 5 years ago

I've recently had trouble when there are sites in the call that have a weird character in the monitoring ID, which trips up the call and gives me a 404 error. Maybe look at the site names in Montana and see if they have weird characters? -- see some documentation here and here. One potential reason it worked in the past but not now is that a new site was added.
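
One way to act on that suggestion is to scan the site identifiers for unusual characters. The sketch below is only illustrative: `find_odd_site_ids` is a hypothetical helper, and the set of "safe" characters (letters, digits, period, hyphen, underscore) is an assumption, not a rule documented by WQP. The commented usage assumes the site table returned by whatWQPsites has a MonitoringLocationIdentifier column.

```r
# Hypothetical helper: return the unique site IDs that contain any character
# outside an assumed "safe" set (letters, digits, period, underscore, hyphen).
find_odd_site_ids <- function(ids) {
  ids <- unique(as.character(ids))
  ids[grepl("[^A-Za-z0-9._-]", ids)]
}

# Usage sketch (needs network access and the dataRetrieval package):
# sites <- dataRetrieval::whatWQPsites(statecode = "Montana",
#                                      characteristicName = "pH")
# find_odd_site_ids(sites$MonitoringLocationIdentifier)
```

An empty result would only mean that no ID falls outside the assumed character set; it would not rule out other causes of the 404.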

ldecicco-USGS commented 5 years ago

Yeah, it's failing because the siteInfo attribute is getting too big...which comes from a call to whatWQPsites. I've been meaning to do 2 things:

  1. add an argument that allows the user to turn off those attribute calls
  2. re-look at using a POST...especially when it's specific to a long list of just sites (that's not too hard...the POST becomes hard when it's a mix-match of things).

In the near term, @matthewross07, if you can use the site types to limit your search, that's one option. I can get you another option after a couple of meetings this morning....
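
A minimal sketch of that near-term workaround: issue one request per site type and keep whatever succeeds. `fetch_by_site_type` and `montana_ph` are hypothetical names introduced here for illustration; the query values are the ones from the original report. Whether each per-type request stays small enough to avoid the 404 is an assumption, so failures are caught and reported rather than stopping the loop.

```r
# Hypothetical helper: call fetch_fun once per site type, catching errors so
# that one failed request (e.g. an HTTP 404) does not abort the others.
# Returns a list named by site type; failed requests are stored as NULL.
fetch_by_site_type <- function(site_types, fetch_fun) {
  results <- lapply(site_types, function(st) {
    tryCatch(
      fetch_fun(st),
      error = function(e) {
        message("Request failed for siteType '", st, "': ", conditionMessage(e))
        NULL
      }
    )
  })
  names(results) <- site_types
  results
}

# Usage sketch (needs network access and the dataRetrieval package):
# montana_ph <- function(st) {
#   dataRetrieval::readWQPdata(statecode = "Montana",
#                              characteristicName = "pH",
#                              startDateLo = "1994-01-01",
#                              startDateHi = "1996-01-01",
#                              siteType = st)
# }
# out <- fetch_by_site_type(c("Stream", "Lake, Reservoir, Impoundment"),
#                           montana_ph)
```

The results are left as a list rather than row-bound, since combining them with rbind would assume every response has identical columns.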

ldecicco-USGS commented 5 years ago

OK, I think I've fixed it...you can install like this:

remotes::install_github("USGS-R/dataRetrieval")
packageVersion("dataRetrieval")
[1] ‘2.7.5.9000’

I'm going to close this issue and open another one because I really want to add an argument that allows the user to turn off those attribute calls (and the POST change is already tracked in an issue).

matthewross07 commented 5 years ago

Y'all are great and much appreciated!