ropensci / rnoaa

R interface to many NOAA data APIs
https://docs.ropensci.org/rnoaa
Other
330 stars 84 forks source link

add to man page for coops_search #213

Closed tphilippi closed 7 years ago

tphilippi commented 7 years ago

Could you add to the coops.html page a note that for most if not all data products, the NOAA coops API only vends up to 31 days of data at a time?

test1 <- coops_search(begin_date=20100101, 
                      end_date=20100131, 
                      station_name=9410230,
                      product="water_temperature", 
                      datum = NULL, 
                      units = "metric", 
                      time_zone = "gmt",
                      application = "NPS-rnoaa")

works, but:

test2 <- coops_search(begin_date=20100101, 
                      end_date=20101231, 
                      station_name=9410230,
                      product="water_temperature", 
                      datum = NULL, 
                      units = "metric", 
                      time_zone = "gmt",
                      application = "NPS-rnoaa")  

fails with a misleading (but correct) error message

Error: No data was found

Also, you might add a trap for this circa line 138 (not sure if this is in your exact style):

bd <- as.Date(as.character(begin_date),format="%Y%m%d")
ed <- as.Date(as.character(end_date),format="%Y%m%d")
if ((ed - bd) > 31) {
  stop(paste("Duration must be 31 days or less\nbegin_date=",begin_date,
             " end_date=",end_date," duration=",(ed-bd),"\n",sep=""), call. = FALSE)
}

At least for now I suggest adding a note to the documentation and possibly the error trapping. Adding the ability to have end_date - start_date >31 days is not simple. Generating a list of 31-day requests from start_date and end_date is easy; the returned data for each chunk can be clean or completely missing or at least 1 other pathology. If I get to the point where I'm confident I understand all of the possibilities, I may make a pull request and submit proposed code.

sckott commented 7 years ago

what is the coops.html page ?

yes, can definitely document this. do you know where the NOAA docs state this?

that block of code looks pretty good. pull request?

tphilippi commented 7 years ago

coops.html is the man page generated from rnoaa/man/coops.Rd I used that name because the man page for coops_search is not coops_search.html but rather coops.html.

The NOAA API docs mentions this in a table near the bottom ( (https://tidesandcurrents.noaa.gov/api/#maximum ) (that looks to be the same as the page you link: http://co-ops.nos.noaa.gov/api/#maximum )

If you submit a url with a long duration, datagetter returns an error message:

Range Limit Exceeded: The size limit for data retrieval for this product is 31 days

For example: https://tidesandcurrents.noaa.gov/api/datagetter?product=water_temperature&application=NPS&begin_date=20170101&end_date=20170330&station=9410230&time_zone=GMT&units=metric&interval=6&format=csv This has been true for predicted & observed tidal heights for the last 2 versions of their API, now I get it for temperature data, too.

Note that the limit differs by product, but interval=h in the API has the potential to change it for many of the products. If your ... only passes parameters to curl (and not interval= to the API), the duration limit is 1 year for hourly_height and high_low, and 10 years for daily_mean and monthly_mean, and I don't think is applicable to datums.

If I have time this weekend I'll fork the repository & extend my trapping code fragment to test durations against the product-specific maxima. If I can't this weekend, I should be able to get at it the following weekend. I won't feel insulted if you do the patch instead of waiting for me, but I also don't think it is an emergency, especially if you can add a brief note to coops.Rd in the meantime.

In the broader picture, I haven't decided whether it would be better to put my code for splitting long requests into 31 day chunks inside your functions, or leaving it as an external wrapper function that chunks the dates, calls your functions, then assembles the results. In terms of (you) maintaining your codebase in the face of changing APIs, I suspect an external wrapper might be better. What are your thoughts? Would you be interested in a function by_chunk(begin_date, end_date, chunk_size, rnoaa_fun, ...) contributed to your package, or should I just put it in my own repository of utility functions?

sckott commented 7 years ago

ah, okay, the man page for coops.

thanks for the link

If I have time this weekend ...

I'll add the note to man page, and let you do the code change - next milestone with issues is here https://github.com/ropensci/rnoaa/milestone/10 - want to submit to CRAN relatively soon, within a week or so, but deadline to go later

In the broader picture ...

I'd vote for putting it in here. If you want that feature i'm guessing others do as well. agree it's best not to break package APIs, but we could add an additional function or parameter inside the current function. i think probably a second fxn is best so we don't break behavior of the current fxn

tphilippi commented 7 years ago

Scott-- My question is whether my "chunk_it" code should go inside coops_search() and other functions, or be a single stand-alone wrapper function by_chunk(begin_date, end_date, chunk_size, rnoaa_fun, ...) that could work with more than just coops_search()? Inside each function has the advantage of transparency to the users: it just works no matter the duration requested. It has the disadvantage of potentially needing to be tweaked in multiple places if and when the NOAA APIs change their limits. I haven't looked at your code in taxize & elsewhere that chunks long requests, but its roughly the same idea (just more complex because durations are not as simple as blocks of rows). In terms of something useful that minimizes the potential for adversely affecting your code, if you don't tell me different, I'll probably write the wrapper function by_chunk().

sckott commented 7 years ago

I agree best as a separate function.

sckott commented 7 years ago

i think your PR sorted this out, clsoing, reopen if not