ropensci / rnoaa

R interface to many NOAA data APIs
https://docs.ropensci.org/rnoaa

Find all `coops` stations #227

Closed michaeljakob closed 7 years ago

michaeljakob commented 7 years ago

I have found the function coops_search that takes a station parameter, but I was unable to find a function that lists all available stations. How do you download a list with all available stations?

Examples:

aa <- coops_search(station_name = 8723970, begin_date = 20140927, end_date = 20140928, product = "water_temperature")

sckott commented 7 years ago

thanks for your question @michaeljakob - I don't know offhand

@tphilippi @jsta any thoughts on how to get a list of all CO-OPS stations?

jsta commented 7 years ago

Looks like this info is not exposed through the API. The API documentation says the following:

Station listings for various products can be viewed at https://tidesandcurrents.noaa.gov or viewed on a map at Tides & Currents Station Map

However, there is a text list at https://www.tidesandcurrents.noaa.gov/stations.html?type=Water+Levels which could be parsed.

tphilippi commented 7 years ago

I now deal with the NOAA ACIS API instead of cdo-web, but the bottom of: https://www.ncdc.noaa.gov/cdo-web/webservices/v2#stations seems to give an example for fetching all stations, or restricted by FIPS. I don't know if datasetid=COOP would do the trick

The ACIS equivalent (http://data.rcc-acis.org/StnMeta; see http://www.rcc-acis.org/docs_webservices.html) also includes bounding box queries, but doesn't allow combining station type with other parameters, so you'd have to grab everything within a box or FIPS and then subset by type, or grab all COOPS (and cache the table) and then query it by bounding box (probably not FIPS). [Also, ACIS doesn't require tokens, so it is a bit easier to set up & test.]

sckott commented 7 years ago

thanks for the replies @jsta @tphilippi

right, we have a function for searching the NCDC API stations route, but I think @michaeljakob wants to download all stations in one go, yes?

@michaeljakob if you use ncdc_stations, you can page through results like ncdc_stations(page = 1, limit = 1000), and so on

Thanks for pointing out the list @jsta - though I'd prefer not to scrape since it could break whenever NOAA changes the HTML on that page
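
The paging idea above could be sketched roughly as follows. This is only an illustration: `fetch_all_pages` and its `fetch_page` argument are hypothetical helpers, not part of rnoaa, and the sketch assumes that `ncdc_stations` accepts `limit` and `offset` and returns the station table in `$data`. Whether the first offset is 0 or 1 should be checked against the NCDC API documentation.

```r
# Collect every page from a paged endpoint into one data frame.
# `fetch_page` is a hypothetical callback: function(limit, offset) that
# returns a data frame holding one page of results.
fetch_all_pages <- function(fetch_page, limit = 1000, first_offset = 1,
                            max_pages = 500) {
  pages <- list()
  offset <- first_offset
  for (i in seq_len(max_pages)) {
    page <- fetch_page(limit = limit, offset = offset)
    # Stop on an empty page (also covers a NULL result).
    if (is.null(page) || nrow(page) == 0) break
    pages[[length(pages) + 1]] <- page
    # A short page means there is nothing left to fetch.
    if (nrow(page) < limit) break
    offset <- offset + limit
  }
  # Assumes every page has the same columns.
  do.call(rbind, pages)
}

# Possible usage (not run here; needs rnoaa and an NCDC token):
# library(rnoaa)
# all_stations <- fetch_all_pages(function(limit, offset) {
#   ncdc_stations(limit = limit, offset = offset)$data
# })
```

Passing the fetcher in as a function keeps the loop independent of any one API; `max_pages` is just a guard against an endpoint that never returns a short page.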

michaeljakob commented 7 years ago

I tried this today and was surprised to find that not a single station (of 1000) was measuring water temperature. This made me wonder whether my code (5 lines) is correct. Could you please confirm this?


x = ncdc_stations(page = 1, limit = 1000)
y = x$data
y$cid = substr(y$id, 6, 99)
y$cid = as.numeric(y$cid)

# check if there's water data on 2014/09/28 at any of the 1000 stations
for (i in 1:1000) { y$t1[[i]]= tryCatch(coops_search(station_name = y$cid[[i]], begin_date = 20140927, end_date = 20140928, product = "water_temperature"), error = function(e) NA) }

y[!is.na(y$t1),] returns 0 rows.

tphilippi commented 7 years ago

My bad! I saw coops and thought NCDC COOP and not tides & currents CO-OPS, and then misled Scott, too.

If you want CO-OPS tides & temperatures data, you can get an XML file of all stations from https://opendap.co-ops.nos.noaa.gov/stations/stationsXML.jsp. The file includes lat & long and state, establishment date, plus lists of parameters. If I had more coffee in me I'd write 3 lines to grab that file and parse it into a dataframe, but I'm not sure about rOpenSci's approach to external package dependencies (XML), and I've clearly done enough damage on this issue already.
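
A minimal sketch of the "grab that file and parse it into a dataframe" step, using the xml2 package rather than XML. The layout is an assumption, not something confirmed in this thread: the sketch expects each `<station>` element to carry `ID` and `name` attributes, a nested `<metadata><location>` with `<lat>`, `<long>` and `<state>`, and zero or more `<parameter name="...">` children. `parse_coops_stations` is a hypothetical helper, not an rnoaa function.

```r
library(xml2)

# Turn a CO-OPS station listing into a data frame, one row per station.
# ASSUMPTION: element and attribute names follow the layout described in
# the lead-in; adjust the XPath expressions if the real file differs.
parse_coops_stations <- function(doc) {
  stations <- xml_find_all(doc, "//station")
  # xml_find_first() yields a missing node where a path does not match,
  # which xml_text() reports as NA.
  field <- function(path) trimws(xml_text(xml_find_first(stations, path)))
  data.frame(
    id = xml_attr(stations, "ID"),
    name = xml_attr(stations, "name"),
    lat = as.numeric(field("./metadata/location/lat")),
    lon = as.numeric(field("./metadata/location/long")),
    state = field("./metadata/location/state"),
    # Collapse each station's parameter names into one string.
    parameters = vapply(stations, function(s) {
      paste(xml_attr(xml_find_all(s, "./parameter"), "name"), collapse = "; ")
    }, character(1)),
    stringsAsFactors = FALSE
  )
}

# Possible usage (not run here; needs network access):
# doc <- read_xml("https://opendap.co-ops.nos.noaa.gov/stations/stationsXML.jsp")
# stations <- parse_coops_stations(doc)
# # "Water Temp" is an assumed parameter label; inspect stations$parameters first.
# temp_stations <- stations[grepl("Water Temp", stations$parameters), ]
```

The `id` column could then be passed to `coops_search(station_name = ...)` as in the original question, one station at a time.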

michaeljakob commented 7 years ago

Hey @tphilippi @sckott thank you so much! That works really smoothly now :)