ropensci / rnoaa

R interface to many NOAA data APIs
https://docs.ropensci.org/rnoaa

Break out function to read .dly files directly. #223

Closed shabbychef closed 6 years ago

shabbychef commented 7 years ago

Some users (err, me) might prefer to download the .dly files directly from the NOAA FTP site rather than rely on e.g. ghcnd_GET. It would therefore make sense to break out the functionality here: https://github.com/ropensci/rnoaa/blob/master/R/ghcnd.R#L459-L472 into a separate function.

shabbychef commented 7 years ago

Or possibly just use this:

read_dly <- function(fname) {
  # fail early if readr is missing (require() would only warn and carry on)
  if (!requireNamespace("readr", quietly = TRUE)) {
    stop("read_dly() needs the 'readr' package")
  }

  # widths and variable names: id, year, month, element, then
  # VALUE/MFLAG/QFLAG/SFLAG repeated for each of the 31 days
  vars <- c("id", "year", "month", "element",
            as.character(outer(c("VALUE", "MFLAG", "QFLAG", "SFLAG"), 1:31, FUN = "paste0")))
  wids <- readr::fwf_widths(c(11, 4, 2, 4, rep(c(5, 1, 1, 1), 31)), col_names = vars)

  # column types; most are char, but force year and VALUE1..VALUE31 to be int
  int_cols <- c("year", paste0("VALUE", 1:31))
  int_types <- setNames(rep(list(readr::col_integer()), length(int_cols)), int_cols)
  col_t <- do.call(readr::cols, c(int_types, list(.default = readr::col_character())))

  # read it; returned visibly rather than assigned to an unused local
  readr::read_fwf(fname, wids, na = c("-9999"), col_types = col_t)
}

sckott commented 7 years ago

So sorry about the delay @shabbychef - I'm usually not this bad - thanks for opening the issue!

So to be clear, what interface are you looking for? Is fname a path to a file you already have on disk, or a URL? Or is your proposed function an internal method, and you'd want the same interface as ghcnd() but get the .dly file instead?

Regarding readr - I probably don't want to introduce another dependency.

shabbychef commented 7 years ago

Yes, the imagined use case is that I have already downloaded the .dly file and wish to read it. It seems that the functionality to read a file already exists within ghcnd, but could be exposed to a user (err, me). As I wrote in my follow-up, I replicated the read functionality using readr, so this is not exactly a pressing issue for me.

sckott commented 7 years ago

@shabbychef I'll play around and see if it makes sense to add another function for use cases where people already have either a downloaded .dly file or a URL, so reopening for now.
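
For readers following along, the "file on disk or URL" question can be handled with a small wrapper. This is only a sketch under stated assumptions: `dly_local_path` is a hypothetical name, not an rnoaa function, and the URL check is a simple prefix match on `http://`, `https://` or `ftp://`.

```r
# Hypothetical helper (not part of rnoaa): given either a path on disk or a
# URL, return a local path that a .dly reader can open.
dly_local_path <- function(x) {
  stopifnot(is.character(x), length(x) == 1L)

  if (grepl("^(https?|ftp)://", x)) {
    # Looks like a URL: download to a temporary file first.
    dest <- tempfile(fileext = ".dly")
    # mode = "wb" avoids line-ending translation on Windows
    utils::download.file(x, destfile = dest, mode = "wb", quiet = TRUE)
    return(dest)
  }

  # Otherwise treat it as a local path and check that it exists.
  if (!file.exists(x)) {
    stop("file not found: ", x)
  }
  x
}
```

The result can then be handed to whichever reader is in use, e.g. the read_dly function posted above.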

sckott commented 6 years ago

@shabbychef added a new function, ghcnd_read, to the package. Reinstall and let me know what you think. I didn't use readr since I don't want to introduce another dependency.
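
To illustrate that the fixed-width layout can be parsed without readr, here is a minimal base R sketch. It is not the code behind ghcnd_read; `read_dly_base` is a hypothetical name, and the field widths (11, 4, 2, 4, then 5, 1, 1, 1 for each of 31 days) and the "-9999" missing marker are taken from the readr snippet earlier in this thread.

```r
# Hypothetical helper (not part of rnoaa): parse a .dly file with base R only.
read_dly_base <- function(fname) {
  lines <- readLines(fname, warn = FALSE)
  lines <- lines[nzchar(lines)]  # drop empty lines

  # The header fields take 21 characters (11 + 4 + 2 + 4); each day then
  # takes 8 (5 for the value, 1 each for the M, Q and S flags).
  value_start <- 22 + 8 * (0:30)

  out <- data.frame(
    id      = substr(lines, 1, 11),
    year    = as.integer(substr(lines, 12, 15)),
    month   = substr(lines, 16, 17),
    element = substr(lines, 18, 21),
    stringsAsFactors = FALSE
  )

  for (d in 1:31) {
    s <- value_start[d]
    v <- as.integer(substr(lines, s, s + 4))
    v[which(v == -9999L)] <- NA_integer_  # -9999 marks a missing value
    out[[paste0("VALUE", d)]] <- v
    # Flags are single characters; blank flags become empty strings.
    out[[paste0("MFLAG", d)]] <- trimws(substr(lines, s + 5, s + 5))
    out[[paste0("QFLAG", d)]] <- trimws(substr(lines, s + 6, s + 6))
    out[[paste0("SFLAG", d)]] <- trimws(substr(lines, s + 7, s + 7))
  }
  out
}

# Usage with a synthetic one-line file (the station id and values are made up).
vals <- c(250L, -9999L, rep(100L, 29))
rec <- paste0("XX000000001", "2014", "01", "TMAX",
              paste0(sprintf("%5d", vals), " ", " ", "H", collapse = ""))
tmp <- tempfile(fileext = ".dly")
writeLines(rec, tmp)
dly <- read_dly_base(tmp)
```

Using readLines plus substr keeps the sketch dependency-free, at the cost of being slower than readr on large files.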

shabbychef commented 6 years ago

fair enough re: readr. I'll check it out.