ropensci / rnoaa

R interface to many NOAA data APIs
https://docs.ropensci.org/rnoaa

Date limit using lcd function #330

Closed mesp9943 closed 4 years ago

mesp9943 commented 4 years ago
Session Info

```r
devtools::session_info()
Session info ------------------------------------------------------------
 setting  value
 version  R version 3.5.1 (2018-07-02)
 system   x86_64, mingw32
 ui       RStudio (1.1.456)
 language (EN)
 collate  English_United States.1252
 tz       America/New_York
 date     2019-11-04

Packages ----------------------------------------------------------------
 package    * version   date       source
 assertthat   0.2.0     2017-04-11 CRAN (R 3.5.1)
 backports    1.1.4     2019-04-10 CRAN (R 3.5.3)
 base       * 3.5.1     2018-07-02 local
 broom        0.5.2     2019-04-07 CRAN (R 3.5.3)
 cellranger   1.1.0     2016-07-27 CRAN (R 3.5.3)
 cli          1.0.1     2018-09-25 CRAN (R 3.5.1)
 colorspace   1.3-2     2016-12-14 CRAN (R 3.5.1)
 compiler     3.5.1     2018-07-02 local
 crayon       1.3.4     2017-09-16 CRAN (R 3.5.1)
 crul         0.8.4     2019-08-02 CRAN (R 3.5.3)
 curl         4.0       2019-07-22 CRAN (R 3.5.3)
 datasets   * 3.5.1     2018-07-02 local
 devtools     1.13.6    2018-06-27 CRAN (R 3.5.1)
 digest       0.6.17    2018-09-12 CRAN (R 3.5.1)
 dplyr      * 0.8.3     2019-07-04 CRAN (R 3.5.3)
 forcats    * 0.4.0     2019-02-17 CRAN (R 3.5.3)
 generics     0.0.2     2018-11-29 CRAN (R 3.5.3)
 ggplot2    * 3.0.0     2018-07-03 CRAN (R 3.5.1)
 glue         1.3.0     2018-07-17 CRAN (R 3.5.1)
 graphics   * 3.5.1     2018-07-02 local
 grDevices  * 3.5.1     2018-07-02 local
 grid         3.5.1     2018-07-02 local
 gridExtra    2.3       2017-09-09 CRAN (R 3.5.1)
 gtable       0.2.0     2016-02-26 CRAN (R 3.5.1)
 haven        2.1.1     2019-07-04 CRAN (R 3.5.3)
 hms          0.4.2     2018-03-10 CRAN (R 3.5.1)
 hoardr       0.5.2     2018-12-02 CRAN (R 3.5.3)
 httpcode     0.2.0     2016-11-14 CRAN (R 3.5.0)
 httr         1.3.1     2017-08-20 CRAN (R 3.5.1)
 jsonlite     1.5       2017-06-01 CRAN (R 3.5.1)
 lattice      0.20-35   2017-03-25 CRAN (R 3.5.1)
 lazyeval     0.2.1     2017-10-29 CRAN (R 3.5.1)
 lubridate  * 1.7.4     2018-04-11 CRAN (R 3.5.1)
 magrittr     1.5       2014-11-22 CRAN (R 3.5.1)
 memoise      1.1.0     2017-04-21 CRAN (R 3.5.1)
 methods    * 3.5.1     2018-07-02 local
 modelr       0.1.5     2019-08-08 CRAN (R 3.5.3)
 munsell      0.5.0     2018-06-12 CRAN (R 3.5.1)
 nlme         3.1-137   2018-04-07 CRAN (R 3.5.1)
 pillar       1.4.2     2019-06-29 CRAN (R 3.5.3)
 pkgconfig    2.0.2     2018-08-16 CRAN (R 3.5.1)
 plyr         1.8.4     2016-06-08 CRAN (R 3.5.1)
 purrr      * 0.2.5     2018-05-29 CRAN (R 3.5.1)
 R6           2.3.0     2018-10-04 CRAN (R 3.5.1)
 rappdirs     0.3.1     2016-03-28 CRAN (R 3.5.1)
 Rcpp         1.0.2     2019-07-25 CRAN (R 3.5.3)
 readr      * 1.1.1     2017-05-16 CRAN (R 3.5.1)
 readxl       1.3.1     2019-03-13 CRAN (R 3.5.3)
 rgdal      * 1.3-4     2018-08-03 CRAN (R 3.5.1)
 rlang        0.4.0     2019-06-25 CRAN (R 3.5.3)
 rnoaa      * 0.9.2     2019-10-23 CRAN (R 3.5.3)
 rstudioapi   0.8       2018-10-02 CRAN (R 3.5.1)
 rvest        0.3.4     2019-05-15 CRAN (R 3.5.3)
 scales       1.0.0     2018-08-09 CRAN (R 3.5.1)
 sp         * 1.3-1     2018-06-05 CRAN (R 3.5.1)
 stats      * 3.5.1     2018-07-02 local
 stringi      1.1.7     2018-03-12 CRAN (R 3.5.0)
 stringr    * 1.3.1     2018-05-10 CRAN (R 3.5.1)
 tibble     * 2.1.3     2019-06-06 CRAN (R 3.5.3)
 tidyr      * 0.8.1     2018-05-18 CRAN (R 3.5.1)
 tidyselect   0.2.5     2018-10-11 CRAN (R 3.5.3)
 tidyverse  * 1.2.1     2017-11-14 CRAN (R 3.5.3)
 tools        3.5.1     2018-07-02 local
 utils      * 3.5.1     2018-07-02 local
 withr        2.1.2     2018-03-15 CRAN (R 3.5.1)
 XML          3.98-1.16 2018-08-19 CRAN (R 3.5.1)
 xml2         1.2.0     2018-01-24 CRAN (R 3.5.1)
 yaml         2.2.0     2018-07-25 CRAN (R 3.5.1)
```

I am using the code below to pull and clean temperature data for selected stations. My issue is that the data hits a wall at 9/18. I am hoping to pull data through October 2019, which appears to be available through NOAA's website/databases, but I am unable to pull data past 9/18 using rnoaa's lcd() function.

library(rgdal)
library(rnoaa)
library('lubridate')
library(dplyr)
library(tidyverse)
library(stringr)
stations <- tribble(~stationid,~site,
                    72531403960,"Jefferson Barracks",
                    72456013996,"Topeka",
                    72446003947,"Leavenworth",
                    72450503923,"Wichita",
                    72445003945,"Columbia",
                    72531403960,"John Cochran",
                    72330003975,"Poplar Bluff",
                    72446313988,"Kansas City",
                    72433903865,"Marion")

data_raw_LCD <- map2(stations$stationid,2019,lcd)

# map2(data_raw_LCD,paste0(stations$site,".csv"),write_csv)

#write clean data
# testset <- list(data_raw_LCD[[1]],data_raw_LCD[[2]])
#need to create a dataframe out of list of lists by having same dimensions and types-
new_vector <- vector("list",length(data_raw_LCD))
for (i in seq_along(data_raw_LCD)) {
  new_vector[[i]] <- data_raw_LCD[[i]] %>% select(station, date, tmp)
}
new_vector2 <- bind_rows(new_vector) %>% 
  filter(tmp!="+9999,9") %>% 
  mutate(tmp2=str_split(tmp,"\\,",simplify=TRUE) %>% .[,1],
         tmpf=(as.numeric(tmp2)/10*(9/5))+32,
         quality_code=str_split(tmp,"\\,",simplify=TRUE) %>% .[,2],
         date_clean=str_split(date,"T",simplify=TRUE) %>%.[,1],
         hour_0=str_split(date,"T",simplify=TRUE) %>%.[,2]) %>% 
  # drop the quality codes 2, 3, 6 and 7 (quality_code is a character column)
  filter(!quality_code %in% c("2","3","6","7")) %>% 
  mutate(Timestamp=paste0(date_clean," ",hour_0),
         Timestamp=round_date(ymd_hms(Timestamp),"hour")) %>% 
  group_by(Timestamp,station) %>% 
  summarize(tmp_hr_avg=mean(tmpf)) %>% 
  left_join(stations,by=c("station"="stationid"))
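
For readers following along, here is a minimal base-R sketch of what the cleaning step above does to the `tmp` field. The sample values are made up for illustration. It assumes, as the script does, that `tmp` holds a temperature in tenths of a degree Celsius followed by a quality code, and that `+9999,9` marks a missing reading.

```r
# Made-up sample values in the "temperature,quality" layout used by the script
tmp <- c("+0250,1", "-0011,5", "+9999,9", "+0183,2")

# Split each value at the comma: column 1 = temperature, column 2 = quality code
parts <- do.call(rbind, strsplit(tmp, ",", fixed = TRUE))
obs <- data.frame(
  tmp = tmp,
  tenths_c = as.numeric(parts[, 1]),
  quality_code = parts[, 2],
  stringsAsFactors = FALSE
)

# Drop the value the script treats as missing, then the unwanted quality codes
obs <- obs[obs$tmp != "+9999,9", ]
obs <- obs[!obs$quality_code %in% c("2", "3", "6", "7"), ]

# Tenths of a degree Celsius -> degrees Fahrenheit (same formula as the script)
obs$tmpf <- obs$tenths_c / 10 * (9 / 5) + 32
print(obs)
```

Note that the exclusion uses `%in%` with a vector of codes. Extra bare values passed to `filter()` are treated as additional conditions, not as further codes to exclude.
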
sckott commented 4 years ago

thanks @mesp9943 for the question.

with your script the latest dates i'm getting are 2019-11-02 for every stationid

So if you run `arrange(new_vector2, desc(Timestamp))` at the end of that script, what do you get?

mesp9943 commented 4 years ago

image

mesp9943 commented 4 years ago

maybe it's a package dependency version issue? I'm running out of ideas...

sckott commented 4 years ago

my next guess is a caching issue. we cache requests in many functions in rnoaa, on the thinking that users often make the same requests over and over again, so we might as well speed those up. however, this means that you can end up using old data, depending on when you last requested the same data.

get the path for the lcd cache with `rnoaa:::lcd_cache$cache_path_get()`. you should be able to see the last modified times for those files, and I expect that you'll have files in there with dates close to the last dates you're getting above.
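
A minimal sketch of that inspection step, using only base R for the file listing. The helper name `cache_file_times` is made up for this example. The one rnoaa call is the one quoted above, and it runs only if rnoaa is installed.

```r
# Hypothetical helper: list the files in a directory with their
# last-modified times, newest first. Works on any directory.
cache_file_times <- function(dir) {
  files <- list.files(dir, full.names = TRUE)
  info <- file.info(files)
  out <- data.frame(
    file = basename(files),
    size_bytes = info$size,
    modified = info$mtime,
    stringsAsFactors = FALSE
  )
  out[order(out$modified, decreasing = TRUE), , drop = FALSE]
}

# lcd_cache is internal to rnoaa, hence the triple colon
if (requireNamespace("rnoaa", quietly = TRUE)) {
  lcd_dir <- rnoaa:::lcd_cache$cache_path_get()
  print(cache_file_times(lcd_dir))
}
```

If the newest `modified` value is weeks old, the data returned by `lcd()` is most likely coming from those cached files.
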

sckott commented 4 years ago

(note to self: improve lcd docs, inform users about caching and how to inspect their cached files)

mesp9943 commented 4 years ago

Okay yeah the files in the cache path you mention are old. Do I just delete them manually?

sckott commented 4 years ago

yeah, you can do it manually, or use that lcd cache object. `lcd_cache` is an object from the hoardr pkg; see `?hoardr::hoard` for help on its methods. with `lcd_cache$delete()` you can delete individual files, and `lcd_cache$delete_all()` will nuke 'em, i.e. delete all of them.

opened an issue to try to make caching behavior more transparent https://github.com/ropensci/rnoaa/issues/331
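
For the manual route, a rough base-R sketch follows. The helper name `delete_stale_files` and the age threshold are made up for this example and are not part of rnoaa or hoardr. It removes files older than a chosen number of days from a directory. The cache-object call from the comment above appears only as a comment.

```r
# Hypothetical helper: delete files in `dir` whose last-modified time is
# more than `max_age_days` days ago. Returns the deleted paths invisibly.
delete_stale_files <- function(dir, max_age_days = 1) {
  files <- list.files(dir, full.names = TRUE)
  if (length(files) == 0) {
    return(invisible(character(0)))
  }
  age_days <- as.numeric(
    difftime(Sys.time(), file.info(files)$mtime, units = "days")
  )
  stale <- files[!is.na(age_days) & age_days > max_age_days]
  unlink(stale)  # without recursive = TRUE, sub-directories are left alone
  invisible(stale)
}

# With rnoaa installed, the cache object mentioned above does the same job:
#   rnoaa:::lcd_cache$delete_all()   # remove every cached LCD file
# or, for the manual route on the same directory:
#   delete_stale_files(rnoaa:::lcd_cache$cache_path_get(), max_age_days = 1)
```

Once the old files are gone, the next `lcd()` call has nothing cached to reuse and should download the data again.
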

mesp9943 commented 4 years ago

I loaded the hoardr package, but it doesn't seem to register lcd_cache?

mesp9943 commented 4 years ago

@sckott image

sckott commented 4 years ago

`lcd_cache` is created by calling `hoardr::hoard()` (see https://github.com/ropensci/rnoaa/blob/master/R/onload.R#L20-L22). `lcd_cache` is not exported to the user, so loading hoardr or rnoaa does not make it visible, but you can get to it via the triple-colon operator: `rnoaa:::lcd_cache`
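
To illustrate the exported vs. internal distinction without needing rnoaa, here is a small sketch using the stats package that ships with R. `plot.lm` stands in for `lcd_cache` purely as an example of an object that lives in a package namespace without being exported.

```r
exported <- getNamespaceExports("stats")

# sd() is exported, so stats::sd works and sd is visible once stats is attached
"sd" %in% exported

# plot.lm is defined inside stats but is not exported ...
"plot.lm" %in% exported
exists("plot.lm", envir = asNamespace("stats"), inherits = FALSE)

# ... so `::` cannot reach it, while `:::` can
f <- stats:::plot.lm
is.function(f)
```

The same applies to `rnoaa:::lcd_cache`: the object lives in rnoaa's namespace, so attaching hoardr does not put it on the search path.
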

mesp9943 commented 4 years ago

Thank you! That worked.