HARPgroup / om

Object-oriented Meta-model
0 stars 1 forks source link

Error: automated run summaries dies with `window()` function in R 4.3.3 #534

Closed rburghol closed 8 months ago

rburghol commented 8 months ago

Error:

Error in `>=.POSIXt`(all.indexes, start) :
  object 'check_tzones' not found

Example code to replicate (from ssh window):

Run R command line from central R location

cd /var/www/R
R

Now execute the following R script

# once in R, enter the following:
library(stringr)
save_directory <- "/var/www/html/files/fe/plots"
#----------------------------------------------
site <- "http://deq2.bse.vt.edu/d.dh"    #Specify the site of interest, either d.bet OR d.dh
#----------------------------------------------
# Load Libraries
basepath='/var/www/R';
source(paste(basepath,'config.R',sep='/'))
save_directory <-  "/var/www/html/data/proj3/out"
library(hydrotools)
# authenticate
ds <- RomDataSource$new(site, rest_uname)
ds$get_token(rest_pw)

pid=4836336;elid=344054;runid=201
scen.propname<-paste0('runid_', runid)
# GETTING SCENARIO PROPERTY FROM VA HYDRO
sceninfo <- list(
  varkey = 'om_scenario',
  propname = scen.propname,
  featureid = pid,
  entity_type = "dh_properties",
  bundle = "dh_properties"
)
scenprop <- RomProperty$new( ds, sceninfo, TRUE)

# POST PROPERTY IF IT IS NOT YET CREATED
if (is.na(scenprop$pid) | is.null(scenprop$pid) ) {
  # create
  scenprop$save(TRUE)
} else {
  # This is a re-run so blank out props first (if they exist)
  for (pname in c('l90_RUnit', 'l90_year', 'Runit', 'Runit_boxplot_month', 'Runit_boxplot_year')) {
    null_prop <- vahydro_post_metric_to_scenprop(scenprop$pid, 'om_class_Constant', NULL, pname, NULL, ds)
  }
}

dat <- fn_get_runfile(elid, runid, site= omsite,  cached = FALSE);

syear = as.integer(min(dat$year))
eyear = as.integer(max(dat$year))
model_run_start <- min(dat$thisdate)
model_run_end <- max(dat$thisdate)
if (syear < (eyear - 2)) {
  sdate <- as.Date(paste0(syear,"-10-01"))
  edate <- as.Date(paste0(eyear,"-09-30"))
  flow_year_type <- 'water'
} else {
  sdate <- as.Date(paste0(syear,"-02-01"))
  edate <- as.Date(paste0(eyear,"-12-31"))
  flow_year_type <- 'calendar'
}
dat <- window(dat, start = sdate, end = edate);

If we change the zoo to xts, the window function no longer fails, and at least the simplest summary script (for the CBPLandDataConnectionFile) works.

dat <- as.xts(dat)
dat <- window(dat, start = sdate, end = edate)
COBrogan commented 8 months ago

Based on the error message, the problem is that sdate and edate don't have a timezone associated with them. My guess is this is due to the bug fix noted in R 4.3.3 in match e.g. ```

match(<POSIXct>, .) is correct again for differing time zones, ditto for "POSIXlt", fixing [PR#18618](https://bugs.r-project.org/show_bug.cgi?id=18618) reported by Bastian Klein.

Converting to xts may be fine. I don't know much about that, but it may raise problems elsewhere. I would suggest we instead just provide a bogus time zone to window (I pass R's default time zone below. as.POSIXct converts dates assuming they are midnight at UTC if no time is provided, so this should prevent error. Passing EST may cause dates to be wrong without adding in an arbitrary time):

if (syear < (eyear - 2)) {
  #Add timezone to sdate/edate to ensure consistency with stats::window()
  sdate <- as.POSIXct(paste0(syear,"-10-01"), tz = "UTC")
  edate <- as.POSIXct(paste0(eyear,"-09-30"), tz = "UTC")
  flow_year_type <- 'water'
} else {
  #Add timezone to sdate/edate to ensure consistency with stats::window()
  sdate <- as.POSIXct(paste0(syear,"-02-01"), tz = "UTC")
  edate <- as.POSIXct(paste0(eyear,"-12-31"), tz = "UTC")
  flow_year_type <- 'calendar'
}
dat <- window(dat, start = sdate, end = edate);

Alternatively, we can just use base R to perform the same task as window by relying on the index of the zoo dat:

dat  <- dat[as.Date(index(dat)) >= sdate & as.Date(index(dat)) <= edate]
rburghol commented 8 months ago

@COBrogan thanks a bunch -- I agree with your approach as it it will avoid unintended consequences of changing the underlying ts format from zoo to xts. Honestly, that beastly if-then code block needs to be turned into a function instead of being duplicated across 20 different R files but perhaps that's a task for another day (I think it should be in hydrotools, or in the om functions).

rburghol commented 8 months ago

@COBrogan I can confirm that the fix you provided above works as desired on the ubuntu 20.04 machine.

rburghol commented 8 months ago

Also, given that updating is painful, I am taking the extra pain of creating a function to standardize model warmup data removal, and also I am changing all print() statements into message() statements in this set of summary functions, since print() should only be used for actual model returns, not warnings and other informational stuff. At the end of each function I am also adding a print(1) to indicate completion, and return 1 to the calling script -- this will allow us to include these in automated testing routines -- honestly, that is probably a hackish way to do so and there are likely R toolkits to assert correct completion, but this will be a start.

rburghol commented 8 months ago

This appears to be all sorted out, errors are gone and a single uniform function now handles removing the warmup period.