tidyverts / tsibble

Tidy Temporal Data Frames and Tools
https://tsibble.tidyverts.org
GNU General Public License v3.0

Argument 'roll' is deprecated. Deprecated in version '1.8.4'. #291

Closed danielvartan closed 8 months ago

danielvartan commented 1 year ago

Hi there,

The tsibble package is giving a warning related to a deprecated argument (roll) in the lubridate package. According to the warning, the roll argument was deprecated in version 1.8.4.

You can reproduce this warning with the code below. Please note that you first need to install the actverse package.

# install.packages("remotes")
remotes::install_github("gipso/actverse")
file <- actverse::get_from_zenodo(
    doi = "10.5281/zenodo.4898822", path = tempdir(),
    file = "processed.txt"
    )
#> ℹ Downloading metadata✔ Downloading metadata [1.2s]
#> ℹ Downloading file✔ Downloading file [18ms]
#> ℹ Checking file integrity✔ Checking file integrity [32ms]

data <- actverse::read_acttrust(file, tz = "America/Sao_Paulo")
#> ℹ Reading data✔ Reading data [605ms]
#> ℹ Tidying data✔ Tidying data [1.7s]
#> ℹ Validating data                  ℹ Found 2 gap in the time series: 2021-04-26 03:14:00/2021-04-26 03:14:00 and 2021-05-01 17:34:00/2021-05-01 17:34:00 (showing up to a total of 5 values).
#> ℹ Validating data                  ℹ Found 21 offwrist blocks in the time series. All values were set as NA.
#> ℹ Validating data✔ Validating data [47.4s]

tsibble::filter_index(data, "2021-04")
#> Warning: Argument 'roll' is deprecated. Deprecated in version '1.8.4'.

#> Warning: Argument 'roll' is deprecated. Deprecated in version '1.8.4'.
#> # A tsibble: 9,826 x 17 [1m] <America/Sao_Paulo>
#>    timestamp             pim   tat   zcm orienta…¹ wrist…² exter…³ light ambie…⁴
#>    <dttm>              <dbl> <dbl> <dbl>     <dbl>   <dbl>   <dbl> <dbl>   <dbl>
#>  1 2021-04-24 04:14:00  7815   608   228         0    26.9    24.6  3.58    1.45
#>  2 2021-04-24 04:15:00  2661   160    64         0    27.2    25.1  5.23    2.12
#>  3 2021-04-24 04:16:00  3402   243    80         0    27.7    25.5  3.93    1.59
#>  4 2021-04-24 04:17:00  4580   317   125         0    27.9    25.8  4.14    1.68
#>  5 2021-04-24 04:18:00  2624   255    33         0    28.0    25.9  3.16    1.28
#>  6 2021-04-24 04:19:00  3929   246   105         0    28.1    26.1  3.63    1.47
#>  7 2021-04-24 04:20:00  5812   369   171         0    28.2    26.4 11.5     4.67
#>  8 2021-04-24 04:21:00  3182   270    54         0    28.4    26.7  2.4     0.97
#>  9 2021-04-24 04:22:00  6362   373   189         0    28.6    26.9  3.28    1.33
#> 10 2021-04-24 04:23:00  2621   159    64         0    28.7    27.1  2.97    1.2 
#> # … with 9,816 more rows, 8 more variables: red_light <dbl>, green_light <dbl>,
#> #   blue_light <dbl>, ir_light <dbl>, uva_light <dbl>, uvb_light <dbl>,
#> #   event <dbl>, state <dbl>, and abbreviated variable names ¹​orientation,
#> #   ²​wrist_temperature, ³​external_temperature, ⁴​ambient_light

Created on 2022-11-12 with reprex v2.0.2
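
A smaller reproduction that avoids the actverse download might look like the sketch below. The toy data, the column names, and the `msgs` collector are all made up for illustration. Whether a warning is actually captured depends on the installed versions: it is expected only with lubridate 1.9.0 or later and a tsibble version that still passes `roll = TRUE`.

```r
library(tsibble)

# Hypothetical toy data: one observation per minute in a non-UTC time zone
idx <- seq(
  as.POSIXct("2021-04-30 23:58:00", tz = "America/Sao_Paulo"),
  by = "1 min",
  length.out = 5
)
toy <- tsibble(timestamp = idx, value = seq_along(idx), index = timestamp)

# Collect the messages of any warnings raised while filtering by year-month
msgs <- character()
out <- withCallingHandlers(
  filter_index(toy, "2021-04"),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# Inspect what was captured (may be empty on versions without the issue)
print(msgs)
```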

danielvartan commented 1 year ago

This warning is related to the latest lubridate release (version 1.9.0).

roll argument to updating and time-zone manipulation functions is deprecated in favor of a new roll_dst parameter.

danielvartan commented 1 year ago

I identified the warning trigger. It comes from the POSIXct methods of start_window() and end_window(), both of which call force_tz() with roll = TRUE.

start_window.POSIXct <- function(x, y = NULL, ...) {
  if (is_null(y)) {
    min(x)
  } else {
    abort_not_chr(y, class = "POSIXct")
    assertTime(y)
    y <- utctime(y, tz = "UTC")
    force_tz(y, tz(x), roll = TRUE)
  }
}

end_window.POSIXct <- function(x, y = NULL, ...) {
  if (is_null(y)) {
    max(x) + period(1, "second")
  } else {
    abort_not_chr(y, class = "POSIXct")
    assertTime(y)

    lgl_date <- nchar(y) > 7 & nchar(y) < 11
    lgl_yrmth <- nchar(y) < 9 & nchar(y) > 4
    lgl_yr <- nchar(y) < 5
    y <- utctime(y, tz = "UTC")
    y <- force_tz(y, tz(x), roll = TRUE)
    if (any(lgl_date)) {
      y[lgl_date] <- y[lgl_date] + period(1, "day")
    }
    if (any(lgl_yrmth)) {
      y[lgl_yrmth] <- rollback(
        y[lgl_yrmth] + period(1, "month"),
        roll_to_first = TRUE
      )
    }
    if (any(lgl_yr)) {
      y[lgl_yr] <- y[lgl_yr] + period(1, "year")
    }
    lgl_time <- !(lgl_date | lgl_yrmth | lgl_yr)
    y[lgl_time] <- y[lgl_time] + 1
    y
  }
}

Source: https://github.com/tidyverts/tsibble/blob/main/R/filter-index.R#L196
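
Until those call sites are updated, a user-side workaround could be to muffle only this specific warning. The sketch below uses base R only; `muffle_roll_warning()` is a hypothetical helper name, not part of tsibble or lubridate, and it matches on the warning text quoted in this issue.

```r
# Hypothetical helper: muffle only the lubridate 'roll' deprecation warning
# and let every other warning through unchanged.
muffle_roll_warning <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      is_roll <- grepl(
        "Argument 'roll' is deprecated",
        conditionMessage(w),
        fixed = TRUE
      )
      if (is_roll) invokeRestart("muffleWarning")
    }
  )
}

# Stand-in for a call such as tsibble::filter_index(data, "2021-04")
res <- muffle_roll_warning({
  warning("Argument 'roll' is deprecated. Deprecated in version '1.8.4'.")
  "filtered result"
})
```

In practice the wrapped expression would be the `filter_index()` call itself; the stand-in above only simulates the warning so that the helper can be tried without any package installed.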

danielvartan commented 1 year ago

The deprecated roll argument had the following description:

#' @param roll logical. If TRUE, and `time` falls into the DST-break, assume
#'   the next valid civil time, otherwise return NA. See examples.

Source: https://github.com/tidyverse/lubridate/blob/a5cec99d52062ab6f11303229c8945ade635ee7e/R/time-zones.r#L66

Example:

library(lubridate)

# DST start for time zone "America/New_York" in 2010 ("skipped" transition)
# DST-break: 2010-03-14 02:00:00--2010-03-14 02:59:59

ymd_hms("2010-03-14 01:59:59", tz = "America/New_York")
#> [1] "2010-03-14 01:59:59 EST"
ymd_hms("2010-03-14 02:00:00", tz = "America/New_York")
#> [1] NA
ymd_hms("2010-03-14 03:00:00", tz = "America/New_York")
#> [1] "2010-03-14 03:00:00 EDT"

# Forcing a time that falls inside the DST-break into "America/New_York"

x <- ymd_hms("2010-03-14 02:30:00", tz = "UTC")
force_tz(x, "America/New_York", roll = FALSE)
#> [1] NA
force_tz(x, "America/New_York", roll = TRUE)
#> [1] "2010-03-14 03:00:00 EDT"

See: "Clock changes in New York, New York, USA" and "How does Daylight Saving Time work?".

The roll argument has been replaced with roll_dst.

#' @param roll deprecated, same as `roll_dst` parameter.

Source: https://github.com/tidyverse/lubridate/blob/main/R/time-zones.r#L74

roll_dst comes from the timechange package and has the following description:

##' @param roll_dst is a string vector of length one or two. When two values are
##'   supplied they specify how to roll date-times when they fall into "skipped" and
##'   "repeated" DST transitions respectively. A single value is replicated to the
##'   length of two. Possible values are:
##'
##'     * `pre` - Use the time before the transition boundary.
##'     * `boundary` - Use the time exactly at the boundary transition.
##'     * `post` - Use the time after the boundary transition.
##'     * `xfirst` - crossed-first: First time which occurred when crossing the
##'        boundary. For addition with positive units pre interval is crossed first and
##'        post interval last. With negative units post interval is crossed first, pre -
##'        last. For subtraction the logic is reversed.
##'     * `xlast` - crossed-last.
##'     * `NA` - Produce NAs when the resulting time falls inside the problematic interval.

Source: https://github.com/vspinu/timechange/blob/main/R/addition.R#L17
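
To get a feel for these options, the sketch below applies several of them to a time inside the "skipped" interval. It assumes lubridate 1.9.0 or later, where force_tz() accepts roll_dst. The `xfirst` and `xlast` values are left out because their description refers to addition and subtraction, and `choices` and `results` are illustrative names only.

```r
library(lubridate)

# A clock time that does not exist in "America/New_York" (DST start, 2010)
x <- ymd_hms("2010-03-14 02:30:00", tz = "UTC")

# A single value is replicated to cover both transition types
choices <- c("pre", "boundary", "post", "NA")

results <- lapply(choices, function(choice) {
  force_tz(x, "America/New_York", roll_dst = choice)
})
names(results) <- choices

# Print one formatted time per option, including the time zone abbreviation
print(vapply(results, format, character(1), usetz = TRUE))
```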

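Hence, if we want to mimic the behavior of roll = TRUE, we should use roll_dst = c("boundary", "post"). The first value applies to "skipped" transitions and moves the time to the boundary, which is the next valid civil time. The second value applies to "repeated" transitions.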

Example:

library(lubridate)

# DST start for time zone "America/New_York" in 2010 ("skipped" transition)
# DST-break: 2010-03-14 02:00:00--2010-03-14 02:59:59

x <- ymd_hms("2010-03-14 02:30:00", tz = "UTC")
force_tz(x, "America/New_York", roll = TRUE)
#> [1] "2010-03-14 03:00:00 EDT"
force_tz(x, "America/New_York", roll_dst = c("boundary", "post"))
#> [1] "2010-03-14 03:00:00 EDT"

# DST end for time zone "America/New_York" in 2010 ("repeated" transition)
# Rollback: 2010-11-07 02:00:00 -> 2010-11-07 01:00:00 

x <- ymd_hms("2010-11-07 01:30:00", tz = "UTC")
force_tz(x, "America/New_York", roll = TRUE)
#> [1] "2010-11-07 01:30:00 EST"
force_tz(x, "America/New_York", roll_dst = c("boundary", "post"))
#> [1] "2010-11-07 01:30:00 EST"

danielvartan commented 1 year ago

I'm preparing a pull-request with this change.
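
A rough sketch of what such a change could look like is shown below. It is not the actual pull request: `force_tz_rolled()` is a hypothetical helper, and the version check is only an assumption about how older lubridate releases might be kept working.

```r
library(lubridate)

# Hypothetical helper (not part of tsibble): reproduces `roll = TRUE`
# without touching the deprecated argument on newer lubridate versions.
force_tz_rolled <- function(y, tzone) {
  if (utils::packageVersion("lubridate") >= "1.9.0") {
    # New interface: "boundary" for skipped, "post" for repeated transitions
    force_tz(y, tzone, roll_dst = c("boundary", "post"))
  } else {
    # Older releases only know the logical `roll` argument
    force_tz(y, tzone, roll = TRUE)
  }
}

# Same skipped-transition input as in the examples above
y <- ymd_hms("2010-03-14 02:30:00", tz = "UTC")
res <- force_tz_rolled(y, "America/New_York")
print(res)
```

Inside start_window.POSIXct() and end_window.POSIXct(), the equivalent edit would be to swap `roll = TRUE` for `roll_dst = c("boundary", "post")` in the two force_tz() calls.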