tidyverse / lubridate

Make working with dates in R just that little bit easier
https://lubridate.tidyverse.org
GNU General Public License v3.0

floor_date always rounds to the nearest hour #925

Open zxqs opened 4 years ago

zxqs commented 4 years ago

floor_date always rounds to the nearest hour, regardless of the unit provided. The example below returns "13:00", whereas, as I understand rounding, it should return "13:20". Is my understanding correct?

library(lubridate)
floor_date(ymd_hms("2020-10-10 13:22:00"), "40M")

vspinu commented 4 years ago

@zxqs You probably expect rounding to be done from the start of the day, but that's not the case. If you round with a minute unit, only the minute component is rounded. That is, 22 %/% 40 = 0, so the minutes are floored to 0.
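
To make the difference concrete, here is a small base R sketch of the arithmetic only. It does not call lubridate, and the variable names are made up for illustration. Flooring the minute component alone gives 0, whereas counting 40-minute buckets from midnight lands on 13:20.

```r
# Time of day from the example: 13:22:00
hour <- 13
minute <- 22
step <- 40  # bucket width in minutes

# Rounding only the minute component, as described above
floored_minute <- (minute %/% step) * step  # 0, i.e. 13:00

# Counting buckets from the start of the day instead
mins_since_midnight <- hour * 60 + minute               # 802
floored_total <- (mins_since_midnight %/% step) * step  # 800
from_midnight <- sprintf("%02d:%02d", floored_total %/% 60, floored_total %% 60)
```

The two results differ because 60 is not a multiple of 40, so buckets counted from midnight do not line up with hour boundaries.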

What you need would be implemented using `units = 0.6H`, but as of today fractional rounding is not implemented. It's rather complex, and the use case is not that clear.

If you really need this then I think the only option right now is to do it yourself like this:

> x <- ymd_hms("2020-01-01 13:22:00")
> x - as.numeric(local_time(x)) %% (40*60)
[1] "2020-01-01 13:20:00 UTC"

vspinu commented 4 years ago

A general solution might be to add a new argument to the rounding functions: origin_unit. By default, the origin unit would be the next higher unit of the rounding unit: minute -> hour, hour -> day, day -> month.
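
As a rough illustration of what such an argument could mean, the sketch below uses base R only. The helper `floor_from_origin()` and its arguments are hypothetical and are not part of lubridate; it assumes a fixed step in seconds and ignores DST transitions.

```r
# Hypothetical helper (not a lubridate function): floor `x` to multiples of
# `step_secs` seconds, counted from the start of the enclosing `origin_unit`.
# `origin_unit` takes the unit names accepted by base trunc(): "hours", "days", ...
floor_from_origin <- function(x, step_secs, origin_unit = "days") {
  origin <- as.POSIXct(trunc(x, units = origin_unit))
  elapsed <- as.numeric(difftime(x, origin, units = "secs"))
  origin + (elapsed %/% step_secs) * step_secs
}

x <- as.POSIXct("2020-10-10 13:22:00", tz = "UTC")

# Buckets start at midnight: 13:22 falls in the bucket starting at 13:20
from_day <- floor_from_origin(x, 40 * 60, origin_unit = "days")

# Buckets restart every hour: 13:22 falls in the bucket starting at 13:00
from_hour <- floor_from_origin(x, 40 * 60, origin_unit = "hours")
```

With an hourly origin the result matches the behaviour reported in this issue, and with a daily origin it matches what the OP expected.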

DavisVaughan commented 3 years ago

In clock, flooring with daily and sub-daily precisions works as the OP expected (I believe). It uses a default origin of 1970-01-01 in the time zone of the input and groups into rolling buckets of 40 minutes since that point. The origin is customizable.

library(clock)

# Start the counter from 1970-01-01 00:00:00,
# and buckets every group of 40 minutes after that
date_floor(
  date_time_parse("2020-10-10 13:22:00", "America/New_York"), 
  precision = "minute", 
  n = 40
)
#> [1] "2020-10-10 13:20:00 EDT"

origin <- date_time_parse("2020-10-10 13:00:00", "America/New_York")

date_floor(
  date_time_parse("2020-10-10 13:22:00", "America/New_York"), 
  precision = "minute", 
  n = 40,
  origin = origin
)
#> [1] "2020-10-10 13:00:00 EDT"

Created on 2021-05-25 by the reprex package (v1.0.0)