Closed trevorld closed 1 year ago
I'm not sure we should really use any of the current implementations as a good reference source. They all seem kind of hand-wavy, and they allow parsing of non-leap seconds too. Basically, it seems like they all have some simple special handling of "60s".
The main thing is that with POSIXct, leap seconds are completely ignored, full stop. See `?POSIXct`:
"POSIXct" times used by R do not include leap seconds on any platform.
So really it is just a matter of what to do during parsing.
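As a small illustration of what "completely ignored" means in practice, the sketch below measures the span of a day that ended with a real leap second (2016-12-31). It uses only base R; the variable names are just for illustration.

```r
# 2016-12-31 ended with a real leap second (23:59:60 UTC), so 86401 SI seconds
# elapsed between these two instants. POSIXct arithmetic does not count it.
before <- as.POSIXct("2016-12-31 00:00:00", tz = "UTC")
after  <- as.POSIXct("2017-01-01 00:00:00", tz = "UTC")

elapsed <- as.numeric(difftime(after, before, units = "secs"))
elapsed
#> [1] 86400
```

This is the same effect as differences between UTC times coming out in POSIX seconds rather than SI seconds.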
The "right" solution is to only allow 60s when parsing if you are actually on a leap second date. Then you need a way to store it, and you have to make a decision about what to do with it when converting to sys-time or naive-time (and from there, Date and POSIXct), which don't support leap seconds.
`<date>` includes a `utc_clock` class that can handle this, and it actually parses correctly by checking against the actual leap seconds to see if it corresponds to a real leap second or not:
https://github.com/HowardHinnant/date/blob/50acf3ffd8b09deeec6980be824f2ac54a50b095/include/date/tz.h#L2022-L2032
When going from `utc_clock` -> sys-time / naive-time, `<date>` maps leap seconds to the nearest possible moment in time before the leap second, which is reasonable.
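To make that idea concrete, here is a rough base R sketch of "only accept 60s on a real leap second, then map it to the moment just before". `parse_utc_strict()` is a hypothetical helper, not something clock, lubridate, or base R provides, and it is not how `<date>` is implemented. It assumes second precision (so "the moment before" becomes 23:59:59) and relies on base R's `.leap.seconds` table, whose contents depend on the R version installed.

```r
# Hypothetical helper: parse UTC "%Y-%m-%d %H:%M:%S" strings, accepting a
# seconds field of 60 only when it falls on a real leap second.
parse_utc_strict <- function(x) {
  lt <- strptime(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")

  # Converting to POSIXct rolls a ":60" over to the first second of the
  # next minute, which is also how base R stores entries in .leap.seconds.
  ct <- as.POSIXct(lt)

  has_60 <- !is.na(ct) & lt$sec >= 60
  real   <- as.numeric(ct) %in% as.numeric(.leap.seconds)

  # Not a real leap second: treat the input as invalid.
  ct[has_60 & !real] <- NA

  # Real leap second: map to the last whole second before it.
  ct[has_60 & real] <- ct[has_60 & real] - 1

  ct
}

parse_utc_strict("2016-12-31 23:59:60")  # real leap second, clamped to 23:59:59
parse_utc_strict("2006-12-31 23:59:60")  # not a leap second, NA
parse_utc_strict("2006-12-31 23:59:59")  # ordinary time, unchanged
```

Clamping to 23:59:59 makes the leap second indistinguishable from the second before it once converted, which is the unavoidable cost of storing it in a type that has no leap seconds.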
I may include this in the future, but leap seconds are a little complicated because they are included in the text form of the time zone database (which clock uses now) but not in the binary form of the time zone database on Mac (which we may switch to in the future for performance). So I'd have to come up with a way to deal with that.
For now I will add some docs about this to the FAQ, as you suggest.
```r
# Note that 2006 did not end with a leap second, but parsing "sort of works" anyway
format <- "%Y-%m-%d %H:%M:%S"

# POSIXlt allows 60s, but that rolls over when converting to POSIXct
lubridate::fast_strptime("2006-12-31 23:59:60", format)
#> [1] "2006-12-31 23:59:60 UTC"
lubridate::fast_strptime("2006-12-31 23:59:60", format, lt = FALSE)
#> [1] "2007-01-01 UTC"

# Can't represent 61s in POSIXlt, so lubridate rolls over even in the POSIXlt
lubridate::fast_strptime("2006-12-31 23:59:61", format)
#> [1] "2007-01-01 00:00:01 UTC"
lubridate::fast_strptime("2006-12-31 23:59:61", format, lt = FALSE)
#> [1] "2007-01-01 00:00:01 UTC"

# But it thinks this is garbage?
lubridate::fast_strptime("2006-12-31 23:59:62", format)
#> [1] NA
lubridate::fast_strptime("2006-12-31 23:59:62", format, lt = FALSE)
#> [1] NA

# POSIXlt allows 60s, rolls over when converting to POSIXct
strptime("2006-12-31 23:59:60", format, tz = "UTC")
#> [1] "2006-12-31 23:59:60 UTC"
as.POSIXct(strptime("2006-12-31 23:59:60", format, tz = "UTC"))
#> [1] "2007-01-01 UTC"

# POSIXlt can't handle 61s, so base R says this is NA
strptime("2006-12-31 23:59:61", format, tz = "UTC")
#> [1] NA
as.POSIXct(strptime("2006-12-31 23:59:61", format, tz = "UTC"))
#> [1] NA

# nanotime rolls over for 60s, errors on 61s
nanotime::as.nanotime("2006-12-31T23:59:60Z")
#> [1] 2007-01-01T00:00:00+00:00
try(nanotime::as.nanotime("2006-12-31T23:59:61Z"))
#> Error in RcppCCTZ::parseDouble(x, fmt = format, tzstr = tz) :
#> Parse error on 2006-12-31T23:59:61Z
```
Created on 2023-04-21 with reprex v2.0.2.9000
Perhaps it would make sense to document how `{clock}` handles leap seconds? Perhaps in the FAQ article?
I'm observing that on my computer `{clock}` parses leap seconds as `NA` values (and issues a warning) and that differences between UTC times are in POSIX seconds (instead of SI/metric seconds).