Closed muschellij2 closed 4 years ago
Thanks for the report. Looks like a bug. I will investigate.
For floor and ceiling on dates you should use floor_date and ceiling_date. Those are fairly well tested by now.
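As a minimal sketch of that suggestion, the snippet below rounds the timestamp from this report down and up to whole seconds. It assumes lubridate is installed and pins the time zone to "UTC" explicitly (an assumption made here so the result does not depend on the system setting).

```r
library(lubridate)

# The timestamp from the report, with an explicit time zone attached.
dt <- as.POSIXct(1435932005.80171, origin = "1970-01-01", tz = "UTC")

# Round down and up to the nearest whole second.
dt_floor   <- floor_date(dt, unit = "second")
dt_ceiling <- ceiling_date(dt, unit = "second")

format(dt_floor, "%Y-%m-%d %H:%M:%S")    # "2015-07-03 14:00:05"
format(dt_ceiling, "%Y-%m-%d %H:%M:%S")  # "2015-07-03 14:00:06"
```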
I cannot reproduce this.
> (dt <- .POSIXct(1435932005.80171))
[1] "2015-07-03 16:00:05 CEST"
> second(dt) <- floor(second(dt))
> dt
[1] "2015-07-03 16:00:05 CEST"
What version of lubridate is this and what OS?
Why did you report Sys.timezone() in the first place? It doesn't seem relevant. Or, if you set Sys.setenv(TZ = "America/New_York"), does the problem go away?
Sys.timezone() showed that TZ was not set and that none of the other sources (such as /etc/timezone) were setting the time zone anywhere. This was on one of our RedHat (2.6.32-696.18.7.el6.x86_64) servers that we were working on.
If TZ is set to a valid time zone, then the problem goes away. I can't reproduce the error on the cluster machine, as we fixed the links to those files.
If TZ is set to something off, then things do break down a bit:
Sys.setenv("TZ" = "not_valid_timezone")
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
xdt = dt = structure(
c(
1435932005.75171,
1435932005.76171,
1435932005.77171,
1435932005.78171,
1435932005.79171,
1435932005.80171
),
class = c("POSIXct",
"POSIXt"),
tzone = ""
)
dt
#> Warning in as.POSIXlt.POSIXct(x, tz): unknown timezone 'not_valid_timezone'
#> [1] "2015-07-03 14:00:05 GMT" "2015-07-03 14:00:05 GMT"
#> [3] "2015-07-03 14:00:05 GMT" "2015-07-03 14:00:05 GMT"
#> [5] "2015-07-03 14:00:05 GMT" "2015-07-03 14:00:05 GMT"
class(dt)
#> [1] "POSIXct" "POSIXt"
tz(dt)
#> [1] "not_valid_timezone"
second(dt)
#> [1] 5.75171 5.76171 5.77171 5.78171 5.79171 5.80171
class(second(dt))
#> [1] "numeric"
second(dt) = floor(second(dt))
#> Error in (function (dt, year, month, yday, mday, wday, hour, minute, second, : Invalid timezone of input vector: "not_valid_timezone"
But again - this is a different issue.
I guess the RedHat machine uses something older than lubridate v1.7.0, right? If so, those things don't apply any longer, because the underlying (update) functionality was revamped in 1.7.0 and is now based on the CCTZ library.
The second error is kind of expected because tz="" is an alias for the system time zone, which is, in this case, invalid.
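One way to catch this situation early is to check the TZ environment variable against the zone names R knows about before touching any date-time setters. The helper below is a hypothetical sketch (tz_env_ok is not part of base R or lubridate); it uses only base R's OlsonNames() and treats an unset TZ as acceptable, which may not hold if the OS-level setting is itself broken.

```r
# Hypothetical helper: TRUE when `tz` is empty (unset) or a known Olson name.
tz_env_ok <- function(tz = Sys.getenv("TZ")) {
  tz == "" || tz %in% OlsonNames()
}

tz_env_ok("UTC")                 # a standard zone name
tz_env_ok("not_valid_timezone")  # the invalid value used above
```

Calling tz_env_ok() with no argument checks the current session's TZ value.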
I've encountered a similar issue (lubridate v1.7.1) in a docker container (rocker/shiny, but with modified timezone settings). On-the-fly timezone mapping in docker containers is not so trivial (set $TZ, check and map /etc/localtime, /etc/timezone). We ended up with a somewhat inconsistent timezone:
Sys.time()
[1] "2018-01-30 11:11:21 CET"
now()
[1] "2018-01-30 11:11:27 CET"
Sys.timezone()
[1] "UTC"
now() - minutes(1)
[1] "2018-01-30 12:11:07 CET"
The point is that this one-hour shift comes quite unexpectedly. I would not expect any timezone conversion when subtracting/adding fixed times.
Tested in RStudio and Shiny Server, which do not import $TZ. Instead of checking /etc/timezone (which contains the intended timezone string), the symlink /etc/localtime pointing to /usr/share/zoneinfo/Etc/UTC (with replaced contents) was analyzed in Sys.timezone().
The point is that this one-hour shift comes quite unexpectedly. I would not expect any timezone conversion when subtracting/adding fixed times.
It looks like a bug which probably has to do with the conflicting timezone settings. Why are your timestamps printed in CET if you have Sys.timezone() returning "UTC"?
What is the value of Sys.getenv("TZ")?
> unclass(now())
[1] 1517314542
attr(,"tzone")
[1] ""
Internal lubridate code assumes system timezone when it sees "".
In order to have a better understanding of what's going on, could you please provide the output of the following:
Sys.timezone()
Sys.getenv("TZ")
tt <- now()
unclass(tt)
tz(tt)
(tt2 <- lubridate:::update_date_time(tt, seconds = 5))
tz(tt2)
unclass(tt2)
tt3 <- tt
second(tt3) <- 5
tt3
unclass(tt3)
tt4 <- tt
minute(tt4) <- 5
tt4
unclass(tt4)
tt5 <- (tt - minutes(1))
unclass(tt5)
> now()
[1] "2018-01-30 14:38:43 CET"
>
> Sys.timezone()
[1] "UTC"
> Sys.getenv("TZ")
[1] ""
> tt <- now()
> unclass(tt)
[1] 1517319524
attr(,"tzone")
[1] ""
> tz(tt)
[1] ""
> (tt2 <- lubridate:::update_date_time(tt, seconds = 5))
[1] "2018-01-30 14:38:05 CET"
> tz(tt2)
[1] ""
> unclass(tt2)
[1] 1517319485
> tt3 <- tt
> second(tt3) <- 5
> tt3
[1] "2018-01-30 15:38:05 CET"
> unclass(tt3)
[1] 1517323085
attr(,"tzone")
[1] ""
> tt4 <- tt
> minute(tt4) <- 5
> tt4
[1] "2018-01-30 15:05:43 CET"
> unclass(tt4)
[1] 1517321144
attr(,"tzone")
[1] ""
> tt5 <- (tt - minutes(1))
> unclass(tt5)
[1] 1517323064
attr(,"tzone")
[1] ""
The CET comes from an internal conversion to POSIXlt:
> unclass(now())
[1] 1517321219
attr(,"tzone")
[1] ""
> unclass(.Internal(as.POSIXlt(now(), "")))
$sec
[1] 59.49045
$min
[1] 6
$hour
[1] 15
$mday
[1] 30
$mon
[1] 0
$year
[1] 118
$wday
[1] 2
$yday
[1] 29
$isdst
[1] 0
$zone
[1] "CET"
$gmtoff
[1] 3600
attr(,"tzone")
[1] "" "CET" "CEST"
Most likely, it's using the contents of /etc/timezone or /etc/localtime (not the path returned by readlink -f).
Yeah. Looks like a bug in R. The docs for as.POSIXlt state clearly that "" is the current time zone, and the docs for Sys.timezone say ‘Sys.timezone’ returns the name of the current time zone. So these two should coincide, but they don't on your system. Would you mind reporting it to the R folks?
For now, it looks like Sys.setenv(TZ="UTC") would fix the problem. I am not sure I can do much on the lubridate side for now, but I plan to drop the reliance on as.POSIXlt in the near future.
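A minimal sketch of that workaround, assuming lubridate is installed: set TZ for the session and, as an extra precaution that is an assumption here rather than something suggested above, pass an explicit zone to now() so the result does not carry an empty tzone attribute.

```r
library(lubridate)

# Pin the session time zone so "" and the system zone cannot disagree.
Sys.setenv(TZ = "UTC")

# Ask for the current time with an explicit zone instead of relying on "".
tt <- now(tzone = "UTC")

# Subtracting one minute should move the underlying epoch value by 60 seconds.
delta <- as.numeric(tt) - as.numeric(tt - minutes(1))
delta
```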
BTW, there have been recent changes in R-devel regarding TZ settings and caching. It may be that the issue is already resolved.
If the TZ environment variable is set when date-time functions are first used, it is recorded as the session default and so will be used rather than the default deduced from the OS if TZ is subsequently unset.
Sys.timezone() on a Unix-alike caches the value at first use in a session: inter alia this means that setting TZ later in the session affects only the current time zone and not the system one. Sys.timezone() is now used to find the system timezone to pass to the code used when R is configured with --with-internal-tzcode. ... Sys.timezone() tries more heuristics on Unix-alikes and so is more likely to succeed (especially on Linux). For the slowest method, a warning is given recommending that TZ is set to avoid the search.
Ok, thanks for checking
Actually, your issue pops up on R-devel by "design". Sys.timezone has been changed to return the cached system timezone no matter what.
Timezone of the machine has no automatic timezone
Using Sys.timezone, we see that this is NA on one of the machines we work on:
Example data with empty timezone
Here we have some date/times with no timezone initialized:
Here we reset the seconds to round it down:
Now, the dt object still has an empty timezone:
but the dt object has the hour changed:
Is this expected behavior?
Timezone set, all is good
When a timezone is set, the hours do not change when the seconds are changed.