Open lindig opened 4 years ago
The timestamp 1581667002
refers to Friday, 14 February 2020 07:56:42 GMT+00:00
. So it appears to me that the unparsing (printing) is correct but that the parsing of 2020-02-14T06:56:42Z
= 1581667002
is not correct.
Is that on master? I think on master there's some changes to make it correct in #5 which has been rebased, but it caused a (small) regression in toml. Could you try on #5?
This was not on master but on 0.2.6
. I'll try on master.
I believe this happens on master as well. Running the tests on master (on macOS) I also see a test failure:
:ISO8601.ml $ make test
Done: 48/50 (jobs: 1).........................................................F...
==============================================================================
Failure: ISO8601:2:[UTILS]:1:[UTILS mkdate]:4:583804800.
expected: 583804800. but got: 583808400.
------------------------------------------------------------------------------
Ran: 61 tests in: 0.04 seconds.
FAILED: Cases: 61 Tried: 61 Errors: 0 Failures: 1 Skip: 0 Todo: 0 Timeouts: 0.
ISO8601_TEST alias tests/runtest (exit 1)
I tried to use the branch from https://github.com/ocaml-community/ISO8601.ml/pull/5 but it is quite out of date (using ocamlbuild
). So far did not try it.
Ah my bad, what about wip-merge-pr-5
(which is the rebase of #5)?
On branch wip-merge-pr-5
the issue no longer exists. Running make test
I see some test failures.
utop # ISO8601.Permissive.datetime_tz "2020-02-14T06:56:42Z" |> fst |> ISO8601.Permissive.string_of_datetime;;
- : string = "2020-02-14T06:56:42"
Interesting, I don't see a test error on my end. I'd like to merge this at some point but am not confident enough in its correctness. Frankly, help would be very appreciated.
Tests only work for me on branch wip-merge-pr-5
when adding --profile=release
to dune runtest
This looks suspicious (on master):
src/ISO8601_lexer.mll: let offset = fst (Unix.mktime (Unix.gmtime 0. ))
This computes an offset
from the date of the epoch and applies it. For me this is
utop #
fst (Unix.mktime (Unix.gmtime 0.));;
- : float = -3600.
I don't understand what it tries to do.
well, me neither. I think it's different on #5 ?
What's the timezone of your system set to?
val mktime : tm -> float * tm
Convert a date and time, specified by the
tm
argument, into a time in seconds, as returned byUnix.time
. Thetm_isdst
,tm_wday
andtm_yday
fields oftm
are ignored. Also return a normalized copy of the giventm
record, with thetm_wday
,tm_yday
, andtm_isdst
fields recomputed from the other fields, and the other fields normalized (so that, e.g., 40 October is changed into 9 November). Thetm
argument is interpreted in the local time zone.
: tmp $ timedatectl
Local time: Thu 2020-02-20 17:02:47 GMT
Universal time: Thu 2020-02-20 17:02:47 UTC
RTC time: Thu 2020-02-20 17:02:47
Time zone: Europe/London (GMT, +0000)
System clock synchronized: yes
NTP service: inactive
RTC in local TZ: no
the Stdlib does a system call, don't see anything egregious at first sight.
mktime
is interpreted in local timezone, so this is meant to compute the offset between local timezone and UTC, and suppress this offset from the given date in order to get the UTC time.
On my computer, I am at GMT+1, do I have to add 3600
seconds to get the correct UTC time from mktime
.
$ timedatectl
Local time: Fri 2020-02-21 09:32:22 CET
Universal time: Fri 2020-02-21 08:32:22 UTC
RTC time: Fri 2020-02-21 08:32:22
Time zone: Europe/Paris (CET, +0100)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
utop # fst (Unix.mktime (Unix.gmtime 0.));;
- : float = -3600.
EDIT: In #5, I see that it has been removed. It would be nice to test the library in different time zones. It might give good results in UK but false ones in other places.
https://github.com/ocaml-community/ISO8601.ml/pull/5/files#diff-4c4ab45718b4810726f53bea3f21b0a9R6
As you will have noticed, I'm in GMT/UTC and still get the same -3600
offset.
As you will have noticed, I'm in GMT/UTC and still get the same
-3600
offset.
Yes indeed, but according to what documentation says, you should not. Unless I misunderstood the documentation.
I had asked @undu and he sees it on his system (also in GMT/UTC) as well. And given that this is coming from a system call, I am reluctant to believe that this is a bug in Unix.gmtime
.
A similar issue seems to appear during the tests of toml.5.0.0
(which uses this library) on opam-repository's CI.
- expected: title = "TOML Example"
[...]
- dob = 1979-05-27T08:32:00+00:00
[...]
-
- but got: title = "TOML Example"
[...]
- dob = 1979-05-27T07:32:00+00:00
[...]
From what I understand of the discussion it seems to be a known issue so I'm just going to disable the tests for this package.
Heard about this issue from Discord and made some comments over there, thought I'd copy the gist of my observation over here.
My speculation is that line 6 let offset = fst (Unix.mktime (Unix.gmtime 0. )) in
in ISO8601_lexer.mll
fails to take into account whether the local time zone was observing DST at the Unix epoch time, thus resulting in off-by-one-hour "base" offset calculation. This would seem to match @lindig's experience with Europe/London
time zone, which observes DST for the entirety of 1970.
A straightforward fix would be to tune said offset calculation according to whether DST was being observed at the time of Unix epoch. But this would still assume the base offset remains unchanged over the years, which might hold for contemporary periods, but I'll need to verify.
EDIT: I haven't read any of the PR's in detail, so I'm not sure if there's already a fix somewhere.
My above observation as it turns out was somewhat getting close, but not entirely correct.
I made some code to test if any of the time zones observe changed "base" offset, and a lot of them failed, which includes Europe/London
:
let l =
let open Timere in
let unix_epoch = 0L in
(* Time_zone.available_time_zones *)
[ "Europe/London" ]
|> List.map (fun name ->
let tz = Time_zone.make_exn name in
(name,
Time_zone.Raw.to_transitions tz
|> List.filter (fun ((start, end_exc), _entry) ->
unix_epoch < end_exc
)
|> List.map (fun ((start, end_exc), Time_zone.{ is_dst; offset }) ->
(* Printf.printf "start: %Ld, end_exc: %Ld, is_dst: %b, offset: %d\n" start end_exc is_dst offset; *)
offset - if is_dst then 3600 else 0
)
|> List.sort_uniq compare
)
)
|> List.filter (fun (name, offsets) -> List.length offsets > 1)
gives
[("Europe/London", [0; 3600])]
indicating two unique "base" offsets.
In other words, a "base" offset does not seem to be necessarily well defined even if we only start at Unix epoch (1970 Jan 1 00:00:00 UTC). Examining the incongruence further, we can indeed observe that Europe/London
"breaks" the link between isdst
, base offset, and actual offset during the period around 1970 (not always the case that actual offset = base offset + if isdst then 3600 else 0
):
via zdump command (lines of interest are marked with <== on the right hand side)
$ zdump -V -c 1968,1974 Europe/London
Europe/London Sun Feb 18 01:59:59 1968 UT = Sun Feb 18 01:59:59 1968 GMT isdst=0 gmtoff=0
Europe/London Sun Feb 18 02:00:00 1968 UT = Sun Feb 18 03:00:00 1968 BST isdst=1 gmtoff=3600
Europe/London Sat Oct 26 22:59:59 1968 UT = Sat Oct 26 23:59:59 1968 BST isdst=1 gmtoff=3600
Europe/London Sat Oct 26 23:00:00 1968 UT = Sun Oct 27 00:00:00 1968 BST isdst=0 gmtoff=3600 <==
Europe/London Sun Oct 31 01:59:59 1971 UT = Sun Oct 31 02:59:59 1971 BST isdst=0 gmtoff=3600 <==
Europe/London Sun Oct 31 02:00:00 1971 UT = Sun Oct 31 02:00:00 1971 GMT isdst=0 gmtoff=0
Europe/London Sun Mar 19 01:59:59 1972 UT = Sun Mar 19 01:59:59 1972 GMT isdst=0 gmtoff=0
Europe/London Sun Mar 19 02:00:00 1972 UT = Sun Mar 19 03:00:00 1972 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 29 01:59:59 1972 UT = Sun Oct 29 02:59:59 1972 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 29 02:00:00 1972 UT = Sun Oct 29 02:00:00 1972 GMT isdst=0 gmtoff=0
Europe/London Sun Mar 18 01:59:59 1973 UT = Sun Mar 18 01:59:59 1973 GMT isdst=0 gmtoff=0
Europe/London Sun Mar 18 02:00:00 1973 UT = Sun Mar 18 03:00:00 1973 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 28 01:59:59 1973 UT = Sun Oct 28 02:59:59 1973 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 28 02:00:00 1973 UT = Sun Oct 28 02:00:00 1973 GMT isdst=0 gmtoff=0
via TZ=Europe/London utop
utop # Unix.mktime (Unix.gmtime 0.);;
- : float * Unix.tm =
(-3600.,
{Unix.tm_sec = 0; tm_min = 0; tm_hour = 0; tm_mday = 1; tm_mon = 0;
tm_year = 70; tm_wday = 4; tm_yday = 0; tm_isdst = false})
So at least for Europe/London
specifically, checking for isdst
in base offset calculation would not have fixed the issue.
Namely we have a case where sampling at Unix epoch does not yield a useful base offset.
I'll add that my initial observation was based on information from https://www.timeanddate.com/time/change/uk/london where it reads "DST observed all year" for 1969 and 1970, but this doesn't seem to line up properly with the isdst
flag (unless I'm misinterpeting the outputs attached above).
A small update:
I've patched ISO8601 to use Timere on my fork at try-timere
branch https://github.com/darrenldl/ISO8601.ml/tree/try-timere ,
and it passes all 57 tests.
Timere does bring in a lot more complexity and dependencies, so I am not entirely sure if pushing toward using Timere under the hood is necessarily a good option for all users, but it doesn't seem to be a bad tradeoff if having difficult to resolve bugs is the only other option.
The actual Unix mktime
call takes 3 possible values for the tm_isdst
field: 0, 1, or -1. Looking at the OCaml bindings, the bool
value from Unix.mktime
is ignored (but at some time in the past wasn't):
tm.tm_isdst = -1; /* tm.tm_isdst = Bool_val(Field(t, 8)); */
This causes the Unix implementation to guess DST, which is almost always possible except for the hour when DST comes in or out of effect.
I believe we should not guess any offset but just accept the result of Unix.mktime
as the correct timestamp for the argument. Note that Unix.mktime
returns an updated tm
structure in addition to the timestamp.
I suspect there is some misconception about the Unix epoch at play here: On a Linux system the timestamp for Jan 01 1970 is not zero, as we (and the unit tests) expect:
lindig@pi:~ $ cat /etc/timezone
Europe/London
lindig@pi:~ $ date -d 'Jan 01 1970 00:00:00' +%s
-3600
This agrees with what I see on macOS as well. The timestamp at the epoch is not zero.
The failure of the unit test can be easily reproduced (on macOS for me, on two systems):
TZ=Europe/London make test
There is no unit test failure for TZ=US/Eastern
, TZ=US/Central
, or TZ=Europe/Berlin
. Is Europe/London
failing universally or could this be a macOS problem?
Looking at @darrenldl's comment I agree that looking at the UTC offset at timestamp 0 is not helpful because it is wrong for London and this is coming from the underlying time zone information in the OS and not the OCaml implementation.
Europe/London Sat Oct 26 23:00:00 1968 UT = Sun Oct 27 00:00:00 1968 BST isdst=0 gmtoff=3600
Europe/London Sun Oct 31 01:59:59 1971 UT = Sun Oct 31 02:59:59 1971 BST isdst=0 gmtoff=3600
Europe/London Sun Oct 31 02:00:00 1971 UT = Sun Oct 31 02:00:00 1971 GMT isdst=0 gmtoff=0
I assume I'm doing something wrong here but I am baffled by this behaviour:
The time printed is shifted by one hour from the time that was parsed. How would I prevent that?