UTC seems to be treated as canonical; hazardous in the presence of leap seconds

eternaleye commented 9 years ago

Chrono seems to treat UTC as the canonical/internal time representation, but this is hazardous because UTC includes leap seconds.

In your Jan 13 rustlog you bring up four issues regarding time zones:

There is no reliable way to handle the local date in the future.
There can exist a local date which occurred in two or more instants.
There can exist a local date which never has been occurred.
The conversion process itself is seriously annoying and easy to make a mistake.

The problem with UTC is that all four of those are true of it as well:

Future leap seconds can be declared at any time, invalidating calculations of future datetimes
Leap seconds as they have occurred so far result in a second that happens twice
Leap seconds can (and eventually will) go in the other direction, resulting in seconds that never happen
Leap second handling is oh so painfully annoying and easy to get wrong

I would suggest instead using TAI, which is the basis of UTC. (UTC is TAI with leap seconds)

lifthrasiir commented 9 years ago

UTC is treated as canonical in Chrono since it is treated as canonical in most OSes. I cannot come up with a single example of using TAI as canonical, other than very experimental Unununium. If you know a reliable OS API to get leap seconds, please let me know.

I should also note that some of your points are not very relevant to Chrono. In particular, an ambiguity of local date is concerning solely because it makes a human-readable date and time ambiguous contrary to most people's expectation (and they don't use UTC/TAI directly anyway). In the human-readable form there is no ambiguity with positive leap seconds (2015-06-30T23:59:60Z, 2015-07-01T08:59:60+09:00); it is just the tv_sec field which gets this wrong.

See also #22.

eternaleye commented 9 years ago

On recent Linux, there's a CLOCK_TAI clockid to get TAI time directly, but overall obtaining leap seconds doesn't require an OS API - tzdata contains a database of leap seconds, and while that is generally installed on Linux systems, it can also be trivially brought in at build time.

EDIT: Incidentally, that's the approach Haskell takes - a Cabal package that combines an API to deal with them and the bundled data itself.

lifthrasiir commented 9 years ago

@eternaleye I didn't know that CLOCK_TAI is now natively supported! (Relevant Kernel commit.) Still, it is a very new thing (supported since 3.10?).

Depending on tzdata is of course a practical choice, and it's #10 and #23. Still, one cannot guarantee if the up-to-date tzdata is available to the target system, and the build-time tzdata is prone to be obsoleted (but better than nothing IMO).

Also, as far as I know, Windows does not support the full table of leap seconds so far. (It does have an imminent leap second status however.) That means we cannot do the cross-platform TAI handling without up-to-date external tables, and it sounds very risky to make it canonical.

Ekleog commented 1 year ago

Now that quite some time has passed, and TAI support should hopefully be quite present, has anyone looked into at least integrating TAI support in chrono? It'd probably simplify quite a bit issues like #944 . It could end up being a quite significant refactor that said, especially if leap seconds were to be moved out to chrono-tz because they're fundamentally the same as timezone shifts.

Also, with chrono 0.5 in the works, I'm thinking now may be a good time to make the basis of all time be TAI in chrono, as it'd mean users would no longer have to care about leap seconds in any way so long as they're handling TAI timepoints.

WDYT?

djc commented 1 year ago

@Ekleog it sounds interesting at least; If you want to pursue it I'd be happy to review. At this point I'm not sure I understand the issues well enough that I'd be in favor of making TAI the default in chrono, but adding it as an option sounds great.

scottlamb commented 1 year ago

I'd love for my applications to use TAI, but IIUC on most e.g. Linux systems, the system clock is in UTC (maybe with leap seconds, maybe with smears), and I don't think there's any reliable source of translation tables. So how would this work? I understand that after 2035 there aren't going to be any more leap seconds and this will be basically a historical problem, but until then, I'm not sure how using TAI would work...

Ekleog commented 1 year ago

So the way I see it, the idea would be to basically copy the translation tables from the hifitime crate; and use them to convert between UTC and TAI whenever needed. I don't think chrono needs to support the more advanced features of this crate like support for other leap seconds basis, as chrono would just need a no-leap-second source-of-truth for timepoints.

Also, at least on Linux, there's supposed to be a CLOCK_TAI parameter that should allow taking Tai::now() without having to do the (currently) log2(28) timestamp comparisons that are required to convert between TAI and UTC.

The leap second database would be handled similarly to the tzdata for chrono-tz, though maybe manual handling might be enough given there are currently literally very few such seconds?

@djc No promises, but I can try to find time to implement this and submit it for chrono 0.5 :) The API as I see it would be a TimePoint struct, that counts the number of… probably doing like hifitime and taking a u64 nanoseconds and a u32 century offset from 1900 would be best?

And then once that's done, there would be two ways forward for actually integrating with chrono:

Conversion between DateTime<Tz> and TimePoint would handle leap seconds, and panic if starting from an invalid DateTime. The main question in this case is, what should DateTime<Tz> + TimeDelta(1 second) do on a leap second boundary? My thinking is it should return the time within the leap second, but then it means that we'd need to round-trip to TimePoint each time.
The backend of DateTime<Tz> would actually become TimePoint, and so DateTime would become unable to represent datetimes with an invalid leap second. This would be my preferred choice (more invariants yay), but it does mean that we'd need to do the UTC <-> TAI conversion on Utc::now() for platforms that don't support CLOCK_TAI (not linux, I haven't checked windows or darwin). OTOH it makes all the processing of TimeDelta much simpler, because now we only need to operate on the underlying TimePoint in order to get the intuitive behavior.

What do you think?

djc commented 1 year ago

I'm thinking that it seems like a good direction in theory but it's also a pretty large change and I'm not completely sure that I have the time/energy available to digest it. Platform support is definitely an issue, we currently support Windows and a whole bunch of less popular platforms (Android, iOS, WebAssembly, Illumos, Haiku...) and I'm not sure what kind of performance regression would be acceptable. So if you were to start working on this I definitely think it should be an optional/added mode of operation, if only to relieve me from worrying about backwards/platform compatibility.

@esheppa thoughts?

Ekleog commented 1 year ago

About making it optional

So if the goal is to go through with the second option among my two bullet points above, then having this be an optional mode of operation would most likely be quite hard, as the internals of DateTime would need to completely change, to either allow or disallow "wrong" leap seconds. Backwards compatibility would obviously break (because DateTime would no longer be able to represent invalid leap seconds), but that'd be part of the goal.

About performance

As for the performance impact on any platforms that don't support CLOCK_TAI-like time fetching (which can be all of them as far as I'm concerned, we don't even need to implement CLOCK_TAI support to get proper TAI support in chrono IMO), the worst case by 2135 is having to deal with 28 (past) + 13 (potential future until 2035) = 41 leap seconds. This, with the naivest implementation, can be done with 7 u64 comparisons after each gettime syscall. My guess would be this would be completely acceptable performance changes.

And actually, if we allow ourselves to assume that most likely the system time will be set properly, we can reduce Utc::now() -> Tai::now() to 1 u64 comparison 90%-of-the-time, 2 u64 comparisons 9%-of-the-time, and 9 u64 comparisons in the 1%-of-the-time case where the system time is ill-set far in the past.

(On the other hand, if we go through with the first option I listed above, which I like much less, then we don't get the backwards compatibility hazard, but we have to eat the the full 2-u16-comparisons-and-7-u64-comparisons every time we try to use TimeDelta)

Current "Utc" is actually already Tai?

EDIT AFTER CHECKING THE CODE: It turns out Utc::now() currently should already return TAI, assuming that SystemTime::duration_since behaves correctly in the presence of leap seconds. My guess is it probably doesn't, but this is probably a bug I can take upstream to libstd (see https://github.com/rust-lang/rust/issues/77994); if the bug is accepted then the purpose of this refactor would mostly be making naming work properly and actually giving out Utc and not Tai when the user asks for Utc.

But seeing that code makes me wonder: Does chrono ever build a DateTime with a leap second, or is all the current infrastructure (>1B subsec_nanos) code paths that are never exercised in practice because most likely no developer will ever manually create a leap-second datetime manually? In which case, an alternative to the current TAI proposal, would be to just remove leap second support from chrono for 0.5, as AFAICT it does not bring any actual benefit?

About the one actual concern I think there should be

So the actual concern I actually have is not about performance (because as per the above I think it's a no-op) or platform support (because we can just take 2 u64 comparisons after a syscall and not even use CLOCK_TAI on linux almost all the time).

However, it is about API. Basically, the question is, what should happen if the user knows there is a leap second that chrono does not know about, and wants to create a DateTime at that time point? If we prevent creation of a DateTime with wrong leap seconds (which helps quite a lot in making sure that a datetime plus one second is actually a single easily-computable value), then it means that chrono must know of all the leap seconds that the user wants to use.

My first thought on this problem is, "this is probably not actually a big deal". The end-user probably doesn't care or want to care about leap seconds anyway, so leap seconds far into the future can probably be assumed not to actually exist.

However, some users may care. So for these users the chrono API could introduce a function to set a program-wide "leap seconds list" that would allow the end-user to provide a file with leap seconds if the binary they're using has been built for that support.

This sounds like lots of additional complexity. But I'd like to remind you that this does not need to be implemented in order for the leap second handling of chrono to already be better than right now. Basically, as of today, AFAICT chrono completely ignores leap seconds in everything but their representation. Which means that even though I'm actually being concerned by "what about future leap seconds chrono won't know about"… well chrono wouldn't know about them already anyway.

The worst that could happen would be that a TAI timestamp is serialized, then the program is rebuilt with an updated chrono that actually knows of one more leap second before the TAI timestamp, and now the utc datetime becomes offset by 1 second compared to when it was serialized. But… is it actually a problem? The alternative is issues like https://github.com/chronotope/chrono/issues/944 and the fact that eg. using chrono to sleep for .5 seconds may actually sleep for 1.5 seconds even on known leap seconds.

Anyway, sorry for the long block of text, but this I think says all I've been thinking about refactoring chrono to be based upon TAI :)

scottlamb commented 1 year ago

Also, at least on Linux, there's supposed to be a CLOCK_TAI parameter that should allow taking Tai::now() without having to do the (currently) log2(28) timestamp comparisons that are required to convert between TAI and UTC.

Not sure it's useful. IIRC on bootup, the difference between TAI and UTC is defined to 0, which of course is wrong. ntpd/chrony/whatever may or may not set the difference properly when the system boot gets that far. And I don't think it's updated during the smear. I don't think there's any reliable way to detect if the current system is using a smeared clock. At least NTP servers tell you this but I'm not sure we can assume localhost is running a queryable NTP server, even if it's using an NTP client to set the clock.

The leap second database would be handled similarly to the tzdata for chrono-tz, though maybe manual handling might be enough given there are currently literally very few such seconds?

I think we more or less can rely on the system having up-to-date tzdata (it's packaged and released regularly) but IIRC on e.g. Debian those files don't actually include leap tables (even though the format defines a place for them). Hardcoding the table into the binary is possible but means something new: the accuracy may rot if you go for more than six months without a release. Maybe that's acceptable (who knows if there will ever even be any more leap seconds) but at least is worth calling out.

So if you were to start working on this I definitely think it should be an optional/added mode of operation, if only to relieve me from worrying about backwards/platform compatibility.

What would optional mean? I suppose something like extra methods like .get_duration_including_leap_seconds() would be possible, but then the "correct" calculation would be really awkward to spell. I wouldn't want the same code to compile and produce different results based on a feature flag or something, and I imagine that's not what you meant.

Ekleog commented 1 year ago

I've just posted an issue that lists all the options I can think of for handling leap seconds, hopefully it will make the discussion easier :)

https://github.com/chronotope/chrono/issues/954

I haven't mentioned the option of taking one idea and putting it behind a feature flag, as I'm in line with @scottlamb on this question and don't really know what that would mean ^^'

chronotope / chrono