BurntSushi opened 7 months ago
At time of writing, I do have leap second support working. Of note:

- `civil::Time` can now store a second value of `60`. I think this is pretty much required because I anticipate datetime parsing/formatting to go through civil times. And it is just in general sensible to make conversion from a `DateTime::constant(2016, 12, 31, 23, 59, 60, 0)` to a `Timestamp<Tai>`/`Zoned<Tai>` do the "right" thing.
- The leap second in a `civil::Time` is generally ignored/dropped as soon as you start doing anything with the `Time` value. For example, adding `1` to `23:59:59` doesn't give you `23:59:60`. It just gives you `00:00:00`. And similarly, subtracting `1` from `23:59:60` gives you `23:59:59`. That means arithmetic is not associative, which is annoying. But I don't really see another way?
- I added a `TimeScale` abstraction and made `Timestamp<Unix>` the default. That means leap seconds are ignored by default. You have to opt into them with `Timestamp<Tai>`. I do overall think this is an unfortunate complication that bloats the API. Because you want things like `Timestamp::now` to give you a concrete value without relying on return type polymorphism/inference, and so it ends up returning `Timestamp<Unix>`. But then you need to also offer, e.g., `Timestamp::now_with_scale`. And this sorta repeats itself on various other APIs.
- Span arithmetic that accounts for leap seconds can produce results like `1 year 1 second`. We can collect user feedback on this behavior once Jiff is released, but I kinda expect the response is going to be "just use Unix time."

I am now contemplating making a completely distinct `TaiTimestamp` type (naming TBD) that one must explicitly convert to-and-from via a `Timestamp`. And then removing the type parameter on `Timestamp` and `Zoned`. And moreover, not adding a `TaiZoned` type. This would pollute the top-level API namespace with an additional type, but it would also simplify the API interactions between `Timestamp`, `Zoned` and `civil::DateTime`.
I have also considered making "is TAI" or "is Unix" a runtime switch inside a `Timestamp` itself, so that it can be seamlessly converted to-and-from TAI without it being observable in the type system. This simplifies the API, but I think it would otherwise make things far too confusing. And it would be too easy to attempt to perform operations on pairs of instants in different time scales, which seems... also very confusing.
OK, I've been doing a lot of tortured thinking about this. I am basically completely ambivalent on this topic and I don't know what the right answer is. On the one hand, the vast majority of all "general purpose" datetime libraries either completely ignore leap seconds or handle them in weird and inconsistent ways, and so, there is an opportunity to Do Something Better. On the other hand, I see a lot of folks specifically desiring support for leap seconds. On the other other hand, most materials put out by experts in the "general purpose datetime library" space seem to be quite strongly opposed to handling leap seconds at all. As documented above, the Temporal proposal completely ignores their existence, other than clamping times like `23:59:60` to `23:59:59` transparently.
What that means for me is that I would really like to support leap seconds in some way.
So I'm in this position where I'm looking for a way to support leap seconds but in a way that has very little impact on overall API complexity. Basically, since I perceive that most people will want to ignore leap seconds most of the time, we should provide an API that optimizes for the common case. In other words, because of the niche use cases of leap seconds, I'm opposed to it having an outsized impact on the API.
At this point, I am rather close to just cutting leap seconds out of the public API of Jiff. At least, for the initial release, until I can hear compelling use cases for why folks need them. Maybe the use cases are extremely limited and we can indeed get away with something simpler.
So I thought I would outline some of the various API design ideas I've had for dealing with leap seconds.
Make `Timestamp` and `Zoned` generic on a `TimeScale` trait

This is the status quo as I type this. The `TimeScale` trait has two implementations for zero-sized types: `Unix` and `Tai`. The `Timestamp` and `Zoned` types both default to `Unix` so that most times, you just write `Timestamp`/`Zoned` and get the right thing. That is, it pushes leap second handling out of the way... For the most part.
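A rough sketch of the shape this design takes is below. This is not Jiff's actual code: the fields and method bodies are hypothetical, and only the `TimeScale`, `Unix`, `Tai`, and `Timestamp` names come from the design described above.

```rust
use std::marker::PhantomData;

/// Marker trait for a time scale (a sketch, not Jiff's real definition).
pub trait TimeScale {}

/// Unix time: leap seconds are ignored entirely.
pub struct Unix;
impl TimeScale for Unix {}

/// TAI: every second is counted, including leap seconds.
pub struct Tai;
impl TimeScale for Tai {}

/// A timestamp parameterized by its time scale, defaulting to `Unix`.
pub struct Timestamp<S: TimeScale = Unix> {
    seconds: i64,
    nanosecond: i32,
    _scale: PhantomData<S>,
}

impl Timestamp<Unix> {
    /// The common case: a concrete return type, no inference needed.
    pub fn now() -> Timestamp<Unix> {
        unimplemented!("read the system clock")
    }
}
```

The `= Unix` default parameter is what lets most code write plain `Timestamp`, while `Timestamp<Tai>` is the explicit opt-in for leap second handling.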
This generally works, but it has problems:
- Because `Timestamp` (and `Zoned`) are generic, any routine returning a `Timestamp<T>` requires callers to either call it in a context where `T` is known, or explicitly supply a type for `T`. This in turn has led to weird things like `let zoned: Zoned = datetime.to_zoned()?;`. Which is, of course, equivalent to `let zoned: Zoned<Unix> = datetime.to_zoned()?;`. But most people shouldn't need to care about leap seconds, and so, it should just be written as `let zoned = datetime.to_zoned()?;`. We could make `to_zoned()` return a `Zoned<Unix>` (which would just be written as `Zoned` because of the default type parameter), but then we'd need to add a `to_zoned_with_scale<T: TimeScale>() -> Zoned<T>` for the case where you do care about leap seconds. So basically, we either inconvenience everyone or bloat the API with more methods. I am not a fan of either one, but I think bloating the API is the better choice.
- The `TimeScale` type parameter shows up in places, even though most people will never need to care about it. This, IMO, increases the cognitive burden of reading and understanding APIs. For most people where Unix time is just fine, there's going to be this `TimeScale` nonsense spread out across all of the APIs that they now need to deal with and understand. This can, in some cases, be mitigated by once again bloating the API by providing concrete versions for `Timestamp<Unix>`/`Zoned<Unix>` everywhere. But it becomes a real nuisance IMO.
- The type parameter on `Timestamp` and `Zoned` infects everything that contains a `Timestamp` or a `Zoned`. For example, the "relative to" input to rounding embeds a `Timestamp` inside of it. That in turn means we either need to only accept a `Timestamp<Unix>` (thus forbidding leap second handling when doing rounding relative to an instant) or we need to make "relative to" generic over `TimeScale` as well. Basically, the type parameter can't be encapsulated and it spreads its complexity out to everything it touches.

I could maybe accept some of this if leap seconds were a more critical thing to handle. But I think the existing evidence is that they aren't critical, and so the mess they impose on the public API in the status quo is just not worth it.
Add a `TaiTimestamp` and make `Timestamp`/`Zoned` always use Unix time

The idea here is that you could still compute spans of time accurately by converting a `Timestamp`/`Zoned` to a `TaiTimestamp`, but you wouldn't be able to do time zone aware and leap second aware operations at the same time.

This would also support the use case of roundtripping leap seconds through serialization, but you'd have to use `TaiTimestamp`. And in particular, if you have a `civil::DateTime`, we'd need a separate `to_tai_instant()` method. Basically, once you convert a time to a Unix timestamp, you lose the precision of whether it corresponds to a leap second or not. (Because Unix timestamps are ambiguous.) So dealing with leap seconds is not really compositional.

I don't like this solution because it feels like a half-measure. But I confess that this is the most attractive option from my perspective. It generally stays out of the way. And it could even be made an opt-in crate feature. That is, it is strictly additive. It doesn't get in the way of the common case. On the other hand, it does permit leap second accurate time deltas and round-tripping leap seconds through serialization. But, unfortunately, it doesn't let you do time zone aware calculations. We could add a `TaiZoned` type to do that, but I feel like that would be a lot of extra code.
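A minimal sketch of the shape of this option, assuming hypothetical `to_tai`/`to_timestamp` conversion methods (none of this is Jiff's actual API):

```rust
/// A point in time on the TAI scale. Distinct from `Timestamp`, which is
/// Unix time, so leap seconds never leak into the rest of the API.
pub struct TaiTimestamp {
    /// Seconds on the TAI scale; unlike Unix time, every leap second
    /// gets its own distinct value here.
    seconds: i64,
    nanosecond: i32,
}

pub struct Timestamp {
    seconds: i64,
    nanosecond: i32,
}

impl Timestamp {
    /// Explicitly opt into leap second handling. This is where a leap
    /// second table lookup would happen.
    pub fn to_tai(&self) -> TaiTimestamp {
        unimplemented!("add the accumulated UTC-TAI offset for this instant")
    }
}

impl TaiTimestamp {
    /// Convert back to Unix time. Lossy for a positive leap second, since
    /// two TAI values can map to the same Unix timestamp.
    pub fn to_timestamp(&self) -> Timestamp {
        unimplemented!("subtract the accumulated UTC-TAI offset")
    }
}
```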
Make `Timestamp`/`Zoned` always use TAI timestamps internally

One of the problems with centering everything on Unix timestamps (which is what `Timestamp<Unix>` is) is that they are ambiguous. That is, once you have a Unix timestamp, you can no longer tell whether it corresponds to a leap second or not. There's just nothing you can do. For a positive leap second, it's exactly like a backwards DST transition (a "fold"): `2024-11-03T01:30[America/New_York]` could be `2024-11-03T01:30-05:00[America/New_York]` or it could be `2024-11-03T01:30-04:00[America/New_York]`. There's no way to tell!
But... if we were to always store our timestamps as TAI, then we could completely avoid the problem of ambiguity. We'd always know whether we're at a leap second, and we could always unambiguously convert it to Unix time. (Although, of course, in the case of a positive leap second, two distinct TAI timestamps will map to a single Unix timestamp. But that would presumably be a choice made by the user of the library. If you stay in TAI, then you'll never have ambiguity.)
The main problem I perceive with this approach is that every `Timestamp`/`Zoned` is a TAI timestamp internally. This means that every time you create a `Timestamp` from Unix time, you'll need to do a leap second table lookup and some arithmetic to convert to TAI. And similarly, to convert a TAI timestamp back to Unix, you'll need to do the inverse. There are ways to optimize this of course, but it is fundamentally more expensive than what it would be if the internal representation were Unix time. (That is, this is "some cost" versus "definitively free.")
Now to be fair, I don't know exactly how much that conversion cost is or how much it matters. But I'm very hesitant to bet the entire library's API design on it. There are perhaps tricks we could play where the internal representation is configurable via crate features, but it seems like a really bad idea to be able to change the internal representation just because some other crate in your tree wanted to deal with leap seconds. Still, I think it is conceivably a possibility.
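To make that cost concrete, here is a minimal sketch of the kind of offset lookup such a conversion needs. The table layout and function are hypothetical, and only a handful of the post-1972 entries are shown.

```rust
/// (unix_timestamp, cumulative UTC-TAI offset in seconds from that
/// instant onward). Abbreviated: only a few of the 27 post-1972 entries.
const LEAP_TABLE: &[(i64, i64)] = &[
    (63072000, 10),   // 1972-01-01: TAI-UTC starts at 10s
    (78796800, 11),   // 1972-07-01
    (94694400, 12),   // 1973-01-01
    // ...
    (1435708800, 36), // 2015-07-01
    (1483228800, 37), // 2017-01-01 (after the most recent leap second)
];

/// Seconds to add to a Unix timestamp to get TAI. Every conversion from
/// Unix time pays for a search like this (plus the inverse on the way out).
fn utc_tai_offset(unix: i64) -> i64 {
    match LEAP_TABLE.binary_search_by_key(&unix, |&(t, _)| t) {
        Ok(i) => LEAP_TABLE[i].1,
        Err(0) => 10, // before 1972; a simplification
        Err(i) => LEAP_TABLE[i - 1].1,
    }
}

fn main() {
    // After the 2016-12-31 leap second, TAI is 37 seconds ahead of UTC.
    assert_eq!(utc_tai_offset(1483228800), 37);
    // Back in mid-1972 the difference was only 11 seconds.
    assert_eq!(utc_tai_offset(78796800), 11);
    println!("ok");
}
```

The inverse direction needs the same table, which is the "some cost" versus "definitively free" trade-off described above.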
One very common suggestion I've seen in various places is that TAI can just be another time zone. And in particular, via a TZif file. This works because TZif is effectively just a list of timestamps that correspond to transitions with offsets. In the case of leap seconds, there's 27 such transitions with the initial time being offset by 10 seconds. And each offset is just 1 second.

This does actually work, but it only solves one part of the problem of dealing with leap seconds: computing accurate deltas between two points in time. Otherwise, it doesn't really do anything to help with round-tripping leap seconds. In order to do that, you really need some special handling of civil times like `23:59:60`. Moreover, by encoding TAI as a time zone, you can't use TAI timestamps with other time zones. And you kinda should be able to, right? Because time scales and time zones are orthogonal concepts.
Implementing TAI as a time zone is a nice hack that will possibly solve one part of the leap second problem. But it conflates concepts and really isn't appropriate to offer as a blessed solution in a library. It's the kind of thing you use if you really need leap second handling for computing accurate time deltas and the datetime library doesn't otherwise support leap seconds (which is true for nearly all of them, except for specialized libraries). Indeed, I would be curious just how often this hack is used. My perception is that it's very rarely used, which I think reinforces my perception that leap second handling is a niche use case not worth officially supporting.
Make the time scale a runtime property

That is, represent a `Timestamp`/`Zoned` with either Unix or TAI time, but do it in a way where the representation is determined at runtime and is not visible to the type system.
I think this is a really bad idea because it would then be incredibly easy to mix two distinct `Timestamp` values that are on two different time scales. This would in turn likely lead to runtime checks to ensure any operation on two `Timestamp`/`Zoned` values were in the same scale, and then panic (or return an error) if they weren't. This seems like a non-starter to me for ergonomics, comprehensibility, and performance reasons.
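A minimal sketch (hypothetical code, not anything in Jiff) of what a runtime scale forces onto every binary operation:

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum Scale {
    Unix,
    Tai,
}

struct Timestamp {
    scale: Scale,
    seconds: i64,
}

impl Timestamp {
    /// Subtraction has to check at runtime that the scales agree, and
    /// either return an error (as here) or panic when they don't.
    fn since(&self, other: &Timestamp) -> Result<i64, &'static str> {
        if self.scale != other.scale {
            return Err("cannot mix Unix and TAI timestamps");
        }
        Ok(self.seconds - other.seconds)
    }
}

fn main() {
    let a = Timestamp { scale: Scale::Unix, seconds: 100 };
    let b = Timestamp { scale: Scale::Tai, seconds: 137 };
    assert!(a.since(&b).is_err());
    println!("mixing scales is only caught at runtime");
}
```

Every pairwise operation grows a failure mode that the type-parameter designs rule out at compile time.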
I haven't really been able to come up with any other plausible point in the design space. From where I'm standing, the status quo is probably the most "obvious" approach in that it works, but I find its impact on API complexity to be nearly intolerable. Otherwise, the two most attractive options to me are a distinct `TaiTimestamp` type or making an internal TAI representation conditional on a crate feature. Although I do feel like the internal TAI representation via an opt-in crate feature is a testing nightmare.
As I type this, I am extremely close to just ripping leap second support out entirely, and re-considering this once folks can provide some detailed use cases for why they want/need it. (And, for example, why they can't use a crate like `hifitime` for their specific use case.)
As my previous comment probably hinted at, I was heavily leaning towards ripping out leap second support altogether. I've just completed that. Jiff now behaves as if leap seconds do not exist, with one exception: when parsing `23:59:60`, the `60` is automatically clamped to `59`. This is identical to the leap second support found in Temporal.
While I would like to support leap seconds in some capacity, I think it will be much easier to add it in later rather than start with a flawed approach. Otherwise, someone is likely to depend on it and get upset if I wind up removing it or changing it significantly.
Leap seconds are not predictable and can change on short notice. As far as I know, this is also true of time zones. Yet, we don't use that as a reason to not support time zones. We've already "solved" the problem of mutable time zone data: environments ship copies of the IANA Time Zone database that most all datetime libraries know how to read.
I'd rather say we have to deal with mutable time zone data, because otherwise it's impossible to support time zones with any pretense of correctness. And the issues associated with keeping these data up to date are only resolved in the sunny day case: if your environment does not provide tzdata, or if you fall behind on updates (or lack the capability to receive updates at all), your time arithmetic is liable to go wrong.
Bringing similar issues into computations on seconds is to open another can of worms that I expect very few customers of a time library would want to deal with. In the industry, the predominant thinking has been going in the other direction: push back on any explicit accounting of leap seconds and use smeared NTP time instead.
I've put out a more elaborate write-up on the topic in a users.rust-lang.org post; @BurntSushi has read it and commented, but I'm linking here for anyone who might find it useful.
> I'd rather say we have to deal with mutable time zone data, because otherwise it's impossible to support time zones with any pretense of correctness.
I think you could say the same about leap seconds though. It just comes down to what kind of precision you attach to notions of "correctness."
> Bringing similar issues into computations on seconds is to open another can of worms that I expect very few customers of a time library would want to deal with. In the industry, the predominant thinking has been going in the other direction: push back on any explicit accounting of leap seconds and use smeared NTP time instead.
I do generally agree, and I think it's ultimately why I ripped leap second support out of Jiff before I released it. It's not a binary thing though. Like, I'm still actually open to some kind of leap second support in Jiff, but it needs to be non-invasive precisely because, like you, I expect very few people to actually want to deal with them.
> // For `time`, leap seconds are rejected when parsing
> // via the ISO 8601 format, but are supported via the
> // RFC 2822 format. We want to be maximally charitable
> // here, so pick a format that supports leap seconds.

Nit: `time` does support parsing leap seconds in ISO 8601. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9a585f6e3961acce0d07b1c7e3c029fc
> Conversely, I do not believe the internal design of either chrono or time allows it to satisfy property 3 easily.

In its current state, this is correct (at least for `time`). I do not intend to ever support it with `PrimitiveDateTime`, but `OffsetDateTime` and a future `ZonedDateTime` may.
I'm not here to advocate for or against leap second handling, and you clearly have thought more about date/time API design than I have, but I have a minor question or comment.
By "minor" I mean I don't think it will be important in significantly helping you improve this crate, so feel free to ignore it.
> However, if our instants accounted for leap seconds, then `2017-03-01T00:00:00Z - 2016-03-01T00:00:00Z` would give you a result of `1 year 1 second`. I frankly find that to be a bit weird, and is analogous to `2024-11-04T00:00:00-05[America/New_York] - 2024-11-03T00:00:00-04[America/New_York]` yielding `1 day 1 hour` instead of the more intuitive and typically desired `1 day`.

It seemed to me that this problem, this conflict between two potentially desirable results to which the subtraction should evaluate, stemmed from not distinguishing between calendar time and (making up a term) what one might call physical time, either at the type level or with different methods (e.g. `a - b` vs `a.sub_physical(b)`); and this also seems to apply to the tension in the following paragraphs between interpreting units of time span as variable-length vs fixed-length.
I now see that not distinguishing between calendar time and physical time was a decision explicitly made. Though the notes on this crate's design, including trade-offs, are impressively extensive, I don't see that they connect the issues of (1) time differences accounting (or not) for leap seconds and (2) having (or not) separate types for what you term calendar time vs absolute time. Would you agree or disagree that the tension in whether those time differences should be "1 year" or "1 year 1 second" and "1 day 1 hour" or "1 day" is one of the consequences of not distinguishing between calendar and absolute time? (Maybe this was documented as a trade-off and I missed it.)
> > // For `time`, leap seconds are rejected when parsing
> > // via the ISO 8601 format, but are supported via the
> > // RFC 2822 format. We want to be maximally charitable
> > // here, so pick a format that supports leap seconds.
>
> Nit: `time` does support parsing leap seconds in ISO 8601. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9a585f6e3961acce0d07b1c7e3c029fc
I think what I meant here is that `time` would fail the test I set out to try. Specifically because the leap second wouldn't be distinct from all other times: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e76fc3c9dbcfd0541cb9ad5ea9f38a80
> I now see that not distinguishing between calendar time and physical time was a decision explicitly made. Though the notes on this crate's design, including trade-offs, are impressively extensive, I don't see that they connect the issues of (1) time differences accounting (or not) for leap seconds and (2) having (or not) separate types for what you term calendar time vs absolute time. Would you agree or disagree that the tension in whether those time differences should be "1 year" or "1 year 1 second" and "1 day 1 hour" or "1 day" is one of the consequences of not distinguishing between calendar and absolute time? (Maybe this was documented as a trade-off and I missed it.)
I don't think I got this far in my thinking about leap seconds because this particular caveat of interaction between spans and leap seconds wasn't ultimately the thing that drove me to rip out support. I think I would have lived with it because leap second support still would have been opt in. (That is, the default time scale of `Timestamp` would have been Unix, but you could have opted into TAI.)
In terms of calendar/clock units (or more precisely, non-uniform and uniform units), I think what leap seconds do is put pressure on whether to treat all units above seconds as non-uniform or not. Right now, Jiff always treats hours or smaller as uniform. They always refer to the same absolute duration of time. But if leap seconds were supported, should they still be uniform? I think that would be the key question to address. But either way, I don't think the decision to combine non-uniform and uniform units into one type is that impacted by this. Because any way you slice it, you want calendar units somewhere in your crate API. And you'd still have to answer how leap seconds impact the units on calendar-only spans.
> I think what I meant here is that `time` would fail the test I set out to try. Specifically because the leap second wouldn't be distinct from all other times: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e76fc3c9dbcfd0541cb9ad5ea9f38a80
Ah. The level of support is the same for any well-known format that requires support for leap seconds.
Would you be interested in a crate feature called, e.g. "leap-seconds", which would convert a datetime to hifitime's representation to compute the number of leap seconds accumulated for a given time?
I've put in a considerable amount of work ensuring that hifitime is correct, even though properties 1+2 are glitchy in version 3 (and explicitly for +/- one second around the announced leap seconds -- https://github.com/nyx-space/hifitime/issues/255). Yet, the library is considered correct enough for fundamental timing research via GPS spacecraft by the Centre National de Recherches Spatiales (CNRS), and for spacecraft timing calculations by Amazon Ground Systems. I consider this a badge of honor, and I think that hifitime is one of the most precise and thorough time management libraries out there.
In other words, interop between Jiff and Hifitime would be beneficial for the few users of Jiff who need both the time zone management of Jiff and the time scales support of hifitime.
I think adding a dependency on `hifitime` from Jiff is definitely not the right way to go. And the reverse seems similarly inappropriate as well. I think a separate crate is the way to go. In #50, I discuss how to approach ecosystem integration more generally, with this comment in particular being appealing to me. That doesn't mean jiff<->hifitime integration has to follow the exact same pattern, but "separate crate" is the right idea to follow I think.
Leap second support isn't a priority for me, so if you wanted to pursue publishing a crate that integrates Jiff and hifitime, that might be nice. It might make sense to take the approach of providing extension traits that add methods to Jiff/hifitime data types that allow converting to-and-from the library types. That way, the library isn't leap second focused, but "Jiff and hifitime interop"-focused, if that makes sense.
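A sketch of what such an interop crate could look like. The crate, trait, and method names here are hypothetical; only the `jiff::Timestamp` and `hifitime::Epoch` types are real.

```rust
// Hypothetical contents of a standalone "jiff-hifitime" interop crate.
// It depends on both libraries but belongs to neither.

/// Extension trait adding a hifitime view to Jiff's timestamp.
pub trait TimestampExt {
    fn to_hifitime(&self) -> hifitime::Epoch;
}

/// Extension trait adding a Jiff view to hifitime's epoch.
pub trait EpochExt {
    fn to_jiff(&self) -> jiff::Timestamp;
}

impl TimestampExt for jiff::Timestamp {
    fn to_hifitime(&self) -> hifitime::Epoch {
        unimplemented!("convert via Unix seconds plus nanoseconds")
    }
}

impl EpochExt for hifitime::Epoch {
    fn to_jiff(&self) -> jiff::Timestamp {
        unimplemented!("convert back, resolving the leap second ambiguity")
    }
}
```

This keeps the leap second tables and their maintenance in hifitime, while Jiff's own API stays leap-second free.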
I'd like to make some tweaks to the docs to help users understand the hazards around time maths, and which operations (like `Sub`) are approximations when done on POSIX time. Would a PR on that be welcome?

I think it would be better to put that in `_documentation::design` perhaps? Adding warnings about leap seconds to every arithmetic operation between datetimes that returns a duration doesn't sound like a good idea to me personally. If you want to make a more compelling case for it, please open a new issue that discusses the specific documentation changes you'd like to make.
FWIW, I implemented support for leap second in the Perl DateTime.pm module many years ago. I think it satisfies all 3 of your qualifications (more or less).
> Parsing from a string into a Timestamp like 2016-12-31T23:59:60Z (the latest leap second to have occurred as of 2024-04-06) would correctly represent this time as distinct from all other times, including 2016-12-31T23:59:59Z.
DateTime itself doesn't do parsing. But its constructor will accept leap seconds:
my $dt = DateTime->new( year => 2016, month => 12, day => 31, hour => 23, minute => 59, second => 60, time_zone => 'UTC' );
> Converting a Timestamp back into a string would correctly preserve the leap second.
my $dt = DateTime->new( year => 2016, month => 12, day => 31, hour => 23, minute => 59, second => 60, time_zone => 'UTC' );
print "$dt\n";
# prints "2016-12-31T23:59:60"
> Computing the difference between two Timestamps would correctly account for leap seconds. For example, 2017-01-01T00:00:00Z - 2016-12-31T23:00:00 would yield 3601 seconds instead of 3600 seconds.
It does this too:
```perl
my $dt1 = DateTime->new( year => 2016, month => 12, day => 31, hour => 23, time_zone => 'UTC' );
my $dt2 = DateTime->new( year => 2017, time_zone => 'UTC' );
my $dur = $dt2->subtract_datetime_absolute($dt1);
print $dur->seconds, "\n";
# prints "3601"
```
And based on all of this what I would say is ...
Definitely don't do this! This was a huge amount of work that I think only makes this library worse in almost all cases. It's just another thing to think about when dealing with datetimes that the vast, vast majority of applications can ignore.
The few cases that care about this sort of stuff, like scientific applications, should be using something much simpler that just represents time as a count of nanoseconds or some other tiny unit. For a general purpose library that attempts to represent the way humans interact with datetimes, leap seconds are unnecessary.
What should you do if you want leap second support in Jiff?
The most effective thing you can do is leave a comment in this issue with a detailed account of why you want leap second support. Leaving a comment like, "I need to-the-second precision when calculating differences between instants" is not sufficient. You should say why you need that level of precision, and especially, what would happen if you didn't have it. You should also share what your current solution is and any relevant "industry standards" in your domain. (So for example, if your use case is, "I need to do astronomical math," then it would be nice to say why you can't or don't want to use a specialized library for this purpose.) The most important thing you can do is connect specific support of leap seconds to an end user experience. Nebulous "I want it because it's correct" comments are not especially helpful.
The rest of this issue does a deep dive on what leap second support means and various API designs.
NOTE: When I originally wrote this, `Timestamp` was called `Instant`. I tried replacing references to `Instant` with `Timestamp` below, but a timestamp is still referred to as an "instant" in places. This concept is still correct. A `Timestamp` is an instant. The only thing that changed was the type name in Jiff.

This required background reading gives a nice overview of the relationship between different time scales. (This also makes a proposal at the end, but that's not particularly relevant here. The content leading up to the proposal though I found quite helpful in understanding things.)
Here is a bit more background reading, which serves as a nice way of observing that pretty much everybody gets this wrong in some way. And the reason why we get it wrong pretty much comes down to the fact that we like to represent time as "seconds since an epoch."
And finally, this reference about Time Standards I found to be a bit ranty, but also very informative. It's nice because it starts from first principles and builds up everything from there.
Initial thoughts and ecosystem comparison
When I initially started developing Jiff, I had a goal to provide support for leap seconds. The failure to support leap seconds is a common indictment of datetime libraries, and I wanted to avoid falling into that trap. Roughly speaking, "support" in my mind looked like this. And note that when I say `Timestamp`, it also applies to `Zoned` (but not necessarily `civil::DateTime`).

1. Parsing from a string into a `Timestamp` like `2016-12-31T23:59:60Z` (the latest leap second to have occurred as of 2024-04-06) would correctly represent this time as distinct from all other times, including `2016-12-31T23:59:59Z`.
2. Converting a `Timestamp` back into a string would correctly preserve the leap second.
3. Computing the difference between two `Timestamp`s would correctly account for leap seconds. For example, `2017-01-01T00:00:00Z - 2016-12-31T23:00:00` would yield `3601` seconds instead of `3600` seconds.
Existing Rust datetime libraries have pretty spotty support for all of the above. Let's test each of the above properties in isolation for each of `chrono == 0.4.34`, `time == 0.3.34` and `hifitime == 3.9.0`. First, can the library handle parsing a leap second such that the datetime values are distinct? Only `chrono` and `time` get this particular case right.
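A minimal sketch of this kind of parsing check, assuming chrono 0.4 (which models a leap second as second `59` plus a sub-second value of at least `1_000_000_000` nanoseconds):

```rust
use chrono::{DateTime, Utc};

fn main() {
    // Property 1: the leap second parses and stays distinct from 23:59:59.
    let leap: DateTime<Utc> = "2016-12-31T23:59:60Z".parse().unwrap();
    let before: DateTime<Utc> = "2016-12-31T23:59:59Z".parse().unwrap();
    assert_ne!(leap, before);
    println!("chrono keeps the leap second distinct");
}
```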
Next is what happens when we try to format an instant corresponding to a leap second. This one is tricky to test in isolation because it requires the library itself to be at least minimally aware of leap seconds as a concept. If, for example, the library only deals with Unix time (which is what `std::time::SystemTime` is), then it's impossible for the library to get this case correct. That's because Unix time is ambiguous with respect to leap seconds, in exactly the same way that civil datetimes are ambiguous with respect to DST transitions. For example, the Unix timestamp `1483228799` could correspond to `2016-12-31T23:59:59Z` or to `2016-12-31T23:59:60Z`. There is quite literally no way to distinguish between them. Analogously, `2024-11-03T1:30:00[America/New_York]` could correspond to either `2024-11-03T05:30:00Z` or to `2024-11-03T06:30:00Z`
because of the jump backwards from DST to standard time. (This analogy between leap second and DST handling will prove to be prescient.)

In other words, to get the second property right, you kind of need to pass the first property. That is, both properties are required in order to correctly roundtrip datetimes. Since `hifitime` didn't satisfy the first property, we will manually create an instant corresponding to a leap second (which `hifitime` explicitly supports by virtue of representing its instants in TAI) to test whether it can format the leap second correctly. `chrono` passes this test, but neither `time` nor `hifitime` do. `hifitime`'s documentation seems to suggest that it should satisfy this property (and the first), but neither seems to be the case here.

The third property is a bit different from the first two. The first two deal with the case where the leap second hits you directly. That is, the time being dealt with is itself a leap second. This is somewhat of a particularly rare case because POSIX/Unix time is pretty ubiquitous and POSIX/Unix time simply ignores leap seconds altogether. So it's pretty rare to see something like `2016-12-31T23:59:60Z` in the wild, even when it accurately and precisely describes the current instant in time. Instead, you'll just see `2016-12-31T23:59:59Z`.

The third property, however, tests the handling of leap seconds implicitly. That is, even when two times aren't themselves leap seconds, does computing the difference between them account for leap seconds if one or more of them occurred in the interval of time between them? Let's see how the various Rust libraries do on this test.
Interesting! Despite not satisfying properties 1+2, `hifitime` does satisfy property 3. And in particular, neither `chrono` nor `time` satisfy property 3. This is likely because `hifitime` should satisfy properties 1+2, but perhaps there are some bugs in its implementation. Conversely, I do not believe the internal design of either `chrono` or `time` allows it to satisfy property 3 easily.

The salient difference between `hifitime` and `chrono`/`time` is that `hifitime`'s representation of time is TAI. That is, International Atomic Time. TAI doesn't have leap seconds. It just always ticks forward without regard to whether it's in sync with the rotation of the Earth. This means that you can subtract any two points in TAI time and get a precise answer as to how many seconds elapsed between them. The way this is done in `hifitime`
(at present) is, as I understand it, as follows:

- It embeds a leap second table (the same data you'd find in `/usr/share/zoneinfo/leapseconds`).
- Any time you convert from Unix time (which is what `std::time::SystemTime` is) or UTC to an `Epoch`, it "applies" the leap seconds relevant for that particular instant in time. So for example, to convert `2016-12-31T23:59:59Z` to TAI, you'd add 37 seconds. (That's 27 leap seconds added since 1972 plus the initial 10 second difference definitionally declared on 1972-01-01 between TAI and UTC.) That is, the internal representation of an `Epoch` is just TAI. (Or more precisely, the amount of time that has passed from its chosen epoch, `1900-01-01T00:00:00`.)
- And any time you convert an `Epoch` back to Unix time or UTC, you need to subtract the leap seconds from TAI to get UTC back.

This makes it somewhat clearer why properties 1+2 are perhaps not as well tested (or as high a priority) in `hifitime`. Namely, they only really apply when dealing with a leap second directly, whereas property 3 applies much more frequently. The only thing that needs to be true for property 3 to matter is for at least one leap second to have occurred between two instants.

The downside of `hifitime`'s approach is that it needs to do a non-trivial calculation every single time an `Epoch` is created from a Unix timestamp and every single time an `Epoch` is converted to a civil datetime (which `hifitime` doesn't have).

Observations
After digesting the above, I went and looked at what other datetime libraries do. From what I could tell, there are essentially zero general purpose datetime libraries that deal with leap seconds in a way that satisfies all three of the above properties. This includes Joda Time (now `java.util.time`) and the in-development Temporal library. In particular, Temporal says this and this on the topic of leap seconds.
Notably, C++'s standard `chrono` library does support leap seconds by introducing a "clock" abstraction where one can convert between system (Unix) time and UTC. That conversion step essentially does the same leap second calculation that `hifitime` does.

Of course, there are other more specialized datetime libraries that do support leap seconds. For example, Time4J.

Given that Temporal was my principal influence in the design of this library, I wanted to defer strongly to them. Still, though, `hifitime`'s approach, to me, made it look like supporting leap seconds wasn't that crazy. On the other hand, I didn't want to introduce the overhead of dealing with leap seconds for almost all operations on a Jiff `Timestamp`...

Making `Timestamp` generic over a `TimeScale`
So my initial plan to deal with leap seconds was to make the `Timestamp` type generic over a trait called `TimeScale`. I perceived this as pretty similar to what `chrono` does in C++. The default would be the `Unix` time scale. That is, leap seconds are completely ignored. This may not produce an exactly correct result when computing the difference between two instants, but it is synergistic with how most system clocks work. And there wouldn't be any additional overhead.

The idea then was that users of Jiff could opt into leap second handling by using the TAI `TimeScale` instead of `Unix`. In so doing, any construction of a `Timestamp` from Unix time would require doing the same sort of leap second calculation that `hifitime` does. And then whenever one needs to convert back to a civil datetime, the same calculation would be done in reverse. In exchange, subtracting two `Timestamp` values in the TAI time scale would produce precise results. Unlike `hifitime`, this design wouldn't permit subtracting any two `Timestamp` values regardless of their time scale because the internal representation would be defined by the `TimeScale`. Indeed, this is the critical difference from `hifitime`'s design that permits us to make leap second support opt-in without the overhead of always needing to deal with it. The downside is that you don't get any kind of interoperability between instants of different time scales. But this seemed fine to me, because you could always explicitly convert between time scales.

I was not a huge fan of this design because it adds a type parameter to a foundational type, and has a somewhat infectious effect. That is, anything that deals with `Timestamp`s now needs to also consider the time scale in its API. I found this to be annoying to deal with even internally. For example, I wanted to use `Timestamp<Unix>` to represent timestamps from TZif files (indicating the start of DST transitions in `Unix` time), but the API wants to accept any generic `Timestamp<S>`. Since my design means you can't compare instants of different time scales, every time zone lookup would in turn need to do a leap second conversion. This was bad juju to me and finally convinced me to get rid of this approach.

(There does exist the `/usr/share/zoneinfo/right` database of TZif files which defines its transitions in TAI-10 time that could potentially be used instead. But dealing with this is a pretty complex mess, because you'd sometimes want to use the "normal" TZif files and sometimes want to use the "right" TZif files, and which one you pick depends on the `TimeScale` of the instant you have. And what happens when the `right` directory doesn't exist?)

Status quo
At time of writing, I ripped out the `TimeScale` abstraction. An `Instant` is now always Unix time. (And I'm now considering renaming it to `Timestamp` because I don't like that it has the same name as `std::time::Instant`.)

I haven't done parsing/formatting yet, but at present, I still currently plan to adopt Temporal's approach at minimum: if a leap second is seen, then clamp it into a valid time.
But that still leaves the question of how to deal with subtracting two times that cross a leap second discontinuity and still get a correct result. It is my belief that this is quite a challenging problem, and not even `hifitime` gets it fully correct. Or I should say, it doesn't provide the right user experience. In particular, one nice thing about Jiff (and Temporal, which inspired this API) is that you can pretty seamlessly work with durations in any unit of time. For example, you can add things like "years" or even get "years" as a unit of difference between two datetimes. However, if our instants accounted for leap seconds, then `2017-03-01T00:00:00Z - 2016-03-01T00:00:00Z` would give you a result of `1 year 1 second`. I frankly find that to be a bit weird, and it is analogous to `2024-11-04T00:00:00-05[America/New_York] - 2024-11-03T00:00:00-04[America/New_York]` yielding `1 day 1 hour` instead of the more intuitive and typically desired `1 day`.
Indeed, this is what Temporal does.

It's worth lingering on this point for a bit, because the way Temporal works here is very intuitive. Namely, when adding units less than a day (hours, minutes, seconds, milliseconds, microseconds, nanoseconds), even when they add up to more than a day, Temporal will incorporate DST transitions. Adding a duration of `hours: 24` across a DST transition therefore lands on a different civil time than adding a duration of `days: 1` does. In other words, units of "day" or higher are of variable length depending on the datetime they are interpreted relative to. But units below "day" are always of a fixed length.
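The arithmetic behind that difference is easy to check by hand. Here is a tiny self-contained sketch using hard-coded Unix timestamps for the 2024-03-10 America/New_York spring-forward transition (no datetime library involved; the constants are assumptions verified manually):

```rust
fn main() {
    // 2024-03-09T14:00:00-05:00 (America/New_York) as a Unix timestamp.
    let start: i64 = 1_710_010_800;

    // "24 hours later" is 24 * 3600 absolute seconds, which crosses the
    // spring-forward gap, so the local clock reads 15:00 the next day.
    let plus_24_hours = start + 24 * 3600;

    // "1 day later, same local time" (14:00 the next day) is only 23
    // absolute hours away, because 2024-03-10 is a 23-hour day there.
    let plus_1_day = start + 23 * 3600;

    // The two notions of "a day later" differ by exactly one hour.
    assert_eq!(plus_24_hours - plus_1_day, 3600);
    println!("adding hours and adding days disagree across a DST gap");
}
```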
I think `hifitime` doesn't really have this "problem" because it doesn't have the same sort of intuitive interface for representing durations of time. That is, its unit of "day" is a fixed length of 86,400 seconds always. Indeed, `hifitime` doesn't deal with DST at all. So while `hifitime` gets some part of leap second handling correct (even if we ignore it not satisfying properties 1+2 above as "just bugs"), it doesn't seem like a foundation one can build on to provide a nice user experience like what Temporal does. Indeed, I believe this also applies to C++'s `chrono` library, but my C++ knowledge is pretty weak and thus I can't be sure of what C++'s `chrono`'s higher level user experience is like.

One way to look at the issue of leap seconds is that it is strictly analogous to DST handling.
In order to make this work at the user experience level, Temporal had to make any units above "hours" variable length. This is because all time zone transitions are expressed in units of hours (with a possible fractional component). So once you move above hours, it makes sense to "ignore" the time transitions. For example, if you have an appointment on `2024-03-09T14:00:00-05[America/New_York]` and want to re-schedule it to one day later, then the expected appointment time would be `2024-03-10T14:00:00-04[America/New_York]`. If you implemented that re-scheduling by simply adding 24 hours, then you'd get the wrong result of `2024-03-10T15:00:00-04[America/New_York]`.
So if we wanted to deal with leap seconds in an analogous way, this appears to lead us to the conclusion that any units above seconds need to be of variable length. That means, for example, that adding 1 minute to `2016-12-31T23:59:59Z` should give you `2017-01-01T00:00:59Z`, while adding 60 seconds should give you `2017-01-01T00:00:58Z`. Wild, right?

From what I can tell, the division between which units are variable (days and higher) and which are not (hours and lower) in Temporal is an architectural constraint that pervades not just its API but also its implementation. For example, if you can assume that all units of hours or below are of a fixed length, then arithmetic using hours or below can be performed by simple addition on timestamps. At the API level, Temporal permits rounding durations to the nearest hour (or below) without specifying a `relativeTo` datetime. But if rounding days or above, since days and above are variable sized units, the operation necessarily requires a datetime to interpret the duration in relation to. Otherwise, it can't actually know how long a "day" is.
Plausible ideas for the future
I think I generally consider the approach of making units above "seconds" to be of variable length to be implausible. So that leaves us with a few choices:

- Find a way to make the `TimeScale` approach work. For example, maybe we accept the perf hit of time scale conversions during a time zone lookup. Or perhaps wire it up somehow such that the common case (using the Unix time scale) doesn't do any additional work. Still though, that wasn't the only issue. There are other ergonomic issues I found with this, but maybe we shouldn't treat them as deal-breakers. As outlined above though, this would still not deal with datetime arithmetic in an intuitive way because units below "day" would still be fixed in length. There are also some open questions (for me) for how to tie this into formatting/parsing in a nice way. For example, I believe it would require parsing directly into a `Timestamp<Tai>` or a `Zoned<Tai>` in order to support them, which wouldn't be the default. (Because civil datetimes are, by design, incapable of representing leap seconds.) This seems very non-ideal, because you basically have to encode the knowledge of "supports leap seconds" into the code itself by virtue of using `Timestamp<Tai>` instead of `Timestamp`, and because when you do this, you'll automatically lose intuitive arithmetic semantics whenever you cross a leap second boundary.
- Support TAI as a time zone, as described above, since arithmetic on `Zoned`
values already accounts for time zones. But I don't think it would do much more than that. Units of time below "days" would still be fixed, and so dealing with datetime arithmetic even in a hypothetical TAI time zone would still be unintuitive. This would have other downsides, for example, it wouldn't really be able to extend parsing/formatting such that a leap second could be round-tripped. So this would not help to satisfy either property 1 or 2. Moreover, currently, the API ofTimeZone
is defined in terms ofTimestamp
andcivil::DateTime
, neither of which can represent leap seconds.I overall, at this point, feel like Temporal probably made the best compromise here. We clamp incoming data to something sensible and otherwise ignore leap seconds. I also think that a lot of the trouble and bugs that come from leap seconds aren't necessarily because of datetime libraries not handling them, but rather, because programmers have assumed that their system time is monotonically increasing. Jiff doesn't provide any monotonic guarantees because it's a library for dealing with human time. Monotonic time is something entirely different, and thus, I'm not sure the benefits of handling leap seconds beyond what Temporal does is worth it.
More on Temporal
I think it's worth capturing some links to more Temporal materials about leap seconds. My intent here was to capture pretty much anything relevant about leap seconds from the Temporal repository/docs/meetings:
constrain
vsreject
behavior during parsing, but doesn't really discuss leap second support as a whole.23:59:60
is the same as23:59:59
.I've looked through all of these (and looked for additional materials), and I feel like all of the reasons given as stated are not particularly compelling. Instead, I think the real challenge to leap seconds is the following (a distillation of the above exploration):
I find the following reasons to be invalid or entirely unconvincing:
tzfile
forchrono
specifically, but its support is incomplete.)There also seems to be some confusion about what "leap second support" even means. Some people just care about round-tripping. Others just care about computing the accurate difference between two points in time. And yet others care about making the higher level user experience around leap seconds make intuitive sense.