Instant.from-posix has false future leap second knowledge

p6rt commented 8 years ago

Migrated from rt.perl.org#126119 (status was 'open')

Searchable as RT126119$

p6rt commented 8 years ago

From zefram@fysh.org

The Instant class supposedly represents times on the TAI time scale, with subtraction of Instants yielding counts of atomic seconds. The corresponding difference of POSIX time_t values yields a count of non-leap seconds. The difference between these two differences, for corresponding endpoints, therefore yields a count of leap seconds. Instant.from-posix() will conveniently perform the translation. So here's some code to count the leap seconds that occur in an interval specified by time_t values:

$ ./perl6 -e 'sub leaps-between ($a, $b) { (Instant.from-posix($b) - Instant.from-posix($a)) - ($b - $a) }; say leaps-between(time - 1000000000, time); say leaps-between(time, time + 1000000000)'
14
0

The first of the time intervals on which I've tried that runs from January 1984 to today, and the count of 14 leap seconds is historically correct. The second interval runs from today to May 2047... and will almost certainly contain a non-zero number of leap seconds. The count of zero is bogus.
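A minimal sketch of the same calculation in Python, assuming a hypothetical implementation whose leap table was last updated in 2015. The names `KNOWN_LEAP_MONTHS`, `boundary_posix` and `leaps_between` are illustrative and not part of Rakudo; the fixed date stands in for "today" at the time of the report.

```python
from datetime import datetime, timezone

# Months that ended with a leap second, as known to a hypothetical
# implementation whose table stops at June 2015 (assumption for illustration).
KNOWN_LEAP_MONTHS = """
1972-06 1972-12 1973-12 1974-12 1975-12 1976-12 1977-12 1978-12 1979-12
1981-06 1982-06 1983-06 1985-06 1987-12 1989-12 1990-12 1992-06 1993-06
1994-06 1995-12 1997-06 1998-12 2005-12 2008-12 2012-06 2015-06
""".split()


def boundary_posix(leap_month: str) -> int:
    """POSIX time of 00:00:00 UTC on the first day after the given month."""
    year, month = map(int, leap_month.split("-"))
    year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return int(datetime(year, month, 1, tzinfo=timezone.utc).timestamp())


LEAP_BOUNDARIES = [boundary_posix(m) for m in KNOWN_LEAP_MONTHS]


def leaps_between(a: int, b: int) -> int:
    """Count table entries in (a, b]; leaps missing from the table count as 0."""
    return sum(1 for t in LEAP_BOUNDARIES if a < t <= b)


# A fixed date in September 2015, chosen for illustration.
now = int(datetime(2015, 9, 20, tzinfo=timezone.utc).timestamp())
print(leaps_between(now - 10**9, now))  # 14: the table covers this interval
print(leaps_between(now, now + 10**9))  # 0: the table is silent about the future
```

The second result is zero only because the table ends, not because anything is known about that interval.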

In reality, we don't know how many leap seconds there will be in the next gigasecond. We can guess: anything from five to twenty is defensible. But there is no value that we can say is certainly correct. So Instant.from-posix(time + 1000000000) cannot produce any definitely-correct value. It really ought to signal an error.

If it is really intended to guess, in cases where no definitive answer is available, then the guess that it is making is crap. To make a reasonable guess, use a quadratic formula based on the observed tidal braking, and quantise it to one-second leaps on appropriate month boundaries. A finer guess could be made for the next few decades by extrapolating from recent decades' measurements.

Also, if it's a guessing function, it ought to be named in a way that clues in the user. "from-posix-best-guess" or so. It would also be sane to have both kinds of conversion function.

-zefram

p6rt commented 8 years ago

From @zoffixznet

My feedback for RFC:

Currently, the leap seconds are added as-they-get-known. For example, on my bleed Rakudo, I get 14/1 with your code, because the newly announced Dec.2016 leap second was added.

This means the result of the code is dependent on the version of the compiler and is thus inherently unpredictable, **even for historical values.**

So trying to use a "guessing formula" won't rectify the issue, since at best it can guess, but it would add overhead execution time. So IMO, the current behaviour is fine as is. If someone's application requires super precision they should make/use a module that can both utilize various guessing formulas and be updated on a particular system more easily than entire Rakudo.

-- Cheers, ZZ | https://twitter.com/zoffix

p6rt commented 8 years ago

The RT System itself - Status changed from 'new' to 'open'

p6rt commented 8 years ago

From @zoffixznet

This is a response to an email from Zefram that for some reason didn't make it to the ticket: http://www.nntp.perl.org/group/perl.perl6.compiler/2016/07/msg13370.html

Quoting Zefram <zefram@fysh.org>:

A version with reasonable guessing for `future' times would be qualitatively similar, in that it can produce either the right answer or a wrong answer, but the wrong answers would be quantitatively less wrong.

But that would prevent you from working around the issue and it wouldn't be useful. If your algorithm fails when future unknown leap seconds are zero, it won't magically succeed if given what amounts to a throw of dice. The only difference is you're paying with computing power for that dice throw (and .from-posix isn't the only method that has this leap second caveat).

Returning zero is much more predictable than returning a guess, because I can choose any guessing algorithm I wish, should I require it. But most importantly, I can just patch my code for older compilers when new leap seconds are announced. Here's a rough patch for your code snippet:

./perl6 -e 'sub leaps-between ($a, $b) { (Instant.from-posix($b) - Instant.from-posix($a)) - ($b - $a) + ($*VM.version before v2016.07 and $b after DateTime.new("2016-12-31T23:59:59") ?? 1 !! 0) }; say leaps-between(time - 1000000000, time); say leaps-between(time, time + 1000000000)'

Note that I was able to insert the now-known new leap second at the precise point of time it exists, and the code gives correct result regardless of whether I'm using a pre 2016.07 compiler. I was able to do so *precisely* because Rakudo does not try to guess what it doesn't know and returns 0 instead.

I documented the behavior and the workaround method for older compilers in the docs: https://docs.perl6.org/type/Instant#Future_Leap_Seconds

When I get some tuits, I'll also release a module that would make it easier to get the correct number of seconds, regardless of the compiler. IMO that resolves this ticket. I'll close it in a few days, unless there's more feedback on the issue.

p6rt commented 8 years ago

From zoffix@zoffix.com

Perhaps, we should evaluate some of the leap-second estimation algorithms. There are more leap second problems in Rakudo besides the .from-posix, that I now opened in https://rt-archive.perl.org/perl6/Ticket/Display.html?id=128752

p6rt commented 8 years ago

From zefram@fysh.org

Zoffix Znet via RT wrote:

This means the result of the code is dependent on the version of the compiler and is thus inherently unpredictable, **even for historical values.**

Absolutely right that the `future' behaviour also applies to times that are actually in the past, when running on an old Rakudo version. It's the future from the point of view of when the code was written that matters.

Be careful when you speak of unpredictability. There are different classes of unpredictability that are worth distinguishing. The current implementation is unpredictable in that it can produce either the right answer or a wrong answer, and the latter can be very wrong. A version with reasonable guessing for `future' times would be qualitatively similar, in that it can produce either the right answer or a wrong answer, but the wrong answers would be quantitatively less wrong. But an implementation that throws an exception for unknown times would only be able to either produce the right answer or throw an exception, and not able to produce a wrong answer. This is still in one sense unpredictable, but it's qualitatively better, and is predictably correct in the non-exception cases.

So trying to use a "guessing formula" won't rectify the issue, since at best it can guess, but it would add overhead execution time.

This is a reasonable position to take. It is fine to punt that kind of use case to the module ecosystem.

So IMO, the current behaviour is fine as is.

But this is a poor conclusion. If your position is that a good guess is no more use than a bad guess, this implies that you're only concerned about whether the answer from the function is correct or incorrect, and that an incorrect answer has no value. But with the current behaviour it's impossible to tell which you're getting, which means that the answer in any case, and therefore the function, is of no value at all. (Except when you know you're asking about historical times that all versions of Rakudo know about.)

If you're not interested in making a reasonable guess for the unknown cases, the only sensible behaviour for those cases is an exception.

If you really really want to bless the current behaviour, then the function needs to be documented with appropriate qualification about the quality of the answer. Since the answer for the `future' period is garbage, and the caller can't know when that period begins, you'd need to caution the user in terms such as "Only good for times up to the year 2014; when applied to any later time the result is meaningless.". Once you've got that in the API definition, of course, you might as well enforce it by having the function throw an exception for anything past 2014.

-zefram

p6rt commented 8 years ago

From zefram@fysh.org

Zoffix Znet via RT wrote:

Returning zero is much more predictable than returning a guess,

The case where the algorithm is operating in its unknown-future regime is not as easily spotted as that. The return value from the conversion is not a fixed value, it's a value that is well-formed and valid apart from (probably) being the wrong answer. But even identifying the last leap second the implementation knows about doesn't really tell you where its knowledge ends: at any time there is some period following the last scheduled leap second for which it is known that there will be no further leaps.

Signalling an error would make clear whether the threshold of implementation knowledge has been passed. Another possibility, which I didn't raise earlier, would be to export a value that explicitly identifies where the threshold is: one would document that the conversion only works for times earlier than this advertised threshold.
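A minimal sketch of both ideas in Python, assuming the leap table and the "2017-06" next-possible-leap month from the program attached to this ticket. `UnknownLeapSchedule`, `KNOWN_UNTIL` and `tai_minus_utc` are hypothetical names, not an existing Rakudo API.

```python
from bisect import bisect_right
from datetime import datetime, timezone


class UnknownLeapSchedule(ValueError):
    """Raised when a time lies at or beyond the advertised knowledge threshold."""


KNOWN_LEAP_MONTHS = """
1972-06 1972-12 1973-12 1974-12 1975-12 1976-12 1977-12 1978-12 1979-12
1981-06 1982-06 1983-06 1985-06 1987-12 1989-12 1990-12 1992-06 1993-06
1994-06 1995-12 1997-06 1998-12 2005-12 2008-12 2012-06 2015-06 2016-12
""".split()


def boundary_posix(leap_month: str) -> int:
    """POSIX time of 00:00:00 UTC on the first day after the given month."""
    year, month = map(int, leap_month.split("-"))
    year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return int(datetime(year, month, 1, tzinfo=timezone.utc).timestamp())


LEAP_BOUNDARIES = [boundary_posix(m) for m in KNOWN_LEAP_MONTHS]
UTC_START = boundary_posix("1971-12")    # leap-second UTC begins 1972-01-01
KNOWN_UNTIL = boundary_posix("2017-06")  # exported threshold of knowledge


def tai_minus_utc(posix: int) -> int:
    """TAI-UTC in seconds, or an error where the schedule is not known."""
    if posix < UTC_START:
        raise ValueError("time is before leap-second UTC")
    if posix >= KNOWN_UNTIL:
        raise UnknownLeapSchedule("leap schedule not known for this time")
    # TAI-UTC was 10 s at the start of 1972; each boundary passed adds 1 s.
    return 10 + bisect_right(LEAP_BOUNDARIES, posix)
```

A caller can either compare against `KNOWN_UNTIL` up front or catch the exception and apply whatever estimation it prefers.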

But most importantly, I can just patch my code for older compilers when new leap seconds are announced.

That only handles leap seconds that the user code knows about specifically. If you pursue that approach you'd end up with a long and growing list of leap seconds in the user code, which rather defeats the point of using the core implementation which has such a list. The fixup code would be even more complicated than the core implementation, because it would also incorporate knowledge of which leap seconds are known to a long and growing list of core implementations. It would be easier to fully reimplement Instant.from-posix() with one's own leap second list, rather than use the core Instant.from-posix() and then fix up in this way.

There certainly are good reasons to have an explicit distinction between the known and unknown regions, but this fixup scenario doesn't make much sense. Your earlier "choose any guessing algorithm" is a better motivating scenario.

Note that I was able to insert the now-known new leap second at the precise point of time it exists, and the code gives correct result regardless of whether I'm using a pre 2016.07 compiler.

No, it does not. It does succeed in incorporating knowledge of that specific leap second, but the answer that it gives for the next gigasecond is 1 leap second, which is almost certainly not correct, just as the original 0 was almost certainly not correct. (Slightly less certain, of course, as it's a move in the direction of the likely range.) On future versions of Rakudo that know about even more leap seconds it will then give different results, progressively closer to correct, so the consistency across Rakudo versions that you achieve is rather limited.

Perhaps you only intended this code to be applied to the region for which the code has knowledge of the leap schedule, in which case you have achieved what you intended and the above would be an irrelevant trifle. But you did include the next-gigasecond invocation in your version of the example.

I was able to do so *precisely* because Rakudo does not try to guess what it doesn't know and returns 0 instead.

That's not quite true. You were able to do it that way because you know exactly what guess the older versions of Rakudo would make. That guess didn't have to be the especially crap one that there would be no more leap seconds ever; it suffices that the guess is deterministic. But this feels trifling when, as I said earlier, I find this fixup concept untenable.

I documented the behavior and the workaround method for older compilers in the docs: https://docs.perl6.org/type/Instant#Future_Leap_Seconds

That's certainly a significant improvement, thanks. There are a couple of issues with the wording, though.

"methods ... do not make any guesses" isn't really true, because the method does return a specific answer that implies a specific future leap schedule. From the point of view of the caller, it's guessing that there will never be any more leap seconds after the last one it knows about. If it signalled an error, that would constitute not guessing.

Also, "leap seconds in the future" reads as if it's referring to the future of when the method is called. This could do with some rewording in accordance with what we discussed upthread, to make clear that it may apply to leap seconds in the past of the call time. The true situation is somewhat implied by the subsequent discussion of "depending on the compiler version", but that comes across as conflicting with the "in the future" rather than as clarifying it.

Putting those together, I suggest that the first sentence of this doc section should read

The methods that involve knowledge of leap seconds always assume that there will be no further leaps after the last leap second that the implementation knows about, which may not be the last leap second that has actually been scheduled.

-zefram

p6rt commented 8 years ago

From zefram@fysh.org

zoffix@zoffix.com wrote:

Perhaps, we should evaluate some of the leap-second estimation algorithms.

If you like. To be clear, I'm not pushing for the conversion to use an estimation strategy per se, and we're now going beyond what's necessary to address my original bug report. Documenting the existing behaviour resolves the bug that I reported qua bug. I do still reckon the existing behaviour sucks, but with it as a documented API we're in the realm of differing judgements on a language design question, rather than a clear bug.

The preferred outcome for which I was pushing was to have either (or both, in separate methods) of the behaviours that I consider sensible: signalling an error or making a reasonable estimate. You've made clear that you're not at all a fan of estimation, and that's fine. It's totally compatible with my preferences, if one then accepts the conclusion that the conversion should signal an error. But I'm getting the impression that you don't find erroring very palatable either.

The starting point for leap estimation is the tidal braking effect by which the Moon is gradually slowing the rotation of the Earth, affecting the UT1<->TT relationship. This is a long-term secular change, which therefore must be taken into account in order to make any reasonable estimate any significant number of years beyond one's present knowledge. There are several other effects on the Earth's rotation which affect UT1<->TT, but they are all oscillations (on periods of a day up to decades), not secular drift. They can therefore be ignored, at least for an initial version and for our purposes quite likely forever, even though on the decadal timescale they swamp the tidal braking effect. To qualify as a reasonable estimate of UT1<->TT it is both necessary and sufficient to account for recent length of day and tidal braking.

http://www.ucolick.org/~sla/leapsecs/deltat.html has some nice plots showing differences between time scales. The first plot, with the 3000-year span of UT1<->TT, is the most relevant to our situation. The roughly-quadratic curve there is what needs to be extrapolated. The way this historical information has been determined over such a span, extending way before mechanical time measurement, is pretty clever stuff: a written record that a solar eclipse was visible from a particular geographic location tells you which way the Earth was pointing (UT1) at a time (TT) that can be precisely determined by orbital calculations.

There have been many academic attempts to model and extrapolate this curve. They differ largely in how closely they attempt to model the last couple of centuries for which we have much more precise measurements. Any attempt to model variations on such a short timescale necessarily ends up modelling some of the oscillations, not just the secular trend, so ends up a lot more complicated. http://www.ucolick.org/~sla/leapsecs/future2100.pdf is a nice plot comparing a variety of models against the eclipse observations. As you can see, there's quite a bit of disagreement between the models, and none of them is a great fit to the observations. But the models are of value: there's a rough agreement if one ignores the linear ones.

Excluding the linear models, it looks like none of them is compellingly superior to another for our purpose. Let's therefore take the simplest class of these models: a pure linear increase in length of day, giving a pure parabola of delta-T. Middle-of-the-road values to use are a LOD increase of 1.7 ms per century (astronomers use the mean Julian century, 36525 days), with LOD exactly equal to 86400 s at the year 1820. (1820 is the midpoint of the 1750-to-1890 span of the observations behind Simon Newcomb's theory of the planetary orbits, which is indirectly what fixed the length of the second as a modern unit of measurement.)

To fill out our model of projected UTC, let's presume that at the threshold date (the date of the next possible leap second not yet scheduled) UT1=UTC, then we'll graft onto that the remaining portion of the delta-T parabola. That gives us a model of TAI-UT1 for the future. Then let's suppose that each leap second happens at the end of the Gregorian month in which the fractional part of TAI-UT1 crosses 0.5. (Current practice is for leaps to happen only in June and December, but the rules allow the end of any month. Starting in the 38th century we require more than one leap per month; it's anybody's guess how UTC will actually be managed then, so it's not too bad to model it as a multi-second leap at the end of the month.)
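The model described above can be sketched in a few lines of Python. This is a simplified illustration that works in whole days and ignores the TAI/UTC offset when measuring elapsed time; it is not the attached program, and every function name in it is hypothetical. The constants are the ones given in the text, and the threshold of 2017-07-01 with TAI-UTC = 37 s is an assumption matching the attached program's table.

```python
from datetime import date, timedelta

LOD_INCREASE = 0.0017                # seconds of day length gained per century
DAYS_PER_CENTURY = 36525             # mean Julian century
LOD_NOMINAL_DATE = date(1820, 1, 1)  # day length taken as exactly 86400 s here


def excess_lod(day: date) -> float:
    """Modelled length of day minus 86400 s, in seconds per day."""
    return LOD_INCREASE * (day - LOD_NOMINAL_DATE).days / DAYS_PER_CENTURY


def projected_tai_minus_ut1(day: date, threshold: date, dtai: int) -> float:
    """Parabola grafted on at the threshold, where UT1 = UTC is presumed."""
    d = (day - threshold).days
    return (dtai
            + excess_lod(threshold) * d
            + LOD_INCREASE * d * d / (2 * DAYS_PER_CENTURY))


def first_estimated_leap_month(threshold: date, dtai: int):
    """(year, month) in which projected TAI-UT1 first passes dtai + 0.5."""
    day = threshold
    while projected_tai_minus_ut1(day, threshold, dtai) < dtai + 0.5:
        day += timedelta(days=1)
    return day.year, day.month


def year_needing_monthly_leaps() -> float:
    """Year in which the excess day length sums to 1 s over a mean month."""
    mean_month_days = DAYS_PER_CENTURY / 1200
    centuries = 1 / (LOD_INCREASE * mean_month_days)
    return LOD_NOMINAL_DATE.year + 100 * centuries


print(first_estimated_leap_month(date(2017, 7, 1), 37))
print(year_needing_monthly_leaps())
```

Under these assumptions the first estimated leap falls at the end of November 2017, and one leap per month stops being enough in the 38th century.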

The attached program implements this model, for bidirectional conversions between TAI and UTC. There is... an amount of support code. I had to import a bunch of fundamental Gregorian calendar stuff imitating my Perl 5 module Date::ISO8601. The actual leap second logic is only 80 lines in the middle of the 400 line file, and that covers exact conversions for the known schedule as well as the estimation for the unknown future. Each of the two regimes takes about half of the 80 lines. There's then a bunch of ISO 8601 text formatting and parsing code, which doesn't support the conversions themselves but is only used for the testing interface. Invoke like this:

$ perl6 utc_estimate.pl6 '2016-07-28T02:26:33 UTC' '2016-12-01T00:00:20.123 TAI'
2016-07-28T02:27:09 TAI = 2016-07-28T02:26:33 UTC
2016-12-01T00:00:20.123000 TAI = 2016-11-30T23:59:44.123000 UTC

Errors are checked everywhere they should be, but the error messages are not awesome. Conversions in both directions tick correctly through leap seconds, both real ones and guessed future ones:

2015-07-01T00:00:34 TAI = 2015-06-30T23:59:59 UTC
2015-07-01T00:00:35 TAI = 2015-06-30T23:59:60 UTC
2015-07-01T00:00:36 TAI = 2015-07-01T00:00:00 UTC
2017-12-01T00:00:36 TAI = 2017-11-30T23:59:59 UTC
2017-12-01T00:00:37 TAI = 2017-11-30T23:59:60 UTC
2017-12-01T00:00:38 TAI = 2017-12-01T00:00:00 UTC

That was satisfying to write.

There are more leap second problems in Rakudo besides the .from-posix,

Sure. Anything else using the leap second table runs into the same issues in some form.

that I now opened in https://rt-archive.perl.org/perl6/Ticket/Display.html?id=128752

As written, that ticket is about a more specific idea of exposing the leap second table explicitly. On its own that doesn't address the analogous issues for other uses of the table. I don't have very much opinion about exposing the table per se. If you do expose the current table, you'd probably want to expose a threshold-of-the-unknown date as well, because there's more to leap schedule knowledge than just the dates of actual leaps. If you expose it in writable form, I'd recommend writing via method rather than via lvalue, to avoid tying yourself to the table's current format.
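One possible shape for such an interface, sketched in Python under stated assumptions: the class `LeapSchedule` and its members are hypothetical, and each leap is represented by the date on which the new TAI-UTC offset takes effect. The table is read through properties and written only through a method, so the internal format can change later.

```python
from datetime import date


class LeapSchedule:
    """Known leap dates plus the first date whose leap status is unknown."""

    def __init__(self, leap_dates, known_until):
        self._leaps = sorted(leap_dates)
        self._known_until = known_until
        if self._leaps and self._leaps[-1] >= known_until:
            raise ValueError("a listed leap lies beyond the knowledge threshold")

    @property
    def leaps(self):
        """Read-only snapshot of the table."""
        return tuple(self._leaps)

    @property
    def known_until(self):
        """Threshold of the unknown: no promises from this date onwards."""
        return self._known_until

    def extend(self, new_leaps, known_until):
        """Add newly announced leaps and move the threshold forward."""
        new_leaps = sorted(new_leaps)
        if known_until < self._known_until:
            raise ValueError("knowledge threshold cannot move backwards")
        if any(d < self._known_until or d >= known_until for d in new_leaps):
            raise ValueError("new leaps must fall in the newly covered period")
        self._leaps.extend(new_leaps)
        self._known_until = known_until


schedule = LeapSchedule([date(2015, 7, 1)], known_until=date(2017, 1, 1))
schedule.extend([date(2017, 1, 1)], known_until=date(2017, 7, 1))
```

Extending with an empty list is also meaningful: it records that a period has passed into the known region without any leap.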

-zefram

p6rt commented 8 years ago

From zefram@fysh.org

use v6;

# classes for representing times on particular time scales

class TaiTime {
    has FatRat:D $.linear1958 = 0.FatRat;
    method from_linear1958(TaiTime:U: FatRat:D $linear1958) {
        self.new(:$linear1958)
    }
    # use the type object's .perl here; self.perl would recurse forever
    multi method perl(TaiTime:D:) {
        "{self.WHAT.perl}.from_linear1958({$!linear1958.perl})"
    }
    method mjd(TaiTime:D:) { 36204 + ($!linear1958 / 86400) }
    method from_mjd(TaiTime:U: FatRat:D $mjd) {
        self.from_linear1958(($mjd - 36204) * 86400)
    }
    method mjdn(TaiTime:D:) { self.mjd.truncate }
    method mjdf(TaiTime:D:) { self.mjd % 1 }
    method mjdnf(TaiTime:D:) {
        my $mjd = self.mjd;
        return ($mjd.truncate, $mjd % 1);
    }
    method from_mjdnf(TaiTime:U: (Int:D $mjdn, FatRat:D $mjdf where $mjdf >= 0 && $mjdf < 1)) {
        self.from_mjd($mjdn + $mjdf)
    }
}

class UtcTime {
    has Int:D $.mjdn = 0;
    has FatRat:D $.mjdf = 0.FatRat;
    method mjdnf(UtcTime:D:) { ($!mjdn, $!mjdf) }
    method from_mjdnf(UtcTime:U: (Int:D $mjdn, FatRat:D $mjdf where $mjdf >= 0)) {
        self.new(:$mjdn, :$mjdf)
    }
    # use the type object's .perl here; self.perl would recurse forever
    multi method perl(UtcTime:D:) {
        "{self.WHAT.perl}.from_mjdnf(({$!mjdn.perl}, {$!mjdf.perl}))"
    }
    method mjd(UtcTime:D:) { $!mjdn + $!mjdf }
    method from_mjd(UtcTime:U: FatRat:D $mjd) {
        self.from_mjdnf(($mjd.truncate, $mjd % 1))
    }
}

# Gregorian calendar and 24-hour clock

sub year_leap(Int:D $y) { $y %% 4 && ($y !%% 100 || $y %% 400) }

sub year_days(Int:D $y) { year_leap($y) ?? 366 !! 365 }

my @month_length = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
sub month_days(Int:D $y, Int:D $m) {
    if $m == 2 {
        return year_leap($y) ?? 29 !! 28;
    } else {
        return @month_length[$m - 1];
    }
}

my @nonleap_monthstarts =
    (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365);
my @leap_monthstarts =
    (0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 366);
sub year_monthstarts(Int:D $y) {
    year_leap($y) ?? @leap_monthstarts !! @nonleap_monthstarts
}

sub ymd_from_mjdn(Int:D $mjdn) {
    my $d = $mjdn - -678941;
    my $qc = $d div (365 * 400 + 97);
    $d %= (365 * 400 + 97);
    my $y = $d div 366;
    my $leaps = ($y + 3) div 4;
    $leaps -= ($leaps - 1) div 25 unless $leaps == 0;
    $d -= 365 * $y + $leaps;
    my $yd = year_days($y);
    if $d >= $yd {
        $d -= $yd;
        $y++;
    }
    $d++;
    my @monthstarts = year_monthstarts($y);
    my $m = 1;
    while $d > @monthstarts[$m] {
        $m++;
    }
    return ($qc * 400 + $y, $m, $d - @monthstarts[$m - 1]);
}

sub ymd_to_mjdn((Int:D $y, Int:D $m where $m >= 1 && $m <= 12, Int:D $d where $d >= 1)) {
    my @monthstarts = year_monthstarts($y);
    my $md = @monthstarts[$m] - @monthstarts[$m - 1];
    $d <= $md or die "day number out of range";
    my $dd = @monthstarts[$m - 1] + $d - 1;
    my $qc = $y div 400;
    my $yy = $y % 400;
    my $leaps = ($yy + 3) div 4;
    $leaps -= ($leaps - 1) div 25 unless $leaps == 0;
    return -678941 + (365 * 400 + 97) * $qc + 365 * $yy + $leaps + $dd;
}

sub hms_from_mjdf(FatRat:D $mjdf where $mjdf >= 0) {
    my $hf = $mjdf * 24;
    my $h = $hf.truncate;
    if $h >= 24 {
        return (23, 59, ($mjdf * 86400) - 86340);
    } else {
        my $mf = ($hf % 1) * 60;
        my $m = $mf.truncate;
        return ($h, $m, ($mf % 1) * 60);
    }
}

sub hms_to_mjdf((Int:D $h where $h >= 0 && $h <= 23, Int:D $m where $m >= 0 && $m <= 59, FatRat:D $s where $s >= 0)) {
    ($h == 23 && $m == 59) || $s < 60 or die "seconds out of range";
    return ($h * 60 + $m).FatRat / 1440 + $s / 86400;
}

# leap second processing

my @known_leaps = <
    1972-06 1972-12 1973-12 1974-12 1975-12 1976-12 1977-12 1978-12 1979-12
    1981-06 1982-06 1983-06 1985-06 1987-12 1989-12 1990-12 1992-06 1993-06
    1994-06 1995-12 1997-06 1998-12 2005-12 2008-12 2012-06 2015-06 2016-12
>;
my $next_possible_leap = "2017-06";

sub leap_month_to_mjdn(Str:D $str) {
    (my $y, my $m) = $str.split("-").map({ .Int });
    $m++;
    if $m == 13 {
        $y++;
        $m = 1;
    }
    return ymd_to_mjdn(($y, $m, 1));
}

sub leap_month_to_tai(Str:D $str, Int:D $dtai) {
    return TaiTime.from_mjd(leap_month_to_mjdn($str) + FatRat.new($dtai, 86400));
}

my @utc_segs = ({
    start_utc_mjdn => leap_month_to_mjdn("1971-12"),
    start_tai => leap_month_to_tai("1971-12", 10).linear1958,
    dtai => 10,
},);
for @known_leaps {
    my $dtai = @utc_segs[*-1]<dtai> + 1;
    my $bound = leap_month_to_tai($_, $dtai).linear1958;
    @utc_segs[*-1]<end_tai> = $bound;
    my $end_utc_mjdn = leap_month_to_mjdn($_);
    @utc_segs[*-1]<end_utc_mjdn> = $end_utc_mjdn;
    @utc_segs.push: {
        start_utc_mjdn => $end_utc_mjdn,
        start_tai => $bound,
        dtai => $dtai,
    };
}
@utc_segs[*-1]<end_tai> =
    leap_month_to_tai($next_possible_leap, @utc_segs[*-1]<dtai>).linear1958;
@utc_segs[*-1]<end_utc_mjdn> = leap_month_to_mjdn($next_possible_leap);

my $lod_nominal_time = leap_month_to_tai("1819-12", -20);
my $lod_increase_rate = 0.0017;
my $xlod_at_threshold = $lod_increase_rate *
    (@utc_segs[*-1]<end_tai> - $lod_nominal_time.linear1958) /
    (86400*36525);

sub tai-ut1_for_tai(FatRat:D $lin) {
    my $tdiffdays = ($lin - @utc_segs[*-1]<end_tai>) / 86400;
    return @utc_segs[*-1]<dtai> +
        $xlod_at_threshold * $tdiffdays +
        $lod_increase_rate * $tdiffdays * $tdiffdays / (2 * 36525);
}

sub dtai_for_month(Int:D $mjdn) {
    my $dtai = 0;
    loop {
        my $lin = TaiTime.from_mjdnf(($mjdn, 0.FatRat)).linear1958 + $dtai;
        my $tai-ut1 = tai-ut1_for_tai($lin);
        my $new_dtai = ($tai-ut1 + 0.5).truncate;
        return $dtai if $new_dtai == $dtai;
        $dtai = $new_dtai;
    }
}

sub mjdn_start_month(Int:D $mjdn) {
    my $ymd = ymd_from_mjdn($mjdn);
    return ymd_to_mjdn(($ymd[0], $ymd[1], 1));
}

sub mjdn_next_month(Int:D $mjdn) {
    my $ymd = ymd_from_mjdn($mjdn);
    return $mjdn + month_days($ymd[0], $ymd[1]);
}

sub utc_from_tai_best_guess(TaiTime:D $tai) {
    my $lin = $tai.linear1958;
    if $lin < @utc_segs[0]<start_tai> {
        die "can't handle time prior to leap-seconds UTC";
    } elsif $lin < @utc_segs[*-1]<end_tai> {
        my $l = 0;
        my $r = @utc_segs.elems - 1;
        until $r == $l {
            my $t = ($l + $r) +> 1;
            if $lin < @utc_segs[$t]<end_tai> {
                $r = $t;
            } else {
                $l = $t + 1;
            }
        }
        my $dtai = @utc_segs[$l]<dtai>;
        my $utc_mjd = $tai.mjd - FatRat.new($dtai, 86400);
        my $utc_mjdnf = ($utc_mjd.truncate, $utc_mjd % 1);
        if $utc_mjdnf[0] == @utc_segs[$l]<end_utc_mjdn> {
            $utc_mjdnf = ($utc_mjdnf[0] - 1, $utc_mjdnf[1] + 1);
        }
        return UtcTime.from_mjdnf($utc_mjdnf);
    } else {
        my $tai-ut1 = tai-ut1_for_tai($lin);
        my $rough_mjd = $tai.mjd - FatRat.new(($tai-ut1 + 1).truncate, 86400);
        my $rough_mjdn = $rough_mjd.truncate;
        my $m0_mjdn = mjdn_start_month($rough_mjdn);
        my $m0_dtai = dtai_for_month($m0_mjdn);
        my $m0_start = TaiTime.from_mjd($m0_mjdn + FatRat.new($m0_dtai, 86400)).linear1958;
        loop {
            my $m1_mjdn = mjdn_next_month($m0_mjdn);
            my $m1_dtai = dtai_for_month($m1_mjdn);
            my $m1_start = TaiTime.from_mjd($m1_mjdn + FatRat.new($m1_dtai, 86400)).linear1958;
            if $lin < $m1_start {
                my $utc_mjd = $tai.mjd - FatRat.new($m0_dtai, 86400);
                my $utc_mjdnf = ($utc_mjd.truncate, $utc_mjd % 1);
                while $utc_mjdnf[0] >= $m1_mjdn {
                    $utc_mjdnf = ($utc_mjdnf[0] - 1, $utc_mjdnf[1] + 1);
                }
                return UtcTime.from_mjdnf($utc_mjdnf);
            }
            $m0_mjdn = $m1_mjdn;
            $m0_dtai = $m1_dtai;
            $m0_start = $m1_start;
        }
    }
}

sub utc_to_tai_best_guess(UtcTime:D $utc) {
    my $mjdnf = $utc.mjdnf;
    if $mjdnf[0] < @utc_segs[0]<start_utc_mjdn> {
        die "can't handle time prior to leap-seconds UTC";
    } elsif $mjdnf[0] < @utc_segs[*-1]<end_utc_mjdn> {
        my $l = 0;
        my $r = @utc_segs.elems - 1;
        until $r == $l {
            my $t = ($l + $r) +> 1;
            if $mjdnf[0] < @utc_segs[$t]<end_utc_mjdn> {
                $r = $t;
            } else {
                $l = $t + 1;
            }
        }
        my $dtai = @utc_segs[$l]<dtai>;
        if $mjdnf[1] >= 1 && $mjdnf[0] != @utc_segs[$l]<end_utc_mjdn> - 1 {
            die "given UTC time does not exist";
        }
        my $tai = TaiTime.from_mjd($mjdnf[0] + FatRat.new($dtai, 86400) + $mjdnf[1]);
        $tai.linear1958 < @utc_segs[$l]<end_tai>
            or die "given UTC time does not exist";
        return $tai;
    } else {
        my $m0_mjdn = mjdn_start_month($mjdnf[0]);
        my $m0_dtai = dtai_for_month($m0_mjdn);
        my $m1_mjdn = mjdn_next_month($m0_mjdn);
        if $mjdnf[1] >= 1 && $mjdnf[0] != $m1_mjdn - 1 {
            die "given UTC time does not exist";
        }
        my $m1_dtai = dtai_for_month($m1_mjdn);
        my $m1_start = TaiTime.from_mjd($m1_mjdn + FatRat.new($m1_dtai, 86400)).linear1958;
        my $tai = TaiTime.from_mjd($mjdnf[0] + FatRat.new($m0_dtai, 86400) + $mjdnf[1]);
        $tai.linear1958 < $m1_start or die "given UTC time does not exist";
        return $tai;
    }
}

    # ISO 8601 string format

    my regex decdig { <[0123456789]> }

    sub fatrat_from_str(Str:D $str) {
        /^(<decdig>+)(\.(<decdig>+))?$/.ACCEPTS($str)
            or die "malformed numeric string";
        return FatRat.new(
            ($0.Str ~ ($1 =:= Nil ?? "" !! $1[0].Str)).Int,
            10 ** ($1 =:= Nil ?? 0 !! $1[0].chars));
    }

    sub iso8601_from_y(Int:D $y) {
        sprintf($y < 0 || $y > 9999 ?? "%+05d" !! "%04d", $y)
    }

    sub iso8601_to_y(Str:D $str) {
        /^ ( <[-+]> <decdig>**4..* || <decdig>**4 ) $/.ACCEPTS($str) &&
            !/^\-0+$/.ACCEPTS($str) or die "malformed year string";
        return $str.Int;
    }

    sub iso8601_from_ymd((Int:D $y, Int:D $m, Int:D $d)) {
        iso8601_from_y($y) ~ sprintf("-%02d-%02d", $m, $d)
    }

    sub iso8601_to_ymd(Str:D $str) {
        /^(<[-+]>?<decdig>+)\-(<decdig>**2)\-(<decdig>**2)$/.ACCEPTS($str)
            or die "malformed date string";
        return (iso8601_to_y($0.Str), $1.Int, $2.Int);
    }

    sub iso8601_from_hms((Int:D $h, Int:D $m, FatRat:D $s)) {
        my $res = sprintf("%02d:%02d:%02d", $h, $m, $s.truncate);
        my $f = $s % 1;
        if $f != 0 {
            my $us = $f * 1000000;
            $res ~= sprintf(".%06d", $us.truncate);
            $us %% 1 or $res ~= "~";
        }
        return $res;
    }

    sub iso8601_to_hms(Str:D $str) {
        /^(<decdig>**2)\:(<decdig>**2)\:(<decdig>**2 (\.<decdig>+)?)$/\
            .ACCEPTS($str) or die "malformed time string";
        return ($0.Int, $1.Int, fatrat_from_str($2.Str));
    }

    sub iso8601_from_mjdn(Int:D $mjdn) {
        iso8601_from_ymd(ymd_from_mjdn($mjdn))
    }

    sub iso8601_to_mjdn(Str:D $str) {
        ymd_to_mjdn(iso8601_to_ymd($str))
    }

    sub iso8601_from_mjdf(FatRat:D $mjdf where $mjdf >= 0) {
        iso8601_from_hms(hms_from_mjdf($mjdf))
    }

    sub iso8601_to_mjdf(Str:D $str) {
        hms_to_mjdf(iso8601_to_hms($str))
    }

    sub iso8601_from_mjdnf((Int:D $mjdn, FatRat:D $mjdf where $mjdf >= 0)) {
        iso8601_from_mjdn($mjdn) ~ "T" ~ iso8601_from_mjdf($mjdf)
    }

    sub iso8601_to_mjdnf(Str:D $str) {
        /^ ((<decdig>||<[-+]>)+) <[tT]> ((<decdig>||<[:.]>)+) $/.ACCEPTS($str)
            or die "malformed date/time string";
        return (iso8601_to_mjdn($0.Str), iso8601_to_mjdf($1.Str));
    }

    # main program for test

    sub mjdnf_from_str(Str:D $str) {
        if /^<decdig>+(\.<decdig>+)?$/.ACCEPTS($str) {
            my $mjd = fatrat_from_str($str);
            return ($mjd.truncate, $mjd % 1);
        } else {
            return iso8601_to_mjdnf($str);
        }
    }

    for @*ARGS {
        /^(<-[\x20]>+) ' ' (:i (tai|utc))$/.ACCEPTS($_)
            or die "bad argument";
        my $mjdnf = mjdnf_from_str($0.Str);
        my $tai;
        my $utc;
        if /^(:i tai)$/.ACCEPTS($1.Str) {
            $tai = TaiTime.from_mjdnf($mjdnf);
            $utc = utc_from_tai_best_guess($tai);
        } else {
            $utc = UtcTime.from_mjdnf($mjdnf);
            $tai = utc_to_tai_best_guess($utc);
        }
        say "{iso8601_from_mjdnf($tai.mjdnf)} TAI = " ~
            "{iso8601_from_mjdnf($utc.mjdnf)} UTC";
    }

p6rt commented 8 years ago

From zefram@fysh.org

I was expecting this ticket to yield some statement about the design objectives of Instant.from-posix() and the related leap second code, but that hasn't happened yet, and it's beginning to look as though there isn't any firm objective. So I think it might be helpful to lay out the problem space.

The underlying question to ask when designing this kind of API is what class of use case it's trying to satisfy. There may be multiple use cases of interest -- there certainly will be over the module ecosystem as a whole -- and there may be a need for multiple versions of TAI<->UTC conversion to satisfy all the ones we're concerned with. For each use case we can look at what kind of requirements it places on the conversion functions, and for each possible conversion function we can look at what kind of requirements it can satisfy.

Given the unavoidable split in TAI<->UTC conversion, between the known and unknown regions of the leap schedule, each possible conversion function is going to have two distinct behaviours. A caller whose interests span both regions will get either behaviour, generally not knowing which it will get. For the caller to be satisfied by a conversion function, therefore, its needs must be satisfied by each behaviour individually. Conversely, a conversion function only really provides those guarantees that are common to both of its behaviours. It's a weakest-link deal.

Let's look at what the various discussed conversion semantics actually provide in this respect, to a caller spanning both regions:

A: correct answers for the known region, error for the unknown region. Doesn't guarantee to produce an answer, but does guarantee that any answer produced will be correct.

B: correct answers for the known region, estimate for the unknown region. Guarantees to produce a plausible estimate, not necessarily correct.

C: correct answers for the known region, presumption of no leap seconds in the unknown region (current behaviour). Guarantees to produce some answer, but with no guarantee of quality: the answer may be garbage.

You can see why I say that the current behaviour (C) sucks. It doesn't seem at all useful for any caller that might run into the unknown region.
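To make the contrast between A, B and C concrete, here is a minimal sketch in Python (not Raku, and not the rakudo implementation). The three-entry table uses real TAI-UTC offsets, but the `KNOWN_UNTIL` horizon, the function names, and above all the flat `ASSUMED_LEAPS_PER_YEAR` rate are assumptions for illustration only. A serious estimator for B would use something like the quadratic tidal-braking model and quantise to month boundaries, which this sketch does not attempt.

```python
from bisect import bisect_right

# (first POSIX second at which the offset applies, TAI-UTC in seconds)
TABLE = [
    (1341100800, 35),  # 2012-07-01
    (1435708800, 36),  # 2015-07-01
    (1483228800, 37),  # 2017-01-01
]
STARTS = [start for start, _ in TABLE]
OFFSETS = [offset for _, offset in TABLE]

# Hypothetical horizon: the schedule is treated as announced up to 2017-07-01.
KNOWN_UNTIL = 1498867200

SECONDS_PER_YEAR = 31557600      # Julian year
ASSUMED_LEAPS_PER_YEAR = 0.5     # illustrative placeholder, not a real model


class UnknownLeapSchedule(Exception):
    """The requested time lies beyond the known leap schedule."""


def _known_offset(posix):
    """Look up TAI-UTC in the table; later times reuse the last entry."""
    index = bisect_right(STARTS, posix) - 1
    if index < 0:
        raise ValueError("time precedes the toy table")
    return OFFSETS[index]


def offset_strict(posix):
    """Behaviour A: correct answer or an error, never a guess."""
    if posix >= KNOWN_UNTIL:
        raise UnknownLeapSchedule("leap schedule not yet determined")
    return _known_offset(posix)


def offset_estimate(posix):
    """Behaviour B: correct where known, labelled estimate elsewhere."""
    if posix < KNOWN_UNTIL:
        return _known_offset(posix)
    years = (posix - KNOWN_UNTIL) / SECONDS_PER_YEAR
    return OFFSETS[-1] + int(years * ASSUMED_LEAPS_PER_YEAR)


def offset_frozen(posix):
    """Behaviour C: silently presumes no further leap seconds."""
    return _known_offset(posix)
```

A caller spanning both regions can rely on `offset_strict` for correctness and on `offset_estimate` for plausibility. `offset_frozen` gives neither guarantee once the input passes the horizon, and gives the caller no way to tell that this has happened.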

But let's more rigorously look at this from the point of view of caller use cases. I see these possibilities for callers' requirements:

0. require correct answers, only operating on times up to 2015. This can be satisfied by a historical leap schedule baked into the implementation, and so is satisfied by any version that we've discussed, if the input is definitely so limited. To avoid accidents, however, the caller would probably like some checking that a correct answer can actually be produced, with error signalling where it can't (behaviour A).

1. require correct answers, only operating on times for which the leap schedule has been determined by the time the call is made (so only operating up to a few weeks into the future). This is not satisfied by a schedule baked into the implementation, but can in principle be satisfied by downloading more schedule at runtime. We haven't discussed this here, but [perl #128752] has touched on it. As with case 0, error signalling where the input exceeds the intended limits is desirable.

2. require correct answers, including well into the future. This cannot be satisfied by any means short of waiting until the times of interest are no longer far in the future. Any application with this requirement has a serious design problem, which cannot be solved by any kind of cleverness in its libraries. It has to be addressed by redesigning the application.

3. require answers to be correct, but don't need to always get an answer. This requires basically conversion behaviour A, erroring on the unknown region. Optionally there could be some downloading of new leap schedule beyond what's baked into the implementation, but erroring is definitely required when that is exceeded.

4. require answers to be correct where the implementation can easily do that, and otherwise require answers consistent with what other versions of the code produce. Since some future version of the implementation will know the real leap schedule for whatever time is being asked about, being consistent with that requires producing the correct answer for all times, including those years in the future. This therefore reduces to the impossible case 2.

5. require plausible estimates. This requires basically conversion behaviour B. As with case 3, there could optionally be some downloading of new leap schedule, but that can be exceeded and so some estimation behaviour is necessary. Strictly speaking it's not necessarily required to use the actual historical leap schedule at all, but the plausibility of answers that contradict the history is low.

6. need to get answers, but have no quality requirement on what the answers are. This doesn't require the use of any estimation for the unknown region, and equally doesn't require the use of any historical leap schedule for the known region. This case can be satisfied by much simpler code that doesn't know anything about leap seconds. An application declaring this requirement isn't really requiring any form of TAI<->UTC conversion, and would be better off using its TAI or UTC times unconverted, rather than pretending that it's doing a conversion.

So I see needs for conversion behaviours A and B, but behaviour C is overcomplicated for the only use case that it really satisfies (6).
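Cases 1 and 3 both depend on the known region being able to grow at runtime while lookups beyond it still fail. A rough Python sketch of that shape follows. The class and method names are hypothetical, the validity horizons in the usage lines are invented for illustration, and how newer schedule data is actually obtained is deliberately left out.

```python
class LeapScheduleExceeded(Exception):
    """The requested time lies beyond the currently known schedule."""


class LeapSchedule:
    def __init__(self, entries, valid_until):
        # entries: (start_posix, tai_minus_utc) pairs
        # valid_until: POSIX time (exclusive) up to which the schedule is known
        self.entries = sorted(entries)
        self.valid_until = valid_until

    def extend(self, new_entries, new_valid_until):
        """Merge newer schedule data obtained at runtime."""
        if new_valid_until < self.valid_until:
            raise ValueError("refusing to shrink the known region")
        merged = dict(self.entries)
        merged.update(new_entries)
        self.entries = sorted(merged.items())
        self.valid_until = new_valid_until

    def tai_minus_utc(self, posix):
        """Return TAI-UTC for a known time, or raise if it is not known."""
        if posix >= self.valid_until:
            raise LeapScheduleExceeded("beyond the known leap schedule")
        offset = None
        for start, value in self.entries:
            if start <= posix:
                offset = value
        if offset is None:
            raise ValueError("time precedes the table")
        return offset


# Baked-in schedule, with a hypothetical horizon of 2016-07-01.
schedule = LeapSchedule([(1341100800, 35), (1435708800, 36)],
                        valid_until=1467331200)

# Later data adds the 2017-01-01 leap and moves the horizon to 2017-07-01.
schedule.extend([(1483228800, 37)], new_valid_until=1498867200)
```

Before the `extend` call a query for 2017-01-01 raises `LeapScheduleExceeded`; afterwards it returns 37. A query past the new horizon still raises, which is the part that cases 1 and 3 insist on.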

Over to you: what use cases are Instant.from-posix() and friends intended to satisfy?

-zefram
