BurntSushi / jiff

A date-time library for Rust that encourages you to jump into the pit of success.
The Unlicense
1.63k stars 25 forks source link

Fraction computation with large times produces strange timestamps which are not printed sanely #59

Closed addisoncrump closed 1 month ago

addisoncrump commented 1 month ago

The following testcases trigger the observed behaviour:

pT026733320132.65M
PT834662292.2H
PT22444444452,6m
Pt843517081,1H

Looking a bit further, it seems that the routine for parsing fractional time is a bit strange in the presence of large values. Specifically (for the case of minutes), this code region:

if minutes > t::SpanMinutes::MAX_SELF {
    nanos +=
        (minutes - t::SpanMinutes::MAX_SELF) * t::NANOS_PER_MINUTE;
    minutes = t::NoUnits128::rfrom(t::SpanMinutes::MAX_SELF);
}

This code is explained by the following text:

The above is why we have `if unit_value > MAX { <do adjustments> }` in
the balancing code below. Basically, if we overshoot our limit, we back
out anything over the limit and carry it over to the lesser units. If
our value is truly too big, then the final call to set nanoseconds will
fail.

It is unclear to me why the units would be backfilled to the smallest unit. Shouldn't this be implemented in the other direction, preferring larger units for larger values?

Regardless, the result is that the nanos value is set to something large. For the first testcase, the resulting computed timestamp is printed as PT10518456960m972891790359s. This value is correct, but not parseable because the seconds unit is out of range:

>>> int(26733320132.65*60)
1603999207959
>>> 10518456960*60+972891790359
1603999207959

Note that in memory, this value is actually:

Span { minutes: 10518456960, seconds: 631107417600, milliseconds: 341784372759000, .. }

So it would appear that this issue is in part induced by the printer implementation.

This was caught by #57 because the printer emitted invalid timestamps.

addisoncrump commented 1 month ago

To be clear, I think part of the problem here is that the printed duration now reads quite strangely. 10518456960 minutes AND 972891790359s? This is quite unintuitive, since I would never expect to need to compute the actual number of minutes based on the number of seconds as well. In other words, I think it is confusing that the seconds value ever is greater than or equal to 60 when minutes are presented, and similarly for other units.

BurntSushi commented 1 month ago

Catching up here, here is a concrete program that fails that I believe should succeed:

use jiff::{Span, Unit};

fn main() -> anyhow::Result<()> {
    let span1: Span = "Pt843517081,1H".parse()?;
    let iso = span1.to_string();
    eprintln!("span1 serialized: {iso:?}");
    let span2: Span = iso.parse()?;
    assert_eq!(span1.total(Unit::Hour)?, span2.total(Unit::Hour)?);

    Ok(())
}

Output:

$ cargo -q r
span1 serialized: "PT175307616h10518456960m1774446656760s"
Error: failed to parse ISO 8601 duration string into `Span`: failed to set value 1774446656760 as second unit on span: parameter 'seconds' with value 1774446656760 is not in the required range of -631107417600..=631107417600

So this looks to me like there is a real bug here. Nice find! I'll investigate more later.

But yes, there are some shenanigans happening here:

But I can't investigate more closely right now. I'll do so later.