unicode-org / icu4x

Solving i18n for client-side and resource-constrained environments.
https://icu4x.unicode.org
Other
1.39k stars 178 forks source link

Debug assertion failures with ECMA-262 Temporal.PlainDate minimum and maximum values #4917

Open anba opened 6 months ago

anba commented 6 months ago

There are assertion failures when testing the minimum and maximum allowed Temporal.PlainDate values for the Chinese, Dangi, IslamicObservational, and IslamicUmmAlQura calendars.

I don't know if #4904 will fix these issues, too.

(Tested with release 1.4.)

use icu::calendar::Date;
use icu::calendar::Calendar;

use icu::calendar::chinese::Chinese;
use icu::calendar::dangi::Dangi;
use icu::calendar::islamic::{IslamicObservational, IslamicUmmAlQura};

fn main() {
  // Minimum and maximum dates allowed in ECMA-262 Temporal.
  let min_date_iso = Date::try_new_iso_date(-271821, 4, 19).unwrap();
  let max_date_iso = Date::try_new_iso_date(275760, 9, 13).unwrap();

  // Assertion failure:
  // > Month should be in range of u8! Value -1428 failed for RD RataDie(-99280838)
  {
    let cal = Chinese::new_always_calculating();
    let min_date = cal.date_from_iso(min_date_iso);
    let max_date = cal.date_from_iso(max_date_iso);

    println!("min.year = {}, max.year = {}", cal.year(&min_date).number, cal.year(&max_date).number);
  }

  // Assertion failure:
  // > Month should be in range of u8! Value -1428 failed for RD RataDie(-99280838)
  {
    let cal = Dangi::new_always_calculating();
    let min_date = cal.date_from_iso(min_date_iso);
    let max_date = cal.date_from_iso(max_date_iso);

    println!("min.year = {}, max.year = {}", cal.year(&min_date).number, cal.year(&max_date).number);
  }

  // Assertion failure:
  // > assertion failed: Date::try_new_observational_islamic_date(y, m, d, IslamicObservational::new_always_calculating()).is_ok()
  {
    let cal = IslamicObservational::new_always_calculating();
    let min_date = cal.date_from_iso(min_date_iso);
    let max_date = cal.date_from_iso(max_date_iso);

    println!("min.year = {}, max.year = {}", cal.year(&min_date).number, cal.year(&max_date).number);
  }

  // Assertion failure:
  // > assertion failed: Date::try_new_ummalqura_date(y, m, d, IslamicUmmAlQura::new_always_calculating()).is_ok()
  {
    let cal = IslamicUmmAlQura::new_always_calculating();
    let min_date = cal.date_from_iso(min_date_iso);
    let max_date = cal.date_from_iso(max_date_iso);

    println!("min.year = {}, max.year = {}", cal.year(&min_date).number, cal.year(&max_date).number);
  }
}
anba commented 5 months ago

Still reproduces with release 1.5, except that the assertions are changed:

Chinese/Dangi:

calendrical_calculations-0.1.1/src/chinese_based.rs:571:9:
assertion failed: diff == 29 || diff == 30

IslamicObservational:

calendrical_calculations-0.1.1/src/islamic.rs:92:21:
(ObservationalIslamic) Found year -280804 AH with month length 26 for month 1!

IslamicUmmAlQura:

calendrical_calculations-0.1.1/src/islamic.rs:59:17:
(SaudiIslamic) Found year -280804 AH with length 319!
sffc commented 5 months ago

Yeah, I managed to get these to work for much larger ranges (thousands of years) in #4904, but dates that far away (hundreds of thousands of years) still have precision issues.

A few potential paths forward:

  1. Implement more workarounds similar to those in #4904 but for the full Temporal range. This would basically mean constructing year/month cutoffs that conform to expectations (29/30 day months) but aren't based on the calendrical calculations directly.
  2. Make these calendars fall back to Gregorian years and eras beyond a certain range, such as ±10000 years
  3. Make the Islamic calendars fall back to Civil outside the exact range, and make Chinese fall back to some algorithmic metonic cycle approximation outside the exact range
sffc commented 5 months ago

@Manishearth @roozbehp

Manishearth commented 5 months ago

Can we have these assertions not fail outside of ranges where we expect the calendar to be well-behaved?

sffc commented 5 months ago

Start time: 11:05

Options:

  1. Try to make calendrical calculations work for large years
  2. Switch to Gregorian for out of bounds dates
  3. Switch to some hacky metonic cycle 3a. Do this in a way that no longer requires runtime code (ship more data + metonic hacks) . This would ship all positive Chinese/Hijri dates up to 1k years in the future, and after that use some metonic cycle.
  4. Turn off internal calendar invariants for large years. Declare that the calendar does not have Temporal continuity invariants beyond a certain range, and fail tests there. Ask temporal to change the spec.

Proposal: Option 3 or 3a with data size bikeshed tbd after implementation. Temporal issues to be filed upstream motivated by option 4. Ask for Temporal to not be handwavy about this.

Signoff: @sffc @manishearth @nekevss @echeran (@hsivonen?) (@robertbastian?)

Manishearth commented 5 months ago

Manish to file Temporal issue.

sffc commented 5 months ago

Related to option 3a: might be worth investigating whether Observational and Saudi islamic could share the data cache. Maybe not, because we're already extremely efficient, with 2 bytes per year per calendar.

Manishearth commented 5 months ago

I think that's unlikely but it's possible that there's an efficient way to represent their overlap.

sffc commented 5 months ago

Thanks @Manishearth for filing https://github.com/tc39/proposal-temporal/issues/2869

I found the Temporal invariants that lead to my position that we should enforce certain arithmetic behaviors in calendars we export. See my comment in the above issue.

sffc commented 5 months ago

@hsivonen - What does ICU4C do? @Manishearth - ICU4C doesn't have these assertions. I imagine it might fail the tests.

Evidence that ICU4C has similar problems but just doesn't have code assertions:

https://unicode-org.atlassian.net/browse/ICU-10866

anba commented 2 months ago

The assertion Year offset too big to store can also be triggered when using years too far into the future or past:

let cal = Chinese::new_always_calculating();
let era = Era::from_str("chinese").unwrap();
let month_code = MonthCode::from_str("M01").unwrap();
let date = cal.date_from_codes(era, 21206, month_code, 1).unwrap();