unicode-org / icu4x

Solving i18n for client-side and resource-constrained environments.
https://icu4x.unicode.org

Incorrect start date for Meiji era #4892

Open anba opened 1 month ago

anba commented 1 month ago

The start date of the Meiji era is incorrectly reported as September 8, 1868 (Gregorian calendar). I guess the start date from CLDR is incorrectly assumed to be in the Gregorian calendar, even though in 1868 a lunisolar calendar derived from the Chinese calendar was still in use in Japan.

(Tested with release 1.4.)

use icu::calendar::{Date, Ref};
use icu::calendar::japanese::Japanese;

fn main() {
  let cal = Japanese::new();
  let cal = Ref(&cal);

  // Meiji era started October 23, 1868. [1]
  //
  // October 23, 1868 is Meiji 1, 8th day of the 9th month in the Japanese
  // calendar.
  //
  // The seireki system was only introduced in 1873, i.e. after the start of
  // the Meiji era. [2]
  //
  // January 25, 1868 is possibly also a valid start date of the era [3].
  //
  // [1] https://en.wikipedia.org/wiki/Meiji_(era)
  // [2] https://en.wikipedia.org/wiki/Japanese_calendar#Gregorian_Calendar_(seireki)
  // [3] https://en.wikipedia.org/wiki/Kei%C5%8D#Events_of_the_Kei%C5%8D_era

  let date_iso = Date::try_new_iso_date(1868, 9, 8).unwrap();
  let date = Date::new_from_iso(date_iso, cal);

  // Era incorrectly reported as "meiji", but should instead be "ce".
  dbg!(date.year().era);

  let date_iso = Date::try_new_iso_date(1868, 9, 7).unwrap();
  let date = Date::new_from_iso(date_iso, cal);

  // Era correctly reported as "ce".
  dbg!(date.year().era);
}

sffc commented 1 month ago

UTS 35 says:

Era start or end dates are specified in terms of the equivalent proleptic Gregorian date (in "y-M-d" format). Eras may be open-ended, with unspecified start or end dates.

https://cldr-smoke.unicode.org/spec/main/ldml/tr35-dates.html#Supplemental_Calendar_Data

The CLDR data:

              <era type="232" start="1868-9-8" code="meiji"/>

So this might be an issue in CLDR.
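
To make the effect of the encoded start date concrete, here is a minimal sketch that does not use ICU4X at all. The `Ymd` alias and the `era_for` helper are hypothetical names invented for illustration, and the sketch only models the single boundary between "ce" and "meiji" (later eras are ignored). It assumes, as UTS 35 states, that era starts are compared as proleptic Gregorian dates, and it takes October 23, 1868 as the Gregorian equivalent of Meiji 1, 9th month, 8th day, as given in the report above.

```rust
// Hypothetical illustration only: this is not ICU4X code.
// Dates are (year, month, day) in the proleptic Gregorian calendar.
type Ymd = (i32, u8, u8);

/// Returns the era code for `date`, given the start date assumed for Meiji.
/// Everything before that start is reported as "ce" in this sketch.
fn era_for(date: Ymd, meiji_start: Ymd) -> &'static str {
    // Tuples compare lexicographically: year first, then month, then day.
    if date >= meiji_start {
        "meiji"
    } else {
        "ce"
    }
}

fn main() {
    // Start date as currently encoded in the CLDR data quoted above.
    let cldr_start: Ymd = (1868, 9, 8);
    // Assumed Gregorian equivalent of the lunisolar date, per the report.
    let converted_start: Ymd = (1868, 10, 23);

    let dates: [Ymd; 4] = [(1868, 9, 7), (1868, 9, 8), (1868, 10, 22), (1868, 10, 23)];
    for date in dates {
        println!(
            "{:?}: cldr start -> {}, converted start -> {}",
            date,
            era_for(date, cldr_start),
            era_for(date, converted_start)
        );
    }
}
```

With the CLDR value, every date from September 8 through October 22, 1868 lands in "meiji"; with the converted value those dates stay in "ce", which is the behavior the report expects.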

sffc commented 1 month ago

There is already a CLDR issue about it:

https://unicode-org.atlassian.net/browse/CLDR-11375