Open sffc opened 1 year ago
Eras should define two things:
- An epoch day
- A formula for converting from the (eraYear, monthCode, day) triple to a number of days relative to that epoch
I'd rather have a definition that is less tied to the particular variant, perhaps along the lines of the following. It is motivated by the sharing of era starts among Islamic calendars. That is, as I recall, islamic, islamic-civil, islamic-rgsa, islamic-tbla, and islamic-umalqura all have the same eras, so the identifier should be the same.
Discussion with @louis-aime @manishearth
"bc" can be used to construct a date in the gregorian calendar.
For the 'roc' calendar, you need two different era codes, one for the "roc" era and one for the pre-roc era. The mapping is:
Gregorian year | era in "roc" calendar | eraYear in "roc" calendar |
---|---|---|
1909 | "pre-roc" | 3 |
1910 | "pre-roc" | 2 |
1911 | "pre-roc" | 1 |
1912 | "roc" | 1 |
1913 | "roc" | 2 |
... | ... | ... |
2022 | "roc" | 111 |
2023 | "roc" | 112 |
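For illustration, the mapping in the table could be sketched roughly as follows. The function names and the constant are hypothetical, and the sketch assumes the backward-counting era has no year zero, as the rows for 1911 and 1912 suggest.

```javascript
// Hypothetical helper: map a Gregorian year onto the two "roc" eras,
// following the table above (1912 is roc 1, 1911 is pre-roc 1).
const ROC_EPOCH_YEAR = 1912;

function gregorianYearToRoc(gregorianYear) {
  if (gregorianYear >= ROC_EPOCH_YEAR) {
    return { era: "roc", eraYear: gregorianYear - ROC_EPOCH_YEAR + 1 };
  }
  // The backward-counting era has no year zero: 1911 -> 1, 1910 -> 2, ...
  return { era: "pre-roc", eraYear: ROC_EPOCH_YEAR - gregorianYear };
}

// Inverse direction, under the same assumptions.
function rocToGregorianYear({ era, eraYear }) {
  if (era === "roc") return eraYear + ROC_EPOCH_YEAR - 1;
  if (era === "pre-roc") return ROC_EPOCH_YEAR - eraYear;
  throw new RangeError(`unknown era for roc calendar: ${era}`);
}
```

Round-tripping any year through both functions should return the original year, which is a cheap sanity check for this kind of era table.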
There is existing data in:
We could encode this in a way such as
<calendar type="ethiopic">
<calendarSystem type="other"/>
<eras>
<era type="0" end="8-08-28" name="ethioaa" aliases="mundi"/>
<era type="1" start="8-08-29" name="ethiopic" aliases="incar incarnation"/>
</eras>
</calendar>
<calendar type="ethiopic-amete-alem">
<eras>
<era type="0" end="-5492-08-29" name="ethioaa" aliases="mundi"/>
</eras>
<!-- Not sure if we want this: -->
<eraInputs>
<eraInput name="ethiopic"/>
</eraInputs>
</calendar>
For Japanese:
<calendar type="japanese">
<calendarSystem type="solar" />
<eras>
<era type="237" end="0-12-31" />
<era type="238" start="1-01-01" />
<era type="0" start="645-6-19"/>
<era type="1" start="650-2-15"/>
<era type="2" start="672-1-1"/>
<!-- ... -->
<era type="235" start="1989-1-8"/>
<era type="236" start="2019-5-1"/>
</eras>
</calendar>
Need to check if this is well-formed. Constraints we're aware of:
Also we tentatively settled on all calendars having the same set of eras for input and output, except that input admits aliases (so, input accepts more codes, but the same underlying eras).
For example, the "ethiopic" era is not permitted as an input to the "ethioaa" calendar (even though the "ethioaa" era will be one of the allowed eras as input to "ethiopic").
Furthermore, input will accept out-of-range values and normalize them (e.g. gregory 2020 in the Japanese calendar is accepted and normalized to reiwa 2).
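A rough sketch of that normalization, assuming only the two most recent era boundaries from the Japanese data above (types 235 and 236, labeled "heisei" and "reiwa" here for readability). The function and table names are hypothetical, and older eras are left out.

```javascript
// Era start dates copied from the <era> entries above (types 235 and 236).
// The codes "heisei" and "reiwa" are labels chosen for this sketch.
const JAPANESE_ERAS = [
  { code: "heisei", start: [1989, 1, 8] },
  { code: "reiwa", start: [2019, 5, 1] },
];

// Compare two [year, month, day] triples; negative when a is earlier than b.
function compareYmd(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Hypothetical normalizer: accepts a date labeled with the "gregory" era and
// returns the era/eraYear an implementation might report for it.
function normalizeToJapaneseEra({ era, eraYear, month, day }) {
  if (era !== "gregory") throw new RangeError(`unsupported input era: ${era}`);
  const ymd = [eraYear, month, day];
  // Walk from the newest era to the oldest and pick the first that has started.
  for (let i = JAPANESE_ERAS.length - 1; i >= 0; i--) {
    const { code, start } = JAPANESE_ERAS[i];
    if (compareYmd(ymd, start) >= 0) {
      return { era: code, eraYear: eraYear - start[0] + 1, month, day };
    }
  }
  // Dates before the oldest era in this shortened table are returned unchanged.
  return { era, eraYear, month, day };
}
```

With this table, a date in gregory 2020 comes back as reiwa 2, and 30 April 2019 still falls in the previous era.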
The cases where a calendar will use eras named by another calendar are:
There was some disagreement regarding BCP-47 in today's CLDR call. The committee generally feels that "if we can make the canonical IDs BCP-47, we should." @FrankYFTang pointed out that if the Japanese emperor creates an era with a single Kanji, it might be only 2 letters long and not BCP-47. In my opinion, in the fairly unlikely case this happens, the problem is easy enough to solve by fiddling with the identifier, like padding it with 0s, and the unpadded version can be added as an alias, which doesn't need to be BCP-47. Note that the japanese era code doc proposed era codes like "showa-1312", which is BCP-47-friendly. An interesting question is whether to keep the year always at 4 digits or let it slide down to 3 digits (which is technically BCP-47 but maybe undesirable from that standpoint).
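To make the length concern concrete, here is a minimal sketch of a check for "BCP-47-friendly" era codes. It assumes the relevant constraint is that every hyphen-separated subtag is 3 to 8 ASCII letters or digits; the helper names and the zero-padding fix-up are hypothetical.

```javascript
// Assumption: an era code is BCP-47-friendly when every hyphen-separated
// subtag is 3 to 8 lowercase ASCII letters or digits.
const SUBTAG = /^[a-z0-9]{3,8}$/;

function isBcp47FriendlyEraCode(code) {
  return code.split("-").every((subtag) => SUBTAG.test(subtag));
}

// Hypothetical fix-up: pad short subtags with trailing zeros, as floated above.
function padEraCode(code) {
  return code
    .split("-")
    .map((subtag) => subtag.padEnd(3, "0"))
    .join("-");
}
```

Under that assumption "showa-1312" passes, a two-letter code fails, and its padded form passes, with the unpadded form kept as an alias.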
(not such that it blocks any changes here) is it entirely outside the realm of possibility for Ecma to send a formal request to the Japanese emperor to ensure that future eras conform to BCP-47?
I just discovered this repo. Glad we're specifying this stuff! A few notes:
Eras should define two things:
- An epoch day
- A formula for converting from the (eraYear, monthCode, day) triple to a number of days relative to that epoch
In the current Temporal polyfill, there is additional metadata for each era which may (or may not) be relevant to the work here in this repo, so sharing here in case it's useful:
buddhist in CLDR, but if the Hindu calendar is supported later then it also has a year zero).
bce era)
eraYear and year.
eraYear is permitted to be negative.
Do you mean "permitted for input"? "Exposed as the eraYear property"? Something else?
Also, what does it mean to provide a negative eraYear for eras that count time backwards? Is {era: 'bce', eraYear: -10} a valid input? Regardless of the answer, this should be specified.
Eras should be named as follows:
- For eras counting forwards: the BCP-47 ID of the canonical calendar for that era
- For eras counting backwards: pre- followed by the BCP-47 ID
I assume that bce is an exception to this rule?
*2 Dates in the incarnation era are expressed with ethiopic; dates prior to that are expressed in ethioaa. Dates prior to the creation of the earth are expressed as negative numbers in ethioaa.
Probably needs a bit more explanation for this calendar.
Related: https://github.com/tc39/ecma402/issues/534 (which should be migrated to this repo?)
*3 Note that input dates can be labeled as either gregory or julian and they will be interpreted correctly, even if the change date is unknown by the caller.
What will be the anchor era for this calendar? In other words, for {year: 1, month: 1, day: 1}, which era will be { eraYear: 1, month: 1, day: 1 } on the same day?
Also, when formatting a localized date in this calendar, how will it know which era to use in the output? Will you be adding an option to the DateTimeFormat constructor to specify the switchover date? Will the switchover date be inferred by locale?
And is there a plan to also have a plain julian calendar too? Or just julian-gregory?
If there's a separate issue or proposal about Julian calendars, what is it?
iso8601 | iso8601 *4
Interesting. So you're proposing to support eras for all calendars, not just those that actually use eras? (Meaning they have more than one.) I can see pros and cons of this approach, but I'm curious to hear from you about what you think.
Also, it sounds like you're proposing that calendars that don't use eras will have a single era whose name matches the calendar's name. Is that correct? If so, then you may want to simplify your table to just list the calendars that use eras, and note that the other calendars just have a single era with the calendar name.
BTW, if every calendar uses eras then presumably era/eraYear processing will be identical in 262 vs. 402 which might simplify the spec a bit.
Calendar ID | Eras
Where are era aliases specified? Should these be included in this table?
There are 2 types of calendars with eras: those with a variable number of eras, and those with a fixed number. Japanese is the one with a variable number of eras, and Manish has already written a nice spec for that. Therefore, in the rest of this post, I am focused on calendars with a fixed number of eras.
This is a good way to differentiate. Another important difference: Japanese is the only CLDR calendar with more than 2 eras. So "fixed" here doesn't only mean "doesn't change", it also means 1 or 2.
A few more things:
Will there be an enumeration API to find the eras for each calendar? If yes:
*3 Note that input dates can be labeled as either gregory or julian and they will be interpreted correctly, even if the change date is unknown by the caller.
This calendar is interesting because eras are overlapping. In all other calendars, the first day of an era is one day later than the last day of the previous era. But not so for julian vs. gregory eras. Ditto for the epoch date of eras: in all other calendars, the epoch date is fixed, but in this calendar it's necessarily moveable. What implications for userland code (and for implementations?) arise from breaking those two invariants that otherwise hold?
It also made me wonder if the gregory calendar should accept a julian era for input as well.
{ era: "bce", eraYear: -10 } should be equivalent to { era: "ce", eraYear: 9 }
The era code bce is being proposed as an alias to the canonical name pre-gregory
The julian-gregory calendar is not specified yet, and won't be specified in the initial release, but my initial reaction is that:
2. The switchover date is a field specified in the calendar constructor, and I wouldn't be opposed to putting it in the calendar ID, like julian-gregory-16500101 for a hypothetical 1650-01-01 switchover date
I think there won't be julian because it is not in CLDR; there is coptic instead.
I gave every CLDR calendar an identity era. I included iso8601 because it was in the CLDR table, but we could say that iso8601 is the exception. Actually we could say that cyclic calendars are an exception, too.
Haven't thought about era code enumerations.
Did I leave any loose ends?
The era code bce is being proposed as an alias to the canonical name pre-gregory
I empathize with the goal of consistency, but for a feature like eras that most developers will almost never use, it seems better to go with more familiar names for canonical eras. This will make code more self-describing so that developers won't have to open up MDN to figure out what's going on in code they're reading.
For example, imagine an educational web app that displays BCE dates in a different color so that they're not easily confused by its high-school-aged users. The following code is probably easy to understand for most programmers:
const date = originalDate.withCalendar('gregory');
const dateColor = date.era === 'bce' ? 'red' : 'black';
However the following code will probably require most developers to read the docs to figure out what's going on.
const date = originalDate.withCalendar('gregory');
const dateColor = date.era === 'before-gregory' ? 'red' : 'transparent';
Does 'before-gregory' mean "not Julian"? Something else? Is the year 400 AD "before-gregory" because it's before the Gregorian transition?
Here's a suggestion for another way to think about naming canonical era identifiers:
Canonical identifiers for eras should follow the following guidelines:
gregory calendar, bce and ce should be canonical because the former is more commonly used in academic and scientific writing and the acronyms. bc and ad should be aliases because they are also commonly used. Other names like anno-domini or common-era should not be used because those names are rarely used, compared to their acronyms.
chinese and iso8601 don't use eras in common usage. For those calendars, the canonical forward-counting era identifier should be the BCP-47 ID of the canonical calendar for that era.
before- prepended to the canonical identifier of the later era.
I gave every CLDR calendar an identity era. I included iso8601 because it was in the CLDR table, but we could say that iso8601 is the exception. Actually we could say that cyclic calendars are an exception, too.
This seems like a reasonable approach to me. It certainly makes things more consistent across calendars which seems like a good thing. If we are going to do this, then a normative PR in March seems like a good idea to leverage this consistency in the 262 spec. FYI @ptomato.
2. The switchover date is a field specified in the calendar constructor, and I wouldn't be opposed to putting it in the calendar ID, like julian-gregory-16500101 for a hypothetical 1650-01-01 switchover date
Would there be a fixed list of dates supported? If not, then how would enumeration of calendars work?
For a julian-gregory calendar, I'd strongly suggest it be fully spec-ed out before this proposal is finalized, because it seems to behave differently from all other CLDR calendars and runs the risk of breaking invariants we might rely on elsewhere.
I think there won't be julian because it is not in CLDR; there is coptic instead.
Given that we fairly frequently hear requests for a julian calendar, I wonder if julian should be an alias for coptic?
Would there be a fixed list of dates supported? If not, then how would enumeration of calendars work?
For a julian-gregory calendar, I'd strongly suggest it be fully spec-ed out before this proposal is finalized, because it seems to behave differently from all other CLDR calendars and runs the risk of breaking invariants we might rely on elsewhere.
I'm pretty sure the scheme here covers all potential designs of such a calendar. You can have a fully flexible julian-gregory calendar where the switchover is set at runtime or from locale data, and still work in this scheme where you basically choose to return a date in the julian or gregory era based on context. The only thing that changes is the canonical output era for a given date.
Given that we fairly frequently hear requests for a julian calendar, I wonder if julian should be an alias for coptic?
Absolutely not: the Julian calendar is completely different, and the Julian epoch is not the Coptic epoch. The calendars have the same period for the year, but they do not share a notion of months or era epochs, nor do the years start on the same day -- they are not fully synchronized/aligned.
Absolutely not: the Julian calendar is completely different, and the Julian epoch is not the Coptic epoch.
Makes sense; I'm unfamiliar with Julian so didn't know. In that case then @sffc could you explain your comment above: "I think there won't be julian because it is not in CLDR; there is coptic instead."? I assumed you meant that Julian was equivalent to Coptic, but now I'm not sure what you meant. :-)
I'm pretty sure the scheme here covers all potential designs of such a calendar.
I was thinking a bit more broadly: given that all other calendars have static identifiers and static eras, it'd be understandable that we'd assume those invariants. But this calendar (depending on its design) would break those assumptions. Seems like it'd make sense to do more work on the design of such a calendar in order to determine answers to questions like:
I meant that Julian is not widely used these days but Coptic is, and I think some people who say they want Julian may actually want Coptic (no citation for that claim).
Makes sense, thanks for clarifying. FWIW, in the issues and comments filed in the Temporal repo, the only currently-unsupported calendar that's come up a lot has been Julian. Obviously a non-random sample of GitHub-commenting calendar enthusiasts isn't enough to drive the roadmap, but it's a data point that suggests there may be interest in that calendar.
Another interesting thing: Java's GregorianCalendar class is actually a Julian/Gregorian hybrid:
GregorianCalendar is a hybrid calendar that supports both the Julian and Gregorian calendar systems with the support of a single discontinuity, which corresponds by default to the Gregorian date when the Gregorian calendar was instituted (October 15, 1582 in some countries, later in others). The cutover date may be changed by the caller by calling setGregorianChange().
Historically, in those countries which adopted the Gregorian calendar first, October 4, 1582 (Julian) was thus followed by October 15, 1582 (Gregorian). This calendar models this correctly. Before the Gregorian cutover, GregorianCalendar implements the Julian calendar. The only difference between the Gregorian and the Julian calendar is the leap year rule. The Julian calendar specifies leap years every four years, whereas the Gregorian calendar omits century years which are not divisible by 400.
GregorianCalendar implements proleptic Gregorian and Julian calendars. That is, dates are computed by extrapolating the current rules indefinitely far backward and forward in time. As a result, GregorianCalendar may be used for all years to generate meaningful and consistent results. However, dates obtained using GregorianCalendar are historically accurate only from March 1, 4 AD onward, when modern Julian calendar rules were adopted. Before this date, leap year rules were applied irregularly, and before 45 BC the Julian calendar did not even exist.
Prior to the institution of the Gregorian calendar, New Year's Day was March 25. To avoid confusion, this calendar always uses January 1. A manual adjustment may be made if desired for dates that are prior to the Gregorian changeover and which fall between January 1 and March 24.
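The leap year rule described in that quote can be written down in a few lines. This is a sketch in JavaScript rather than Java, with hypothetical function names, using proleptic year numbering for both rules.

```javascript
// Julian rule: every fourth year is a leap year.
function isJulianLeapYear(year) {
  return year % 4 === 0;
}

// Gregorian rule: as Julian, except that century years are leap years
// only when they are divisible by 400.
function isGregorianLeapYear(year) {
  return year % 4 === 0 && (year % 100 !== 0 || year % 400 === 0);
}
```

The two rules only disagree on century years such as 1700, 1800 and 1900, which is why the gap between the calendars grows by one day in each of those years.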
@justingrant @Manishearth and I discussed the issues in this thread. We reached alignment on most issues, with the following changes to what was stated above:
Good meeting. Thanks for taking the time.
- ECMAScript could return different canonical era codes than those in CLDR, so long as they are aliases
I'll open a separate issue to add this as a special case for Gregorian.
Bikeshedding for the Proleptic Gregorian BCE era name:
- pre-gregory (current proposal)
- backward-gregory
- back-gregory
- before-gregory
- gregory-anterior
- ante-gregory
- gregory-bce (note: this would result in roc-bce and coptic-bce; is that okay?)
I kind-of like ante-gregory. It's a more academic-sounding term (originating from Latin) that may be less likely to be misinterpreted to mean Julian than pre-gregory.
I'll express a strong preference against names like "backward-gregory" that focus on its direction: I think it's useful that the direction correlates with the name, but I do not think that is the most important thing about the era, and it will be confusing.
I support pre-gregory or ante-gregory, preference for pre-. Also fine with prev-gregory or gregory-prev.
gregory-bce/roc-bce actually kind of makes sense since each calendar has a "common era", the problem is that "common era" is both a specific era ("Common Era", proper noun) and a generic one ("common era", noun-adjective phrase)
In a meeting between Shane, Justin, and me, we discussed this a bit:
For a julian-gregory calendar, I'd strongly suggest it be fully spec-ed out before this proposal is finalized, because it seems to behave differently from all other CLDR calendars and runs the risk of breaking invariants we might rely on elsewhere.
Basically, we do have a menu of potential designs for julian-gregory, that vary along a couple axes. So far none of the era designs are particularly incompatible with this.
We can essentially either have a single julian-gregory calendar that takes options on construction (explicit switchover date, perhaps guessed switchover date from locale, perhaps with some default), or a list of julian-gregory-norway etc calendars (alternatively named things like julian-gregory-02091752).
A tricky thing is that "switchover" is not necessarily a singular concept of needing just a pivot date here: Britain did something a bit fancy here, since they considered years to start in March, where they had a multi-phase switchover moving the year numbering over first. However, it is typical to cite British Julian dates as backdated to January, and historical documents from this era tend to deal with "old style" and "new style" dates.
The julian-gregory scheme loses a useful property: previously, calendars only needed the code for identity. It is possible for a data-driven Islamic calendar to be different from one with the same code but loaded from different data, but that's not possible in Temporal or ICU (only in ICU4X's model, and we don't support the islamic calendars yet. We plan to consider this type of mismatch situation API misuse and have garbage-in-garbage-out behavior).
julian-gregory-foobar is more verbose but matches the way we do Islamic calendars.
Neither changes this era scheme too much, though in the julian-gregory-foobar scheme we may wish to add the julian-gregory and pre-julian-gregory era to all calendars, if we decide to have one (see below).
One design for a switchover calendar is to have it return dates in era julian for pre-switchover and era gregory for post-switchover (and pre-julian for bce dates). This works fine and clearly communicates information about the switchover.
Another design is to have a single combined julian-gregory era (also aliased to ce/ad), which is just one such that there are a bunch of missing days. So dates will always be returned in julian-gregory or pre-julian-gregory (or ce/bce, or whatever, depending on decisions made here). We can still accept dates in julian or gregory, even if out-of-range (for out-of-range-of-era cases we've already decided that we should take reasonable fallback where possible, i.e. -10 BC in the Gregorian calendar will do the sensible thing, depending on error settings).
Since all output eras must be inputable, this brings up the question of what happens on input of julian-gregory. Listed as a separate axis since we may still decide to allow such input even if we produce a split era.
If we decide to have a julian-gregory (or ce) "combined" era, we need to determine if we allow it on input, and what its behavior will be.
We're able to do this because the switchover is from Julian to Gregorian, not vice versa, which means there are no overlapping dates, only a gap (except for British "old style" vs "new style" year reckoning, which we shouldn't handle in the base calendar anyway). So we can just look at dates, check if they're before or after the switchover, and set the internal era appropriately. If they're during the switchover, we can either return an error or perhaps clamp, though I'd like the default to be erroring here, since there's not a singular obvious choice for how the clamping should work. But we may want to clamp in overflow: constrain mode.
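A sketch of that input handling for the combined era, under stated assumptions: the switchover is hard-coded to the 1582 dates from the Java documentation quoted earlier, only dates in year 1 or later are considered, and clamping moves a date in the gap forward to the first Gregorian day. None of this is decided in the thread; the names are hypothetical.

```javascript
// Assumed switchover: 1582-10-04 (Julian) is followed by 1582-10-15 (Gregorian).
const LAST_JULIAN = [1582, 10, 4];
const FIRST_GREGORIAN = [1582, 10, 15];

// Compare two [year, month, day] triples; negative when a is earlier than b.
function compareDates(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Hypothetical resolver for a date given in the combined era.
// overflow follows Temporal's "constrain" / "reject" vocabulary.
function resolveCombinedEra({ eraYear, month, day }, overflow = "reject") {
  const ymd = [eraYear, month, day];
  if (compareDates(ymd, LAST_JULIAN) <= 0) {
    return { era: "julian", eraYear, month, day };
  }
  if (compareDates(ymd, FIRST_GREGORIAN) >= 0) {
    return { era: "gregory", eraYear, month, day };
  }
  // The date falls in the gap that the switchover skipped.
  if (overflow === "constrain") {
    // Assumption: clamp forward to the first Gregorian day.
    const [y, m, d] = FIRST_GREGORIAN;
    return { era: "gregory", eraYear: y, month: m, day: d };
  }
  throw new RangeError("date falls inside the Julian/Gregorian switchover gap");
}
```

Because there is only a gap and no overlap, the two comparisons are enough to pick the internal era; the open design question is limited to the branch that handles the gap.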
We probably will have to have the magical combined era as one of the listed eras so that julian-gregory is an era that makes sense in this scheme. I don't think that's a huge deal. There's sensible behavior for it.
Though @sffc if we want to return julian and gregory as canonical returned eras, we may lose the property that the calendar/pre-calendar eras are always the canonical ones.
Though @sffc if we want to return julian and gregory as canonical returned eras, we may lose the property that the calendar/pre-calendar eras are always the canonical ones.
There are now 2 places where it is harmful to name the anchor era the same as the calendar: japanese and julian-gregory. We could add an anchor="true" attribute instead.
In this case what do you mean by "anchor"?
I think we have a couple concepts floating around:
year?
I see. The property that the canonical era is calendar/pre-calendar is only that calendar should be any calendar, not necessarily the current calendar. For example, ethioaa is a canonical era in the ethiopic calendar.
The "anchor era" is an additional property needed by Temporal for how to resolve inputs such as { calendar: "foo", year: 1234 }. It's the era to use when the implicit "arithmetic era" is needed.
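As a sketch of what the anchor era is used for, here is one way an era-less year could be resolved to an era and eraYear. The data table, the function name, and the pre- era codes are assumptions taken from the naming proposal in this thread, not settled behavior.

```javascript
// Hypothetical per-calendar data: the anchor era and, where one exists,
// the backward-counting era that precedes it.
const ANCHOR_ERAS = {
  gregory: { anchor: "gregory", reverse: "pre-gregory" },
  roc: { anchor: "roc", reverse: "pre-roc" },
  buddhist: { anchor: "buddhist", reverse: null },
};

// Resolve { calendar, year } to { era, eraYear } using the anchor era.
function resolveArithmeticYear(calendar, year) {
  const eras = ANCHOR_ERAS[calendar];
  if (!eras) throw new RangeError(`unknown calendar: ${calendar}`);
  if (year >= 1 || eras.reverse === null) {
    // Years map directly onto the anchor era; without a reverse era the
    // eraYear is allowed to be zero or negative.
    return { era: eras.anchor, eraYear: year };
  }
  // Year 0 is year 1 of the reverse era, year -1 is year 2, and so on.
  return { era: eras.reverse, eraYear: 1 - year };
}
```

The anchor era is the one whose eraYear equals the arithmetic year, which is why it needs to be identified separately whenever it cannot be inferred from the calendar name.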
Gotcha, what I thought.
@michaelficarra suggested adding Kanji aliases for Japanese era names in TC39-TG1. I opened an upstream CLDR issue to discuss further:
@sffc There's also the precomposed characters for era names, like ㋿. I don't know which is a more appropriate alias, or if they are both equally valid.
Given the complexity of the moveable switchover date, I wonder if julian-gregory would be better handled as a custom calendar rather than a built-in one? For example, I could imagine an npm package that includes all the logic, and users of that package would simply call a constructor with the switchover date as a parameter, and they'd get a custom calendar class that they could use. A package for custom Islamic calendars could work similarly.
This isn't ideal because string parsing like Temporal.PlainDate.from('2022-01-30[u-ca=julian-gregory-02091752]') wouldn't work. But that's a general problem for all custom calendars, which I expect will be solved by a "Custom Calendar Helper" polyfill that will patch Temporal to allow Temporal.*.from and other methods so they'll work with custom calendars.
BTW, the julian-gregory calendar is complicated enough, maybe it should be moved into its own issue for discussion?
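For a sense of what such a helper would have to do before dispatching to a custom calendar, here is a small sketch that pulls the calendar ID out of a string. It does not touch Temporal; the regular expression, the registry, and the function names are hypothetical, and the fallback to iso8601 for strings without an annotation is an assumption.

```javascript
// Matches a calendar annotation such as "[u-ca=gregory]" (optionally "[!u-ca=...]").
const CALENDAR_ANNOTATION = /\[!?u-ca=([^\]]+)\]/;

// Return the annotated calendar ID, or "iso8601" when there is none.
function extractCalendarId(isoString) {
  const match = CALENDAR_ANNOTATION.exec(isoString);
  return match ? match[1] : "iso8601";
}

// Hypothetical userland registry mapping calendar IDs to custom calendar objects.
const customCalendars = new Map();

// Return the registered custom calendar for the string, or the bare ID
// when nothing is registered under that ID.
function lookupCalendar(isoString) {
  const id = extractCalendarId(isoString);
  return customCalendars.has(id) ? customCalendars.get(id) : id;
}
```

A userland package could use something along these lines to route strings to its own calendar objects without changing any globals.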
The property that the canonical era is calendar/pre-calendar is only that calendar should be any calendar, not necessarily the current calendar. For example, ethioaa is a canonical era in the ethiopic calendar.
Good to re-use era names where they are shared between calendars (like the Ethiopian case) but I'm not sure I understand the benefit of "any calendar". Why would you want to use an ethioaa era in an islamic calendar?
There are now 2 places where it is harmful to name the anchor era the same as the calendar: japanese and julian-gregory.
By "anchor era" did you mean "canonical name of the anchor era"?
We could add an anchor="true" attribute instead.
Attribute in the CLDR data? Or somewhere else?
I would hope that Temporal is never patched by anything except a 262 and/or 402 compliant polyfill - custom calendars shouldn’t be injected into globals.
I would hope that Temporal is never patched by anything except a 262 and/or 402 compliant polyfill - custom calendars shouldn’t be injected into globals.
Given the limitations of the current API, if a custom calendar or timezone author wants to allow string parsing to work, e.g. Temporal.PlainDate.from('2020-01-01[u-ca=mycalendar]') then there's no other way to support that without patching Temporal. This is a fallout from the change made a while ago to stop calling observable from when parsing string input.
I'm not saying that patching is a great idea, only that if you're a custom cal/tz author then I'm not sure how you can make your code act like a built-in timezone or calendar without patching.
BTW, the julian-gregory calendar is complicated enough, maybe it should be moved into its own issue for discussion?
I mean, the only reason we're talking about it here is that we wanted to make sure we're not closing the door to future designs (which we've established), we don't need to design it yet.
My proposal is scattered between different threads, so I thought I'd try to summarize it here.
There are 2 types of calendars with eras: those with a variable number of eras, and those with a fixed number. Japanese is the one with a variable number of eras, and Manish has already written a nice spec for that. Therefore, in the rest of this post, I am focused on calendars with a fixed number of eras.
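Eras should define two things:
- An epoch day
- A formula for converting from the (eraYear, monthCode, day) triple to a number of days relative to that epoch
eraYear is permitted to be negative.
Eras should be named as follows:
- For eras counting forwards: the BCP-47 ID of the canonical calendar for that era
- For eras counting backwards: pre- followed by the BCP-47 ID
With these rules, we fully define the era codes for all CLDR calendars.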
buddhist
pre-coptic, coptic
ethioaa
ethioaa, ethiopic *2
pre-gregory, gregory
pre-julian, julian, gregory *3
hebrew
indian
islamic
islamic-civil
islamic-rgsa
islamic-tbla
islamic-umalqura
iso8601 *4
persian
roc
*1 In these cyclic calendars, there is no clear epoch. We need to either choose an epoch or define the era/eraYear in terms of ISO-8601 / Gregorian.
*2 Dates in the incarnation era are expressed with ethiopic; dates prior to that are expressed in ethioaa. Dates prior to the creation of the earth are expressed as negative numbers in ethioaa.
*3 Note that input dates can be labeled as either gregory or julian and they will be interpreted correctly, even if the change date is unknown by the caller.
*4 This is the same era as gregory; unclear if it needs to be re-defined.
*5 See the doc linked at the top of this post for how to handle Japanese.