tc39 / proposal-temporal

Provides standard objects and functions for working with dates and times.
https://tc39.es/proposal-temporal/docs/

Point to ISO standard for Date string syntax #198

Closed Ms2ger closed 4 years ago

Ms2ger commented 5 years ago

While it's more doable to define this format right here for Date as opposed to Duration, should we point this to ISO 8601/RFC 3339 for the sake of consistency?

Originally posted by @ryzokuken in https://github.com/tc39/proposal-temporal/pull/194/files

pipobscure commented 5 years ago

We support ISO-8601 with the following stipulations:

littledan commented 5 years ago

BTW I'd suggest not pointing to outdated, historical JavaScript standards, and instead pointing to the current one: Expanded years.

Why do we support these in Temporal?

pipobscure commented 5 years ago

Because we really like dates to be possible beyond the year 9999? Because the other option is to come up with a different standard; and we need something; and by the year 999999 (at the rate we are going) there won't be any humans to program computers, so it's a perfectly valid limit. 😄

littledan commented 5 years ago

OK, so Temporal will support even more than Date does in that linked section: The full six digit limit, not just the things in Date.parse range. Is that right? I guess we'll have to write a new definition or reword the existing one, if we want to go for this flexibility (as opposed to just sticking with the four digit limit).

kaizhu256 commented 5 years ago

can you give javascript-scenarios where we need [utc-accurate] years beyond +/-9999? because I honestly cannot think of one.

unless there's notable non-gregorian calendars requiring 4+ digit years in Intl, I prefer limiting the scope to 4-digits, to avoid programming-bugs in the common-scenario of sorting ISOStrings in JavaScript (and wasm-sqlite3).
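A minimal sketch of the sorting concern, using only the existing `Date.prototype.toISOString` (which emits four-digit years for 0 through 9999 and a sign plus six digits otherwise). The three sample dates are arbitrary illustrations, not taken from the proposal:

```javascript
// Three instants as epoch milliseconds (sample values chosen for illustration).
const epochMs = [
  Date.UTC(2020, 0, 1),  // an ordinary four-digit year
  Date.UTC(10000, 0, 1), // first day that needs an expanded year
  Date.UTC(-1, 0, 1),    // a negative year, also expanded
];

// Sort numerically first, then serialize: this is the true chronological order.
const chronological = [...epochMs]
  .sort((a, b) => a - b)
  .map((ms) => new Date(ms).toISOString());

// Default string sort compares code units: "+" < "-" < digits.
const lexicographic = [...chronological].sort();

console.log(chronological);
console.log(lexicographic);
```

Because "+" and "-" both sort before the digits, the expanded-year strings end up ahead of every four-digit-year string, whatever their actual position in time.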

pipobscure commented 5 years ago

It doesn’t just go for dates > 9999-12-31 but also for dates smaller than 1000-01-01, because in that case the pattern changes as well.

+000999-12-31 would be the day before. So the simple case would be if you are dealing with history.

pipobscure commented 5 years ago

@littledan given that the Date range is +-1/4 million years (roughly) I’d be fine sticking to that range. That way we maintain compatibility and given that both people of the past and future (beyond that range) will likely not have a calendar based on the supposed birth of a deity the usefulness is in doubt. Especially since going outside that range generally involves other changes as well.

ljharb commented 5 years ago

The Morlocks live in the year 802,701 :-p

littledan commented 5 years ago

I'd imagine that we would continue to support bigger values in the constructor and from methods, and that we're talking narrowly about the grammar here, right? With everything on this thread, I am increasingly convinced we should not support extended years. OTOH the rest of the stipulations SGTM, e.g., nanosecond precision and timezone syntax seem pretty mandatory.

kaizhu256 commented 5 years ago

The Morlocks live in the year 802,701 :-p

and the Earth is (approx) 4 billion years old, but who actually cares about [Temporal] utc-accuracy of these dates or anything past +/-9999? most scenarios for such timescales only care about precision to +/-1 year. u don't need overengineered Temporals for that -- basic math-arithmetic is usually good-enough and more cost-effective.

as for years with 3 or less digits, the correct way is to leftpad zeros when in ISOstring-form to coerce it to 4-digits for common-case string sorting/comparing.

pipobscure commented 5 years ago

@littledan only if we accept that we can produce datetimes/absolutes that cannot be serialized or deserialized. (Which in my mind is a no-go)

ljharb commented 5 years ago

@kaizhu256 i think there’s a number of geological websites and astronomical websites that would love to be able to represent eons instead of just a handful of millennia.

Why would we want to artificially restrict ourselves?

littledan commented 5 years ago

@pipobscure Interesting; why is this a requirement? Don't we have the same issue with a six-digit limit?

littledan commented 5 years ago

A separate aspect is the variation in punctuation we allow. I believe @gibson042 investigated this in some depth for Date.parse, and now #229 permits a bit more variation. What do we want to permit in Temporal exactly? (I'd suggest we think this through before landing #229.)

pipobscure commented 5 years ago

@littledan the space instead of T thing is an addition of RFC3339 which bases itself on ISO8601. So by just allowing that, we have much better compatibility. I figured that would be worth it, since for the most part RFC3339 just specifies an ISO8601 profile like ECMA does for JS. Both are frequent use-cases.

littledan commented 5 years ago

Cool, I'm not opposed to that particular change, but I just wanted to raise this because @gibson042 's presentation included several other syntactic variants and I don't know whether we want to include those.

gibson042 commented 5 years ago

If Temporal admits even one alternate spelling of a value with identical precision, then it should admit all standardized alternate spellings from ISO 8601. This includes arbitrary-case alphabetic designators and . or , as decimal sign, such that e.g. 1955-11-13T06:04:00.9Z is equivalent to 1955-11-13t06:04:00,9z (and also 1955-W45-7t06:04:00,9Z, if Temporal includes deserialization of week dates).
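As a rough illustration of what accepting those alternate spellings could involve, here is a sketch that rewrites a string into the canonical spelling before parsing. `normalizeIsoSpelling` is a hypothetical helper, not part of Temporal, and it assumes the input carries no bracketed IANA time zone name (which would otherwise be mangled by the case change):

```javascript
// Hypothetical helper: canonicalize designator case and the decimal sign.
// Assumes a plain date-time string with no "[Area/Location]" suffix.
function normalizeIsoSpelling(s) {
  return s
    .replace(/[twz]/g, (c) => c.toUpperCase()) // t, w, z designators to upper case
    .replace(/(\d),(\d)/g, "$1.$2");           // comma decimal sign to full stop
}

console.log(normalizeIsoSpelling("1955-11-13t06:04:00,9z"));
```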

littledan commented 5 years ago

@gibson042 Do you have any thoughts about the plan above to allow six digit years?

gibson042 commented 5 years ago

My priorities are something like this:

  1. Use the same range for all overlapping Temporal types (i.e., Absolute, Date, DateTime, and YearMonth).
  2. Don't exceed the bounds of the existing ECMAScript date-time string interchange format (which specifies six digits for expanded years).
  3. Align with the existing ECMAScript Date range of POSIX epoch ± 1e8 days, +275760-09-13T00:00Z to -271821-04-20T00:00Z (https://github.com/tc39/proposal-temporal/issues/24#issuecomment-530868024 ).

I hold the third priority only weakly and would be willing to let it go, but not without an explicit decision regarding the resulting edge cases such as Temporal.Absolute.from("+999999-12-31T18:00Z").inTimeZone("+10:00") (a date and time of day in year 1000000) and Temporal.DateTime.from("+999999-12-31T18:00").inTimeZone("-08:00") (an instant in year 1000000)—my preference for both is throwing a RangeError.
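For reference, the bounds in the third priority can be reproduced with the existing `Date` object. This sketch only shows current `Date` behaviour and says nothing about what Temporal will end up doing:

```javascript
const MS_PER_DAY = 86400000;
const MAX_MS = 1e8 * MS_PER_DAY; // 8.64e15 ms, i.e. epoch + 1e8 days

const upper = new Date(MAX_MS).toISOString();  // "+275760-09-13T00:00:00.000Z"
const lower = new Date(-MAX_MS).toISOString(); // "-271821-04-20T00:00:00.000Z"

// One millisecond past the limit is clipped to an invalid Date.
const beyond = new Date(MAX_MS + 1);
console.log(upper, lower, Number.isNaN(beyond.getTime())); // last value is true
```

Calling `toISOString()` on the out-of-range value throws a RangeError, which matches the preference expressed above for the year-1000000 edge cases.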

ptomato commented 4 years ago

I've been looking into this a bit, pursuant to #312.

We support ISO-8601 with the following stipulations:

  • Only the Calendar-Date format is supported for dates/date-times

What's the motivation not to support weekdates and ordinal dates (2020-W04-1 and 2020-020 respectively)? Weekdates could be argued to add complexity since there is a bit of calculation required around when week 1 starts, but ordinal dates would seem fairly trivial to support.

Agreed; was there any consensus on supporting P7W for "seven weeks" as it does seem to be part of the "simple" format?

  • In durations only seconds may have fractional parts.
  • The timezone designator may be extended by [<IANA>] zones to properly designate timezones
  • Dates/Times/TimeZones may occur individually or in combination

I think this rule introduces some complications, which we are talking about in #313 — allowing time zones by themselves would make Z and -08:00[America/Vancouver] legal ISO 8601 strings, which seems surprising for a number of people. The ISO 8601 grammar in RFC 3339 treats a time zone as an optional addition to a time, and so does the standard itself as far as I can tell from its description in Wikipedia. I think we should stick to the standard here and speak of date representations, time representations (which may contain a time zone), and combined date/time representations, not allowing time zones by themselves. We can allow lone time zones separately in TimeZone.from() but it seems to me that we should not call them legal ISO strings.

  • We agree to nanosecond precision meaning seconds may have 0, 3, 6 or 9 decimal places

I think this should read "0 through 9" so that 0.5 means 500 ms, it would be surprising if you were required to input 0.500. Would we make any further decimal places beyond 9 illegal or simply truncate them?

ptomato commented 4 years ago

Meeting Jan. 27: We will not support weekdates, ordinal dates, or week durations at this time. We will also not support time zone parts in an ISO string without an accompanying time or datetime. For serializing, we'll output only 0, 3, 6, or 9 decimal places, but we'll accept 0 through 9 when parsing. For more than 9 we'll throw.
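The fractional-seconds part of that decision can be sketched as follows. Both functions are hypothetical helpers written for illustration, not Temporal's implementation; `formatFraction` assumes an integer between 0 and 999999999:

```javascript
// Hypothetical helper: parse the digits after the decimal sign into nanoseconds.
// Accepts 0 through 9 digits; anything longer (or non-numeric) throws.
function parseFraction(digits) {
  if (!/^\d{0,9}$/.test(digits)) {
    throw new RangeError(`invalid fractional seconds: ${digits}`);
  }
  return Number(digits.padEnd(9, "0"));
}

// Hypothetical helper: serialize nanoseconds using only 0, 3, 6, or 9 places.
function formatFraction(nanoseconds) {
  if (nanoseconds === 0) return "";
  const nine = String(nanoseconds).padStart(9, "0");
  if (nine.endsWith("000000")) return "." + nine.slice(0, 3); // whole milliseconds
  if (nine.endsWith("000")) return "." + nine.slice(0, 6);    // whole microseconds
  return "." + nine;
}

console.log(formatFraction(parseFraction("5"))); // ".500"
```

So an input of 0.5 is accepted and means 500 ms, but it round-trips as .500 when serialized.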

That makes the updated list of stipulations:

kaizhu256 commented 4 years ago

Seconds may have 0, 3, 6 or 9 decimal places in serialized strings

that makes sorting isostrings problematic as pointed out in issue #329. can the user specify a truncation/padding length when serializing?

i honestly see little value from microsecond/nanosecond precision (for all the trouble it creates). problems requiring that level of precision are generally out-of-scope of this proposal, and don't care about timespans >24h or calendar dates.

ptomato commented 4 years ago

that makes sorting isostrings problematic as pointed out in issue #329. can the user specify a truncation/padding length when serializing?

I guess that's a question for #329...

ptomato commented 4 years ago

The deeper I look into this, the more stipulations I find that we have to add... here is my current list.

By the way I'm also aware of the following differences between RFC 3339 and ISO 8601:

gibson042 commented 4 years ago

  • Mixtures of basic (no punctuation) and extended (with punctuation) expressions are not permitted in ISO 8601, but they are permitted by the grammar in RFC 3339.

Are you sure that ISO 8601 prohibits mixtures? The authors of RFC 3339 weren't ("ISO 8601 is not clear if mixtures of basic and extended format are permissible. This [attempt to create a formal grammar from ISO 8601] permits mixtures.").

Following RFC 3339 would be the most flexible way, but then we'd have to accept a lot of things that look confusing and not much like dates... e.g. 3446-0508T03:2815-0630 meaning 3:28:15 AM, May 8, 3446, in a time zone that's 6:30 before UTC

I don't think it's quite that bad. Per ISO 8601, the basic format for complete calendar date has no punctuation while the extended format has two mandatory dashes, the basic format for complete time of day has no separating punctuation while the extended format has two mandatory colons, and the basic format for complete UTC offset has no separating punctuation while the extended format has one mandatory colon. So the worst case for Temporal parsing is more like 34460508T03:28:15-0630 (which is admittedly still pretty bad).

The fractional part of a second is required to be preceded by "00" in ISO 8601, but not in the grammar in RFC 3339.

I don't know what you mean if not ISO 8601 "a decimal fraction of hour, minute or second may be included", which is already covered by your "Only seconds are allowed to have a fractional part" bullet point.

A time zone offset of "-00:00" is allowed in RFC 3339, but not in ISO 8601.

  • I recommend we stick to RFC 3339 here since that would be the most flexible.

Agreed.

ptomato commented 4 years ago

Are you sure that ISO 8601 prohibits mixtures? The authors of RFC 3339 weren't

They later released an erratum clarifying that ISO 8601 does prohibit mixtures.

I don't think it's quite that bad. Per ISO 8601, the basic format for complete calendar date has no punctuation while the extended format has two mandatory dashes, the basic format for complete time of day has no separating punctuation while the extended format has two mandatory colons, and the basic format for complete UTC offset has no separating punctuation while the extended format has one mandatory colon. So the worst case for Temporal parsing is more like 34460508T03:28:15-0630 (which is admittedly still pretty bad).

What I meant was, the RFC 3339 grammar does permit each punctuation mark to be present or absent individually, so a mess like 3446-0508T03:2815-0630 could indeed be generated from that grammar.

I don't know what you mean if not ISO 8601 "a decimal fraction of hour, minute or second may be included", which is already covered by your "Only seconds are allowed to have a fractional part" bullet point.

Sorry, I'll try to put it in a different way; RFC 3339 would allow 17:45.22 for "quarter to 6 plus 220 milliseconds", whereas ISO 8601 would require 17:45:00.22. The former would be ambiguous except that RFC 3339 also doesn't allow fractional parts elsewhere than seconds. Certainly we should never emit 17:45.22 but should we accept it? My feeling is no, because it looks like it could be a typo for 17:45:22.

gibson042 commented 4 years ago

Are you sure that ISO 8601 prohibits mixtures? The authors of RFC 3339 weren't

They later released an erratum clarifying that ISO 8601 does prohibit mixtures.

👍

What I meant was, the RFC 3339 grammar does permit each punctuation mark to be present or absent individually, so a mess like 3446-0508T03:2815-0630 could indeed be generated from that grammar.

Right, but only in the Appendix A attempted formal grammar for ISO 8601. I think that should be considered too loose for Temporal, which should at least require that every date, time, and UTC offset are either completely basic or completely extended. It could also go further and reject strings that mix basic and extended format across date, time, and UTC offset, but I'm not sure if it should (I suspect that extended format date and time of day in combination with basic format offset, as in "2020-02-14T22:09-0500", are not that uncommon).
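A sketch of the "each component is completely basic or completely extended" rule, for illustration only. `componentsAreConsistent` and the three patterns are hypothetical, and they deliberately cover a reduced grammar: four-digit years, no fractional seconds, no bracketed time zone names:

```javascript
// Each component must match either its extended or its basic form as a whole.
const DATE = /^(?:\d{4}-\d{2}-\d{2}|\d{8})$/;
const TIME = /^(?:\d{2}:\d{2}(?::\d{2})?|\d{4}(?:\d{2})?)$/;
const OFFSET = /^(?:Z|[+-]\d{2}:\d{2}|[+-]\d{4})$/;

// Hypothetical helper: split on "T" and the offset sign, then check each part.
// Mixing formats across components (extended date, basic offset) is allowed here.
function componentsAreConsistent(s) {
  const m = /^([^T]+)T([^Z+-]+)(Z|[+-].+)?$/.exec(s);
  if (!m) return false;
  const [, date, time, offset] = m;
  return DATE.test(date) && TIME.test(time) &&
    (offset === undefined || OFFSET.test(offset));
}

console.log(componentsAreConsistent("3446-0508T03:2815-0630")); // false
console.log(componentsAreConsistent("34460508T03:28:15-0630")); // true
console.log(componentsAreConsistent("2020-02-14T22:09-0500"));  // true
```

Rejecting mixtures across components as well would only need one extra check that all three parts matched the same alternative.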

Sorry, I'll try to put it in a different way; RFC 3339 would allow 17:45.22 for "quarter to 6 plus 220 milliseconds", whereas ISO 8601 would require 17:45:00.22. The former would be ambiguous except that RFC 3339 also doesn't allow fractional parts elsewhere than seconds. Certainly we should never emit 17:45.22 but should we accept it? My feeling is no, because it looks like it could be a typo for 17:45:22.

I'm still having a hard time following you. The RFC 3339 Internet Date/Time Format requires seconds and permits fractions only after seconds, so "…T17:45.22" is not valid. The RFC 3339 ISO 8601 formal grammar permits fractions after hours, minutes, or seconds, but doesn't specify their semantics (so "…T17:45.22" is valid but presumably interpreted per ISO 8601). And ISO 8601 permits decimal fractions after hours, minutes, or seconds, so "…T17:45.22" is valid and interpreted as "quarter to 6 plus 0.22 minutes", equivalent to "17:45:13.2". There's no ambiguity that I can see, only a decision about whether or not to accept the fractional hours or minutes that are permitted by ISO 8601 but not by RFC 3339. Personally, I would place them in the same "advanced usage" bucket as ordinal dates and week dates, and either accept them all with ISO 8601 semantics, or reject them all as too deviant from the time elements appearing in RFC 3339.
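The ISO 8601 reading of a fractional minute can be worked through in a few lines. `fractionalMinutesToTime` is a hypothetical helper for this one shape of input (hh:mm plus a decimal fraction); it truncates to whole milliseconds and is not something Temporal provides:

```javascript
// Hypothetical helper: interpret "hh:mm.fff" as hh:mm plus a fraction of a minute.
function fractionalMinutesToTime(s) {
  const m = /^(\d{2}):(\d{2})[.,](\d+)$/.exec(s);
  if (!m) throw new RangeError(`not an hh:mm.fraction string: ${s}`);
  const [, hh, mm, frac] = m;
  // Fraction of a minute in milliseconds, truncated: 0.22 min -> 13200 ms.
  const ms = Math.floor((Number(frac) * 60000) / 10 ** frac.length);
  const seconds = String(Math.floor(ms / 1000)).padStart(2, "0");
  const millis = ms % 1000;
  // Drop trailing zeros so 200 ms prints as ".2".
  const fraction = millis === 0
    ? ""
    : "." + String(millis).padStart(3, "0").replace(/0+$/, "");
  return `${hh}:${mm}:${seconds}${fraction}`;
}

console.log(fractionalMinutesToTime("17:45.22")); // "17:45:13.2"
```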

ptomato commented 4 years ago

I think that should be considered too loose for Temporal, which should at least require that every date, time, and UTC offset are either completely basic or completely extended.

👍

Personally, I would place them in the same "advanced usage" bucket as ordinal dates and week dates, and either accept them all with ISO 8601 semantics, or reject them all as too deviant from the time elements appearing in RFC 3339.

OK, I get you now, I was reading RFC 3339 incorrectly. I was assuming that because RFC 3339 only permits seconds fractions, that implied the semantics of T17:45.22 would be "quarter to 6 plus 0 seconds and 220 ms", but on second reading I think you're right that no particular semantics are implied by the grammar. I think we should stick to our earlier determination that only seconds fractions are allowed and everything else is too advanced.