tc39 / proposal-temporal

Provides standard objects and functions for working with dates and times.
https://tc39.es/proposal-temporal/docs/

Extended ISO string representation if time zone is an offset (not an IANA name) #703

Closed justingrant closed 4 years ago

justingrant commented 4 years ago

From #700 open issues:

What extended ISO string representation should be used when the time zone is an offset, not an IANA name? Should the offset be in brackets?

Example:

Temporal.LocalDateTime.from({ absolute: '2020-03-08T09:30Z', timeZone: '+07:00' }).toString();
// => "2020-03-08T16:30+07:00[+07:00]"
// but should it be this? => "2020-03-08T16:30+07:00"

The argument to repeat the offset in brackets is to prevent LocalDateTime from parsing ISO strings that lack a time zone identifier, in order to prevent implicitly treating a timezone-less value as if it had a time zone. The whole idea behind LocalDateTime is that time zones should never be implicit (with now.localDateTime as the only exception) because implicitly assuming a time zone is the path to timezone issues and DST bugs like we get with legacy Date.

But what's the argument on the other side to avoid emitting the duplicated offset in brackets? And if we did that, then should LocalDateTime still forbid parsing of bracket-less ISO strings?

ptomato commented 4 years ago

What we currently do for the time zone part of an ISO string is treat -04:00 as an offset time zone (no DST changes) and -04:00[America/New_York] as an IANA time zone (which may or may not have rules for DST changes). Could LocalDateTime refuse to parse, say, 2020-06-25T15:20 but accept both 2020-06-25T15:20-04:00 and 2020-06-25T15:20-04:00[America/New_York] as meaning two different things?

justingrant commented 4 years ago

After doing more research on this topic today, I now believe that LocalDateTime.from should not accept bracket-less ISO strings. Here's why: at least two major platforms (.NET and SQL Server) have awareness of offsets but not time zones. (I think @sffc mentioned this in a previous issue.) Anyway, on these platforms, offset-aware strings will likely be persisted like this: 2020-06-25T15:20-04:00.

So a reasonable user who's not familiar with DST might assume that parsing one of those persisted-by-another-platform strings is good enough to use in Temporal. But that leads to buggy code which will have a local time that's off by one hour, like this:

LocalDateTime.from('2020-06-25T15:20-04:00').plus({months: 6});

So I think it makes sense to put a roadblock in front of those users so that they at least get an exception to help them figure out that what they really need is this:

LocalDateTime.from({absolute: '2020-06-25T15:20-04:00', timeZone: 'America/New_York'}).plus({months: 6});
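
The one-hour discrepancy can be reproduced without Temporal. The sketch below is only an illustration: `wallClockHour` is a hypothetical helper built on `Intl.DateTimeFormat`, and the two December instants are written out by hand to stand in for "six months later with the offset frozen at -04:00" and "six months later under New York rules".

```javascript
// Hypothetical helper (not a Temporal API): the hour that a wall clock in
// `timeZone` shows at the exact time `instant`.
function wallClockHour(instant, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hourCycle: 'h23',
    hour: '2-digit',
  }).formatToParts(instant);
  return Number(parts.find((part) => part.type === 'hour').value);
}

const NEW_YORK = 'America/New_York';

// The string as persisted by an offset-only platform.
const start = new Date('2020-06-25T15:20-04:00');

// Six months later, treating -04:00 as if it were the time zone.
const frozenOffset = new Date('2020-12-25T15:20-04:00');

// Six months later under New York rules (December is standard time, -05:00).
const zoneAware = new Date('2020-12-25T15:20-05:00');

console.log(wallClockHour(start, NEW_YORK));        // 15
console.log(wallClockHour(frozenOffset, NEW_YORK)); // 14, one hour off
console.log(wallClockHour(zoneAware, NEW_YORK));    // 15
```

The frozen-offset result is a valid instant, but a clock in New York shows 14:20 at that moment rather than the 15:20 the user most likely expected.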

There will be cases where users really mean that the time zone should be -04:00, but based on what I've learned about persist-offset-but-not-time-zone platforms, I think that the "I really want -04:00" case is rare but the "I got this from my database or web service so it's good, right?" case will be common. So I think we should mandate the redundant offset in brackets to force the user to opt in if that's what they really intend.

What do you think?

sffc commented 4 years ago

@justingrant's points make sense to me.

If you have a string without an IANA time zone, you can still get an Absolute for it.

ptomato commented 4 years ago

I think it would be very weird if LocalDateTime could only accept an ISO string if it included an unofficial extension.

sffc commented 4 years ago

I think it would be very weird if LocalDateTime could only accept an ISO string if it included an unofficial extension.

I assume it would accept the Z time zone, which is in the spec, but just not an offset timezone without the corresponding IANA name. @justingrant is that correct?

justingrant commented 4 years ago

This was very useful feedback that prompted me to look at Temporal parsing more broadly with an eye towards preventing the (unfortunately common) error of parsing a string into the "wrong" Temporal type. For example, I suspect that the code below is very likely (almost always?) a bug:

Temporal.DateTime.from("2020-06-30T18:56:32.260Z")

I think this code should throw. The string is explicitly declaring that it's an absolute string so the chance that the user really intends it to be a DateTime is questionable at best. In the (unusual, I suspect) case that users really intend this to be a UTC DateTime, then they can either append [UTC] to the string or (better, IMHO) parse it into an Absolute and then use .toDateTime('UTC'). But the basic idea is that this pattern is a likely bug, so it's worth adding an opt-in barrier to prevent accidental use.

Error messages could help developers understand how it works. If an ISO string doesn't parse, we could try to parse it using other formats and if any of them match then we could tune the error message accordingly. For example:

Temporal.DateTime.from("2020-06-30T18:56:32.260Z")
// => throws new RangeError(
// `Cannot parse \`Temporal.DateTime\` from '${s}'. For absolute date/time parsing, use \`Temporal.Absolute.from()\`.`
// );

Temporal.Time.from("P15D")
// => throws new RangeError(
// `Cannot parse \`Temporal.Time\` from '${s}'. For duration parsing, use \`Temporal.Duration.from()\`.`
// );
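
A minimal sketch of how such tuned messages might be produced, assuming simplified regular expressions in place of a real ISO 8601 parser. `SHAPES` and `checkShape` are hypothetical names for illustration only and are not part of Temporal.

```javascript
// Simplified, hypothetical string shapes. Real ISO 8601 parsing accepts
// many more variants than these patterns do.
const SHAPES = [
  {
    type: 'Temporal.DateTime',
    pattern: /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2}(\.\d{1,9})?)?$/,
    hint: 'For date/time parsing without an offset, use `Temporal.DateTime.from()`.',
  },
  {
    type: 'Temporal.Absolute',
    pattern: /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2}(\.\d{1,9})?)?(Z|[+-]\d{2}:\d{2})$/,
    hint: 'For absolute date/time parsing, use `Temporal.Absolute.from()`.',
  },
  {
    type: 'Temporal.Duration',
    pattern: /^P(?=.)(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=.)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$/,
    hint: 'For duration parsing, use `Temporal.Duration.from()`.',
  },
];

// Returns `s` if it has the shape expected for `expectedType`. Otherwise
// throws a RangeError whose message points at the type that would match.
function checkShape(expectedType, s) {
  const expected = SHAPES.find((shape) => shape.type === expectedType);
  if (!expected) throw new TypeError(`Unknown type: ${expectedType}`);
  if (expected.pattern.test(s)) return s;
  const other = SHAPES.find((shape) => shape !== expected && shape.pattern.test(s));
  const hint = other ? ` ${other.hint}` : '';
  throw new RangeError(`Cannot parse \`${expectedType}\` from '${s}'.${hint}`);
}

try {
  checkShape('Temporal.DateTime', '2020-06-30T18:56:32.260Z');
} catch (e) {
  console.log(e.message);
}
```

The extra matching runs only on the failure path, so the cost of probing other formats is paid only when an exception is about to be thrown anyway.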

BTW, even without LocalDateTime, in the current Temporal, developers will need to learn which ISO string formats correspond to which Temporal types. LocalDateTime would add to that list, but the list is already there:

It may be worth a page in the docs (e.g. "Parsing Temporal Objects from ISO 8601 Strings") that turns the list above into an easier-to-understand format like a table with examples. Also the page should include an explanation of our extensions for calendars and time zones. I'd volunteer to write a page like this if you think it'd be helpful.

What do you think?

I ask because I went through the same process to determine the current parsing behavior of LocalDateTime:

  1. Figure out likely error cases, especially where the non-buggy case is less common than the buggy one
  2. Throw exceptions in those cases with helpful messages
  3. Allow an alternate opt-in way to do the task without an exception, and document that way.

Anyway, with that context in mind...

@ptomato I think it would be very weird if LocalDateTime could only accept an ISO string if it included an unofficial extension.

A fundamental challenge we have is that ISO 8601 knows about offsets (including Z) but not time zones. If LocalDateTime is fundamentally about timezone-aware data, I don't see how LocalDateTime could accept a standard ISO 8601 string because a standard ISO string can never encode a time zone.

That said, Z strings are definitely something that users will try, so it probably deserves a clarifying error message, per discussion above. Here's what's currently used for object initializers. We could do something similar for string initializers too:

if (item instanceof Temporal.Absolute) {
  throw new TypeError('Time zone is missing. Try `{absolute, timeZone}`.');
}

@sffc I assume it would accept the Z time zone, which is in the spec, but just not an offset timezone without the corresponding IANA name. @justingrant is that correct?

Nope. Per discussion above, to avoid likely error cases my intent was that callers explicitly opted in to provide a time zone, either by using an Extended ISO string, or by explicitly specifying the time zone to force the caller to confirm that they're really intending to use UTC or any other offset-based "time zone". Like this:

Temporal.LocalDateTime.from({
  absolute: Temporal.Absolute.from('2020-06-30T18:56:32.260Z'),
  timeZone: 'UTC'
});
// OR 
Temporal.Absolute.from('2020-06-30T18:56:32.260Z').toLocalDateTime('UTC');

To me this seems similar to the case where Temporal.DateTime.from("18:56:32.260") throws. There are certainly valid use cases where developers might expect it to not throw and instead default the date to 1970-01-01, but this would be a bug most of the time so we don't allow it. LocalDateTime requires a time zone, but "Z" isn't really a "time zone"-- it's more of an offset, so I figured it made sense to treat it the same as other offsets which is that parsing isn't allowed without brackets.

sffc commented 4 years ago

On Z: I see Z == UTC as being the only real time zone supported by ISO 8601, and I think Z counts as a clear enough intent. UTC has no daylight transitions and is common in computing. "Z" can be seen as different than "+00:00", which could correspond to Europe/London in the winter or Atlantic/Azores in the summer.
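
To illustrate why "+00:00" alone does not identify a time zone, here is a small sketch that computes UTC offsets with `Intl.DateTimeFormat`. `offsetMinutes` is a hypothetical helper, not a Temporal API, and it assumes the runtime ships IANA time zone data.

```javascript
// Hypothetical helper: UTC offset, in minutes, that `timeZone` observes at
// the exact time `instant`. It reads the zone's wall-clock fields and
// compares them with the instant itself.
function offsetMinutes(instant, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hourCycle: 'h23',
    year: 'numeric',
    month: 'numeric',
    day: 'numeric',
    hour: 'numeric',
    minute: 'numeric',
    second: 'numeric',
  }).formatToParts(instant);
  const get = (type) => Number(parts.find((part) => part.type === type).value);
  const wallClockAsUtc = Date.UTC(
    get('year'), get('month') - 1, get('day'),
    get('hour'), get('minute'), get('second')
  );
  return Math.round((wallClockAsUtc - instant.getTime()) / 60000);
}

const winter = new Date('2020-01-15T12:00:00Z');
const summer = new Date('2020-07-15T12:00:00Z');

console.log(offsetMinutes(winter, 'Europe/London'));   // 0
console.log(offsetMinutes(summer, 'Europe/London'));   // 60
console.log(offsetMinutes(winter, 'Atlantic/Azores')); // -60
console.log(offsetMinutes(summer, 'Atlantic/Azores')); // 0
```

Both zones report an offset of zero for part of the year, so a string ending in +00:00 could have come from either of them, or from UTC itself.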

On explicit time zones in general: This is consistent with Temporal design principles. <rant> My hope for the last few quarters has been that we apply the same "explicit is better" mentality that we have for time zones to calendars. Time zones and calendars are equals in the data model, so why not in the API? (Main issue: #292) </rant>

ptomato commented 4 years ago

I'm not sure I agree that being explicit here is better either, so I don't feel I'm being inconsistent :smile: :shrug:

niklasR commented 4 years ago

All the points around parsing with the .from() make sense - For Absolute, stick to ISO 8601, but I can see how flexibility in parsing it is helpful for the DateTime, as long as the edge cases and variants are defined and tested. Maybe a "strict" mode that throws if it's not exactly what's expected could be helpful for some? Though I don't know what TC39 think about that.

What I am more concerned about though, is the toString() output. The docs say that it "returns the date in ISO 8601 date format", for both DateTime and Absolute, and as far as I can see ISO 8601 does not include IANA names at all. If a timezone is specified in the toString(), the offset takes that into account, and surely we should stick with the standard?

thojanssens commented 4 years ago

I don't see how LocalDateTime could accept a standard ISO 8601 string

I didn't look into your LocalDateTime proposal yet, but it can convert all given ISO strings to UTC. See in Elixir: https://hexdocs.pm/elixir/DateTime.html#from_iso8601/2

Since ISO 8601 does not include the proper time zone, the given string will be converted to UTC and its offset in seconds will be returned as part of this function. Therefore offset information must be present in the string.

What I am more concerned about though, is the toString() output. The docs say that it "returns the date in ISO 8601 date format", for both DateTime and Absolute, and as far as I can see ISO 8601 does not include IANA names at all.

We should stick to ISO for output at least. I expect the following:

Temporal.now.absolute().toString()

"2020-07-09T03:23:19.986994478Z"

Temporal.now.absolute().toString('-0700')

"2020-07-08T20:23:14.478676279-07:00"

Temporal.now.absolute().toString('America/Los_Angeles')

"2020-07-08T20:24:52.496999986-07:00"

And if we really want to output a non-ISO string (which I think we shouldn't because of my arguments in the other issues, see below)

Temporal.now.absolute().toString('America/Los_Angeles', { timeZone: true })

"2020-07-08T20:24:52.496999986-07:00[America/Los_Angeles]"

With the current API, to output an ISO string from an absolute and a time zone, we have to code:

const absolute = Temporal.now.absolute()
absolute.toString(Temporal.TimeZone.from('America/Los_Angeles').getOffsetStringFor(absolute))

It then looks like we favor the extended (non-ISO) format over ISO.
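
For comparison, the ISO-only output described above can be approximated today with legacy `Date` and `Intl.DateTimeFormat`. This is a rough sketch under stated assumptions: `offsetMinutes`, `formatOffset` and `toOffsetIsoString` are hypothetical helpers, whole-minute offsets are assumed, and precision is limited to milliseconds rather than the nanoseconds shown in the examples above.

```javascript
// Hypothetical helper: UTC offset in minutes for `timeZone` at `instant`.
function offsetMinutes(instant, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hourCycle: 'h23',
    year: 'numeric',
    month: 'numeric',
    day: 'numeric',
    hour: 'numeric',
    minute: 'numeric',
    second: 'numeric',
  }).formatToParts(instant);
  const get = (type) => Number(parts.find((part) => part.type === type).value);
  const wallClockAsUtc = Date.UTC(
    get('year'), get('month') - 1, get('day'),
    get('hour'), get('minute'), get('second')
  );
  return Math.round((wallClockAsUtc - instant.getTime()) / 60000);
}

// Formats an offset in minutes as "+HH:MM" or "-HH:MM".
function formatOffset(minutes) {
  const sign = minutes < 0 ? '-' : '+';
  const abs = Math.abs(minutes);
  const hh = String(Math.floor(abs / 60)).padStart(2, '0');
  const mm = String(abs % 60).padStart(2, '0');
  return `${sign}${hh}:${mm}`;
}

// ISO 8601 string with offset and no bracketed time zone name.
function toOffsetIsoString(instant, timeZone) {
  const minutes = offsetMinutes(instant, timeZone);
  const shifted = new Date(instant.getTime() + minutes * 60000);
  // toISOString() always ends in "Z"; drop it and append the real offset.
  return `${shifted.toISOString().slice(0, -1)}${formatOffset(minutes)}`;
}

const instant = new Date('2020-07-09T03:23:19.986Z');
console.log(toOffsetIsoString(instant, 'America/Los_Angeles'));
// => 2020-07-08T20:23:19.986-07:00
```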

The issue below demonstrates why offset and time zone together in a datetime string lead to problems, to more code to handle those problems, etc. See my arguments why we shouldn't even accept any extended format: https://github.com/tc39/proposal-temporal/issues/716#issuecomment-655870391 It requires an open mind because I know we have been used to seeing the time zone in brackets for a long time.

Another related issue: https://github.com/tc39/proposal-temporal/issues/741

justingrant commented 4 years ago

@niklasR @thojanssens - I copied your feedback into #741. That issue is focused on exactly the topic you're concerned about above, which is whether the time zone is included in the output of Absolute.prototype.toString().

ptomato commented 4 years ago

Here are my opinions on this topic:

sffc commented 4 years ago

I see this thread as circular. On the one hand, it is bug-prone for LocalDateTime to parse an offset time zone. On the other hand, if a LocalDateTime is explicitly built with an offset time zone, we need a way to serialize it in .toString(). We have to either:

  1. Accept a more complicated ISO string syntax (e.g., with the bracketed offset).
  2. Accept the fact that LocalDateTime.parse can be bug-prone.
  3. Programmatically prevent LocalDateTime from ever existing with an offset datetime, e.g., by splitting TimeZone into two types, one for IANA time zones and the other for offset time zones.

in which case the obvious thing to do is LocalDateTime.from(isoString).with({ timeZone: timeZoneName })

In that case, you have two choices that are compatible with the proposal on hand: Absolute.from, or DateTime.from. Pick the one that corresponds to whether you want the offset to win or the datetime to win. Since you usually want the absolute to win, you end up doing:

Temporal.Absolute.from(isoString).toLocalDateTime(timeZoneName, "iso")

Actually, I think your original snippet, starting with LocalDateTime.from, is fundamentally flawed: since the offset time zone and IANA time zone are known at different points, you can't properly resolve the "should offset or datetime win?" question.

ptomato commented 4 years ago

Yes, I'm OK with accepting that LocalDateTime.from can be bug-prone here, I find that the "least worst" option. The bracketed offset seems to harm interoperability, and I would like to avoid adding more types. (Although I learned recently that Java makes a distinction between java.time.ZonedDateTime and java.time.OffsetDateTime for this reason.)

I guess new Temporal.LocalDateTime(Temporal.Absolute.from(isoString), timeZoneName) as currently described in the proof-of-concept branch would do the right thing. Maybe we need to make sure that there is a way to achieve this with Temporal.LocalDateTime.from?

ryzokuken commented 4 years ago

Summarizing my objection: I dislike the idea of catering to what seems to me a hypothetical misunderstanding against the "correct" usage.

justingrant commented 4 years ago

I think the root question is this: in 2020-08-10T03:43:36+05:30, is "+05:30" an offset or a time zone?

My concern with implicitly treating an offset as a time zone is that it silently turns LocalDateTime into DateTime, with all the disadvantages of DateTime math: results will usually look correct but will break around DST transitions.

But I also see the value of being able to easily parse and emit zoneless ISO strings for use-cases like formatting where DST is a non-issue because the underlying instant never changes after it's parsed.

Here are a few ideas to try to address both concerns:

const zoneless = Temporal.LocalDateTime.from('2020-08-10T03:43:36+05:30');
zoneless.toString(); // => 2020-08-10T03:43:36+05:30
zoneless.timeZoneOffsetString; // => "+05:30"
zoneless.timeZone; // => undefined?  null?
zoneless.plus({days: 1, hours: 12}); // throws
zoneless.with({hour: 0}); // throws
zoneless.with({timeZone: 'Asia/Kolkata', hour: 0}); // OK
zoneless.hoursInDay; // throws
zoneless.isTimeZoneOffsetTransition; // throws

const zoned = zoneless.with({timeZone: 'Asia/Kolkata'}); 
zoned.plus({days: 1, hours: 12}); // OK
zoned.toString(); // => 2020-08-10T03:43:36+05:30[Asia/Kolkata]

const offsetZoned = zoneless.with({timeZone: '+05:30'}); // explicit offset time zone
offsetZoned.plus({days: 1, hours: 12}); // OK
offsetZoned.toString();  // => 2020-08-10T03:43:36+05:30[+05:30] (or 2020-08-10T03:43:36+05:30) ? 

AND/OR

AND/OR

@ptomato The bracketed offset seems to harm interoperability, and I would like to avoid adding more types. (Although I learned recently that Java makes a distinction between java.time.ZonedDateTime and java.time.OffsetDateTime for this reason.)

FWIW, java.time.ZonedDateTime (the Java equivalent to Temporal.LocalDateTime) accepts bracketed offsets in parsing, but won't emit them in toString(). See https://repl.it/@JustinGrant/NeglectedGraciousAlgorithms.

import java.time.ZonedDateTime;
class Main {
  public static void main(String[] args) {
    ZonedDateTime withZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[Asia/Kolkata]");
    System.out.println("with zone: " + withZone.toString());
    // => with zone: 2017-06-16T21:25:37.258+05:30[Asia/Kolkata]
    ZonedDateTime offsetOnly = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30");
    System.out.println("offset only: " + offsetOnly.toString());
    // => offset only: 2017-06-16T21:25:37.258+05:30
    ZonedDateTime offsetTimeZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[+05:30]");
    System.out.println("offset time zone: " + offsetTimeZone.toString());
    // => offset time zone: 2017-06-16T21:25:37.258+05:30
  }
}

sffc commented 4 years ago

Should we allow LocalDateTime to parse offset-only strings, but throw if the user calls methods or property getters that require a real time zone?

Personally I would be okay with this. It sounds a lot like the option I offered in #292 dubbed "partial ISO" where calendar-dependent operations would throw if a calendar wasn't specified.

Note that in the previous calendar discussions, others pointed out that given the strong typing nature of Temporal, it might be cleaner to split the type into two explicit types rather than making it have different behavior depending on whether or not the time zone (or calendar) is available.

Should LocalDateTime.from have a defaultTimeZone option

Lemma A: Developers tend to know ahead of time, or can easily obtain, the syntax of the string they are parsing. For example, developers know, or can find out, whether their strings have only an offset (-06:00) or an offset plus time zone (-06:00[America/Chicago]).

Given Lemma A, I do not believe that all of the choices in defaultTimeZone are useful. Instead of passing a TimeZone instance, offset string, or IANA string, the programmer should use Temporal.Absolute.from(str).toLocalDateTime(tz, "iso") instead.

The one choice I think is useful is "offset", allowing the programmer to "opt in" to accepting the offset as a time zone.

If we go with this option, we should discuss the naming of the option and the argument.

IMHO LocalDateTime should have some way to serialize an instance using only DateTime and offset

Without additional API, one can do this by replacing the LocalDateTime's IANA TimeZone with an offset TimeZone.

FWIW, java.time.ZonedDateTime (the Java equivalent to Temporal.LocalDateTime) accepts bracketed offsets in parsing, but won't emit them in toString()

Interesting.

ptomato commented 4 years ago

Should we allow LocalDateTime to parse offset-only strings, but throw if the user calls methods or property getters that require a real time zone?

I feel strongly against this, for the same reason that I felt strongly against the "partial ISO" proposal: for all intents and purposes it's an internal "is this object broken" flag.

Should LocalDateTime.from have a defaultTimeZone option whose value could be a TimeZone instance, an offset string, an IANA string, or a special string (e.g. 'offset') to use the input's offset as the time zone if it's omitted? Default could be to throw if the string doesn't have a bracketed zone.

I'm not in favour of that default (as per last week's discussion, I'm strongly against the default being to throw on strings that can represent a valid use case) but at first glance I think the option is a good idea.

Regardless of how the offset-only parsing is handled, IMHO LocalDateTime should have some way to serialize an instance using only DateTime and offset (w/o bracketed time zone). This is needed for interop. It could be toString({timeZone: false}), toISOString(), etc. I like the former because we could also allow toString({offset: false}) it to omit the offset in cases where users always want the time zone to win when the value is parsed back in.

Not sure how I feel about adding another option for this. It seems like (`${ldt.toDateTime()}${ldt.offsetNanoseconds}`) is easy enough to do in userland if that's necessary?

Developers tend to know ahead of time, or can easily obtain, the syntax of the string they are parsing. For example, developers know, or can find out, whether their strings have only an offset (-06:00) or an offset plus time zone (-06:00[America/Chicago]).

I don't think I agree with this lemma. I think it would encourage naive attempts such as /\[[a-zA-Z]+\/[a-zA-Z]+\]/.test(str) which would match a string in the America/Vancouver time zone but would miss America/Los_Angeles and America/Indiana/Indianapolis. That said, I think I nonetheless agree with the conclusion, that offset and reject might be the only choices needed.
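
The pitfall is easy to reproduce. In the sketch below the first pattern is the naive one quoted above. `hasBracketedAnnotation` is a hypothetical, looser check that only asks whether the string ends in a non-empty bracketed annotation, and it assumes at most one such annotation.

```javascript
// The naive pattern: exactly two purely alphabetic path segments.
const naive = /\[[a-zA-Z]+\/[a-zA-Z]+\]/;

const samples = [
  '2020-08-12T09:40-07:00[America/Vancouver]',
  '2020-08-12T09:40-07:00[America/Los_Angeles]',
  '2020-08-12T12:40-04:00[America/Indiana/Indianapolis]',
];

for (const s of samples) {
  console.log(naive.test(s), s);
}
// true  ...[America/Vancouver]
// false ...[America/Los_Angeles]           (underscore)
// false ...[America/Indiana/Indianapolis]  (three segments)

// Hypothetical looser check: any non-empty bracketed suffix counts, which
// also covers offset time zones such as [+05:30].
const hasBracketedAnnotation = (s) => /\[[^\]]+\]$/.test(s);

console.log(samples.every(hasBracketedAnnotation));                  // true
console.log(hasBracketedAnnotation('2020-08-12T09:40-07:00'));       // false
console.log(hasBracketedAnnotation('2020-08-10T03:43+05:30[+05:30]')); // true
```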

sffc commented 4 years ago

I don't think I agree with this lemma. I think it would encourage naive attempts such as /\[[a-zA-Z]+\/[a-zA-Z]+\]/.test(str) which would match a string in the America/Vancouver time zone but would miss America/Los_Angeles and America/Indiana/Indianapolis.

What I meant is, I claim that most of the time, the developer knows while writing the code what format the strings are going to be in. More often than not, the strings come from the same source, whether that's a Postgres database, a CSV file, a JSON API, a date picker component, etc. The developer can look at the typical output from that source to see what is the correct Temporal parsing function to use.

ptomato commented 4 years ago

Ah, gotcha.

I also forgot to say in my earlier comment that if java.time.ZonedDateTime accepts the weird extra offset in brackets, that makes it less weird, and I would be more easily convinced to make Temporal's behaviour the same.

justingrant commented 4 years ago

This is a long thread so I'll summarize my concern: users shouldn't be able to perform math operations (or other DST-sensitive operations like .hoursInDay) without explicitly including a time zone in the string or object initializer in LocalDateTime.from.

For example, the code below should throw because it's trying to perform hybrid-duration math on an implicitly defined offset time zone. This is almost certain to be a bug.

Temporal.LocalDateTime.from('2017-06-16T21:25:37.258+05:30').plus({days: 1, hours: 12});

As long as the code above isn't allowed, I'm open to many different solutions, including:

  1. an option that allows users to opt in to implicit offset time zones (with default of 'reject' or false)
  2. require a bracketed time zone (either an IANA name or an offset) when parsing from a string
  3. Same as (2), but when parsing also accept a [] suffix to mean "use the offset as the time zone". This would be easier than callers having to parse the input string in order to pull out the offset so that it could be repeated in brackets. The [] syntax isn't accepted by Java's parser, though, so toString() would either have to omit the brackets or would put the offset in brackets.
  4. don't throw on .from, but throw on DST-sensitive methods instead. @ptomato really doesn't like this one. ;-)
  5. a Temporal.OffsetDateTime type (like Java has) which is a subset of LocalDateTime that omits math and other DST-sensitive methods. Its goal is simply an ergonomic, read-only wrapper around a (DateTime,Absolute) pair. I'm not a huge fan of this, given that it doesn't seem to add much value over what DateTime and Absolute already provide.

My preferred solution would be either (1) or (3), but I don't feel that strongly as long as the buggy math is disallowed.
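
A rough sketch of how options (1) and (3) could fit together when choosing the time zone for a parsed string. `resolveTimeZoneId` and its `implicitOffset` option are hypothetical names for illustration. They are not the proposal's API, and the regular expression handles only the simple string shapes used in this thread.

```javascript
// Decides which time zone identifier a string implies.
//  - "...+05:30[Asia/Kolkata]" -> "Asia/Kolkata"
//  - "...+05:30[+05:30]"       -> "+05:30" (explicit offset time zone)
//  - "...+05:30[]"             -> "+05:30" (option 3: empty brackets opt in)
//  - "...+05:30"               -> throws, unless implicitOffset is 'use'
function resolveTimeZoneId(isoString, { implicitOffset = 'reject' } = {}) {
  const match = /^(.+?)(Z|[+-]\d{2}:\d{2})(?:\[([^\]]*)\])?$/.exec(isoString);
  if (!match) {
    throw new RangeError(`Cannot find an offset in '${isoString}'.`);
  }
  const offset = match[2] === 'Z' ? 'UTC' : match[2];
  const bracket = match[3];
  if (bracket === undefined) {
    if (implicitOffset === 'use') return offset;
    throw new RangeError(
      `'${isoString}' has an offset but no bracketed time zone. ` +
        "Pass { implicitOffset: 'use' } to treat the offset as the time zone."
    );
  }
  return bracket === '' ? offset : bracket;
}

console.log(resolveTimeZoneId('2017-06-16T21:25:37.258+05:30[Asia/Kolkata]'));
// => Asia/Kolkata
console.log(resolveTimeZoneId('2017-06-16T21:25:37.258+05:30[]'));
// => +05:30
console.log(resolveTimeZoneId('2017-06-16T21:25:37.258+05:30', { implicitOffset: 'use' }));
// => +05:30
```

With the default of 'reject', the bracketless string throws at parse time, which is the behavior that blocks the buggy `plus` call above.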

IMHO the toString behavior is less consequential than the parsing behavior because there's no possibility to end up with DST-unsafe behavior regardless of how toString behaves. The worst possible outcome of toString would be breaking round-trip string serialization, which honestly doesn't seem that bad.

Depending on how we solve this problem, we may or may not want to offer the same behavior for object initializers. For example, if we choose (1) above, then I assume we'd want to offer the same option for object initializers.

@ptomato I also forgot to say in my earlier comment that if java.time.ZonedDateTime accepts the weird extra offset in brackets, that makes it less weird, and I would be more easily convinced to make Temporal's behaviour the same.

Here's another interesting tidbit: Java doesn't require the offsets to match. In other words, internally its parsing will parse the DateTime+offset string into a java.time.Instant using the provided offset, but will use the bracketed offset timezone when it comes time to serialize to string. In other words, Java's parsing has no special-casing for offset time zones-- they're treated just like any other time zone. Furthermore, there's no attempt to verify when parsing that the offset is valid for the time zone-- either for offset time zones or IANA ones. AFAIK, Java's behavior corresponds to the {offset: 'use'} option in the current implementation of LocalDateTime.from.

ZonedDateTime offsetTimeZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[+05:30]");
System.out.println("offset time zone: " + offsetTimeZone.toString());
// => offset time zone: 2017-06-16T21:25:37.258+05:30
ZonedDateTime mismatchedOffsetZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[+06:30]");
System.out.println("mismatched offset time zone: " + mismatchedOffsetZone.toString());
// => mismatched offset time zone: 2017-06-16T22:25:37.258+06:30
ZonedDateTime invalidOffsetForZone = java.time.ZonedDateTime.parse("2017-06-16T21:25:37.258+05:30[America/Los_Angeles]");
System.out.println("invalid offset for zone: " + invalidOffsetForZone.toString());
// => invalid offset for zone: 2017-06-16T08:55:37.258-07:00[America/Los_Angeles]
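
The mismatched-offset result can be reproduced with plain arithmetic: the offset in the string fixes the instant, and the bracketed offset only controls how that instant is rendered. The sketch below uses hypothetical helpers (`parseOffset`, `formatOffset`, `reserialize`) and covers offset time zones only, not IANA names.

```javascript
// Parses "+HH:MM" or "-HH:MM" into minutes east of UTC.
function parseOffset(offset) {
  const match = /^([+-])(\d{2}):(\d{2})$/.exec(offset);
  if (!match) throw new RangeError(`Invalid offset: ${offset}`);
  const sign = match[1] === '-' ? -1 : 1;
  return sign * (Number(match[2]) * 60 + Number(match[3]));
}

// Formats minutes east of UTC as "+HH:MM" or "-HH:MM".
function formatOffset(minutes) {
  const sign = minutes < 0 ? '-' : '+';
  const abs = Math.abs(minutes);
  const hh = String(Math.floor(abs / 60)).padStart(2, '0');
  const mm = String(abs % 60).padStart(2, '0');
  return `${sign}${hh}:${mm}`;
}

// The instant comes from the string's own offset. The result is that same
// instant expressed in `zoneOffset`.
function reserialize(isoWithOffset, zoneOffset) {
  const instant = new Date(isoWithOffset);
  const minutes = parseOffset(zoneOffset);
  const shifted = new Date(instant.getTime() + minutes * 60000);
  return `${shifted.toISOString().slice(0, -1)}${formatOffset(minutes)}`;
}

console.log(reserialize('2017-06-16T21:25:37.258+05:30', '+05:30'));
// => 2017-06-16T21:25:37.258+05:30
console.log(reserialize('2017-06-16T21:25:37.258+05:30', '+06:30'));
// => 2017-06-16T22:25:37.258+06:30
```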

@ptomato Not sure how I feel about adding another option for this. It seems like (`${ldt.toDateTime()}${ldt.offsetNanoseconds}`) is easy enough to do in userland if that's necessary?

I don't feel very strongly about this one, so I'm inclined to agree with you. We could always add this later if this is a source of user confusion. BTW, the actual code is a little different:

`${ldt.toDateTime()}${ldt.timeZoneOffsetString}`

@ptomato I'm not in favour of that default (as per last week's discussion, I'm strongly against the default being to throw on strings that can represent a valid use case)

Could you explain your position in more detail? From my perspective, allowing plus on a LocalDateTime that was created with an implicit offset time zone is very, very unlikely to be a "valid use case". There's precedent elsewhere in Temporal to throw in cases where ambiguity is present and we're concerned that there's no safe default, e.g. if there's a conflict between the offset and the time zone. Here, the ambiguity is whether the offset is just an offset or is an offset time zone. Why is this ambiguity case different?

Per @sffc's comments above (which I agree with), developers are likely to know the format of the strings that they're parsing, so recovering from an exception will be trivial in most cases. This is unlike, for example, offset vs. timezone conflicts which by definition will only show up after an app has been in production long enough for time zone rules to change. If the caller gets an exception the first time they call .from with a zoneless string, it doesn't seem like a big obstacle. Could you explain more about why you are concerned that it's problematic to throw by default?

BTW, I agree that some operations are safe (aka "valid use case") on a LocalDateTime with an implicit offset time zone, but that seems like an argument for a separate OffsetDateTime type or the "partial-ISO-like" solution. IMHO, throwing by default seems to be the less confusing solution relative to either of those alternatives.

ptomato commented 4 years ago

I think my disconnect with your explanation boils down to: I don't see 2020-08-12T09:40-07:00 as a string with an offset and an "implicit" offset time zone. I just see it as a string with an offset time zone. If we give it an offset time zone object when we parse it, then we are not making any unwarranted assumptions, we are just doing what it says on the tin. @pipobscure pointed out in the meeting last week that such a string does represent a valid use case, in maritime shipping (https://en.wikipedia.org/wiki/Nautical_time). If we instead assume that the user means something else when they specify such a string, then we are requiring programmers to opt in to the correct behaviour, just because it is uncommon. That's what I object to.

(At least, the above point about nautical time zones holds if the offset is whole-hour. I'd maybe be fine with throwing on 2020-08-12T09:40-07:15, if that wouldn't make things more confusing...)

justingrant commented 4 years ago

@ptomato I don't see 2020-08-12T09:40-07:00 as a string with an offset and an "implicit" offset time zone. I just see it as a string with an offset time zone. If we instead assume that the user means something else when they specify such a string

I think that you're highlighting a core challenge with the bracketless syntax: it's ambiguous. Reasonable people can reasonably disagree about whether -07:00 above means an "offset" (meaning that it only applies to this LocalDateTime) or a "time zone" (meaning that it should be applied to other LocalDateTime values derived from this one, like the result of plus or with).

I don't think that either interpretation is wrong. Both have pros and cons. But I'm also confident that we'll see both interpretations among users. Many developers simply won't understand the difference. Given that both interpretations exist, my preference is for "offset" because it's easier for developers to realize that they have the "wrong" interpretation:

If we did go with "offset" (meaning LocalDateTime.from requires an opt-in suffix, e.g. [-07:00] or []), the developers who would get worse ergonomics would be developers legitimately using offset time zones, e.g. for ocean shipping. This is admittedly a rare case. Parsing and serialization would each require one extra method call:

Temporal.Absolute.from(s).toLocalDateTime(s);
`${ldt.toDateTime()}${ldt.timeZoneOffsetString}`

The non-rare case is where the source data simply doesn't have time zone information. It's just a DateTime+Offset value that was lossily stored. For example, AFAIK no DBMS today has a native data type that stores DateTime+offset+TimeZone, so lossy storage is the norm. These values are absolutely not safe for LocalDateTime math, with, etc. So we should really be pushing these use cases to DateTime and/or Absolute instead. If the data doesn't have a real time zone, LocalDateTime adds little/no value, and can actually make things worse via DST bugs.

@ptomato such a string does represent a valid use case, in maritime shipping (https://en.wikipedia.org/wiki/Nautical_time).

FWIW, there are specific Etc/* IANA time zones for all 24 nautical time zones. The names of these time zones (e.g. GMT+7 which is -07:00) are sign-reversed which apparently aligns with nautical usage if I'm interpreting the Wikipedia article correctly.

@sffc The developer can look at the typical output from that source to see what is the correct Temporal parsing function to use.

If we went with "time zone" then what would our docs for the 2020-08-12T09:40-07:00 format say? Something like this?

This format creates an Absolute. It can also be parsed into a DateTime, Date, YearMonth, MonthDay, or Time.

NOTE: this format can also be parsed by LocalDateTime. The offset is treated as the time zone. This is appropriate for naval communications using nautical time zones and other rare use cases. But it's not appropriate for most mainstream use cases because LocalDateTime requires a real time zone to adjust for DST. For most mainstream use cases, use Absolute or DateTime, or use LocalDateTime with a real time zone.

One of the reasons I'm pushing for "offset" is to avoid the need for that second paragraph. ;-)

pipobscure commented 4 years ago

I’ll divide my response into 2 parts:

  1. An argument that does not go toward the actual topic, but aims at the validity of your argument
  2. An argument on the merits alone.

Any argument that calls upon “the 99% case” is prima facie spurious unless actually providing data and evidence to support the statement that this is in fact “the 99% case”. The argument above fails to do so.

To quote C.Hitchens: Any argument presented without evidence can be dismissed without evidence

But things get worse: the little bit of evidence provided above:

AFAIK there's no DBMS today that has a native data type that stores DateTime+offset+TimeZone, so lossy storage is the norm.

could just as well serve as evidence for the other side of the argument; the fact that no DBMS has such a type indicates that “the 99% case” is that people don’t care about doing DST-correct calculations; on the contrary, that is something only the 1% of people writing calendaring & scheduling software ever care about. The “actual 99% case” is the one where “correct DST” calculations are actually unintended.

Note: I did not provide any evidence for this assertion and am fine with it being dismissed without evidence. Its sole intent was to demonstrate that the argument above was just as spurious and to be dismissed.

The argument on the merits is slightly different. Here are a few of the assumptions I am making:

Based on these assumptions the conclusion must be that all temporal types should be operable without the bracket extension. This most definitely includes LocalDateTime.

Beyond that your use-case for LocalDateTime and what you imagine its distinguishing virtues to be are by far not the only ones. As such the statement:

... LocalDateTime adds little/no value, and can actually make things worse via DST bugs.

seems blatantly incorrect to me; it’s simply not a use-case you operate with.

For the same reason I also object to the statement (from proposed documentation):

... But it's not appropriate for most mainstream use cases because LocalDateTime requires a real time zone to adjust for DST. For most mainstream use cases, use Absolute or DateTime, or use LocalDateTime with a real time zone.

To my mind the only correct part of that “second paragraph” is:

NOTE: this format can also be parsed by LocalDateTime. The offset is treated as the time zone.

Which is basically just saying:

We have created a non-standard bracket extension which we define to give additional IANA information.

and that’s needed no matter what because of the fact that we defined the non-standard bracket extension. So that “second paragraph” is necessary either way.

One could even argue that this “second paragraph” would be even more necessary for the case where we don’t accept pure ISO strings, except it would have to read:

ATTENTIONS!!! DANGER!!! LocalDateTime requires the use of a non-standard ISO/RFC string with an extension that we invented. It is therefore not interoperable with anything else in the world. So please pay attention and don’t ever use it.

And that in my mind would be the point where we’d have to reevaluate whether such a non-standard thing should be in Temporal at all.

Given the general usefulness of LocalDateTime however I’d very much regret going that route. As such I’ll stick to the proposal that we simply accept the correct simple syntax for strings that do not include a bracketed IANA zone but rather only an offset.


I hope this exposition makes as much sense to people reading it as it did when I was writing it in 33C weather. If not, I’m happy to discuss live.

sffc commented 4 years ago

If the IANA "Etc" time zones cover the nautical time use case, can we just remove the concept of arbitrary offset time zones from the spec? People can still implement them as a custom time zone if they really want them.

justingrant commented 4 years ago

I admit that I'm having trouble figuring out exactly where we agree vs. disagree on this thread. Could we try to clarify? I'll list a few statements below-- let me know which you agree vs. disagree with. They're roughly in order-- if you don't agree with one, then you probably won't agree with ones below it.

Assumption 1: Offsets are different than time zones

An "offset" applies to only one single DateTime, but it has nothing to say about the offsets of other DateTime values. Knowing the offset of one DateTime does not let you know what the offset will be one day later.

On the other hand, a "time zone" can be used to calculate the offsets of other values derived from this one, e.g. via plus, minus, difference, with, startOfDay, hoursOfDay, etc.

An "offset time zone" is a time zone that always has the same constant offset. UTC and -07:00 are examples of offset time zones.

Assumption 2: Offsets act just like time zones for static date/time values

Offsets and offset time zones act the same unless mutation is involved. Time zones only matter if you change the value, so operations that don't change the value (e.g. .year or compare()) work just as well for values with offsets vs. values with time zones.

Assumption 3: Offsets that are not timezones are unsafe for DST-sensitive operations like plus

Operations that create new values, like plus, minus, with, startOfDay, hoursInDay, etc., will not return correct results if an offset that's not a time zone is assumed to be a time zone. This is not specific to Temporal; it applies to any platform. For example, if arrival_time is a DATETIMEOFFSET-typed column that originally had a real time zone before being stored in SQL Server, then the query below will cause DST bugs:

SELECT DATEADD(day, 1, arrival_time) from flights
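
The same bug can be reproduced outside SQL. The sketch below uses only the built-in Date and Intl APIs rather than Temporal. `offsetMinutes`, `plusOneDayFixedOffset`, and `plusOneDayInZone` are hypothetical helpers, and the zone-aware version is deliberately simplified: it assumes whole-second instants and does not handle wall-clock times that fall inside a DST gap.

```javascript
// Hypothetical helper: UTC offset (in minutes) of `timeZone` at `instant`.
function offsetMinutes(timeZone, instant) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hourCycle: 'h23',
    year: 'numeric', month: 'numeric', day: 'numeric',
    hour: 'numeric', minute: 'numeric', second: 'numeric',
  }).formatToParts(instant);
  const f = {};
  for (const { type, value } of parts) {
    if (type !== 'literal') f[type] = Number(value);
  }
  // Reinterpret the zone's wall-clock fields as if they were UTC.
  const wallAsUTC = Date.UTC(f.year, f.month - 1, f.day, f.hour, f.minute, f.second);
  return Math.round((wallAsUTC - instant.getTime()) / 60000);
}

// "Offset treated as a time zone": one day is always exactly 24 hours.
function plusOneDayFixedOffset(instant) {
  return new Date(instant.getTime() + 24 * 60 * 60 * 1000);
}

// Real time zone: keep the same wall-clock time on the next calendar day.
function plusOneDayInZone(instant, timeZone) {
  const before = offsetMinutes(timeZone, instant);
  const naive = plusOneDayFixedOffset(instant);
  const after = offsetMinutes(timeZone, naive);
  return new Date(naive.getTime() - (after - before) * 60000);
}

// 2020-03-07T12:00-05:00, the day before US DST starts.
const arrival = new Date('2020-03-07T17:00:00Z');
const fixedOffsetResult = plusOneDayFixedOffset(arrival);
const zoneResult = plusOneDayInZone(arrival, 'America/New_York');

console.log(fixedOffsetResult.toISOString()); // 2020-03-08T17:00:00.000Z
console.log(zoneResult.toISOString());        // 2020-03-08T16:00:00.000Z
```

With the stored -05:00 offset the result lands at 13:00 on New York wall clocks, while the zone-aware result stays at 12:00. That one-hour discrepancy is the DST bug.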

Assumption 4: many (most?) developers won't understand the subtle difference between offsets and offset time zones

Given developer confusion about time zones and DST in general, I expect that many (maybe most) developers won't understand the subtle difference between offsets and time zones, and specifically they may not understand why math with a timezone-less offset is buggy.

Assumption 5: There are two main use cases for offsets

AFAIK, there are two main use cases using offsets:

A. Reading data that originally had a time zone but was lossily stored in a relational database or any other platform (e.g. .NET) that doesn't natively store time zone info along with DateTime+offset data.

B. Ocean shipping communications or similar applications using offset time zones instead of IANA Etc zones.

Assumption 6: the "Lossy Storage" use case's offset is not a time zone

If a DateTime+offset is representing a value that originally had a time zone, but the time zone was lost in persistence, then its offset is just an offset, not a time zone. This means that it's not safe to do math or other DST-sensitive operations using that offset.

Assumption 7: The "Lossy Storage" use case is much more common than "Ocean Shipping" use case, but both are important

We don't need research to know that storing temporal data in a SQL database is much more popular than oceanic transport and other similar use cases. We can argue about whether the ratio is 20:1, 100:1, or 1000:1, but I'm not sure that the actual ratio matters much, as long as we can agree that:


It's not 33C here (@pipobscure where are you vacationing?) but I have had a few lagers during Zoom calls this evening so I'll end this here before I start assuming even crazier things. ;-)

If we disagree on any of the above, let's try to resolve those disagreements before moving on to debating conclusions.

ptomato commented 4 years ago

I will be away from the computer next week but in order to speed things along here are my answers:

Assumption 1: Offsets are different than time zones

Agree, but I'm not sure it's relevant. I might put it as "offset is a part of a time zone"

Assumption 2: Offsets act just like time zones for static date/time values

Agree, but this I'd maybe also put differently: "if you only have the offset of a time zone, that's enough information for a static date/time value"

Assumption 3: Offsets that are not timezones are unsafe for DST-sensitive operations like plus

Agree, and this I'd also word differently: "if you only have the offset of a time zone, that's not enough information to do arithmetic"

Assumption 4: many (most?) developers won't understand the subtle difference between offsets and offset time zones

Agree

Assumption 5: There are two main use cases for offsets

Disagree, offsets are a part of a time zone, and there are two main kinds of time zones: offset time zones and IANA time zones.

Assumption 6: the "Lossy Storage" use case's offset is not a time zone

Agree. It's a time zone that's missing some information

Assumption 7: The "Lossy Storage" use case is much more common than "Ocean Shipping" use case, but both are important

Agree

So since I agree with most of the assumptions, how do I come to a different conclusion? I think the important assumption for me is that a date/time string with an offset is already an accepted way to express the ocean shipping use case, so prioritizing the lossy storage use case, even if it is more common, will break the ocean shipping use case. Whereas on the other hand it would merely be inconvenient for the lossy storage case, and not even more inconvenient than it already is in other date/time facilities.

pipobscure commented 4 years ago

Agree, and this I'd also word differently: "if you only have the offset of a time zone, that's not enough information to do arithmetic"

I generally agree with most of what you said except this one. You could have enough information to do arithmetic, but aren’t guaranteed to.

Example: America/Phoenix and -07:00 are functionally fully equivalent.

So I think the above statement is way too broad. And that’s also why I bring those objections.

pipobscure commented 4 years ago

  1. Nope: A timezone consists of a set of 1 or more offsets and the rules for when to use which. So for the case where the number of offsets in a timezone == 1 they are functionally the same thing.

  2. Nope: whether a date is static or not is irrelevant. The number of offsets per timezone is the relevant thing.

  3. Nope: That statement is too broad. If the timezone should have had more than 1 offset that may be true. But that’s far from a given.

  4. No idea. Relevance?

  5. Nope: There is 1 use-case. Doing stuff with a timezone/offset. The lossy storage thing isn’t a use-case it’s just a bug from the very start. It’s a bug in storage, conception, and pretty much everything else.

  6. See above. It’s not a use-case; it’s a bug before it ever touches JS

  7. No idea. It’s still a bug before it ever gets to JS and is therefore not something we can solve. What we certainly should not be doing is breaking perfectly fine use-cases in a vain attempt to fix something entirely outside the realm of JS. It just makes us sound like stereotypical “Beauty-Queens for World-Peace”

sffc commented 4 years ago

FWIW, my mental model aligns most closely with @justingrant's list of assumptions.

Time zones are different from offsets, even time zones that have only one offset. America/Phoenix is the time according to the local laws in that city; -07:00 means to adjust UTC time by 7 hours, which happens to correspond to the wall clock in Phoenix at all times of the year.

The local laws of Phoenix could change, and America/Phoenix would change accordingly. However, offsets are forever immutable.

justingrant commented 4 years ago

I'm still not 100% sure we've uncovered the core of our disagreement, but I think we're getting closer. I'll try to state something that I think @pipobscure and @ptomato will disagree with, and then I'll try to guess why you each will disagree with it. Here's the statement I think you'll disagree with at least part of:

There are four distinct concepts, each of which covers different use cases and will (at least in some cases) exhibit different behavior:

A. An offset - Allows translating one single DateTime into an Absolute, or vice versa. Might (or might not) be sufficient to accurately derive other values e.g. via .plus({days: 1, hours: 12}).

B. A time zone - rules for translating any DateTime into an Absolute, or vice versa. Always sufficient to accurately derive other values, e.g. via .plus({days: 1, hours: 12}).

C. An offset time zone - a time zone based on a constant offset (e.g. -07:00, or UTC)

D. A geographic time zone - a time zone based on the past, present, and future laws in a particular part of Earth, e.g. America/Phoenix

@ptomato - from your comments above, I expect that you'd disagree that (A) is different from (C), but otherwise you'd agree that each of the others are distinct concepts with distinct behavior and use cases. Is this correct?

@pipobscure - from your comments above, I expect that you'd disagree with all 4 categories above. Instead, I predict that you'd prefer two different categories: "single offset time zones" like -07:00 or America/Phoenix and "multi-offset time zones" like America/New_York. Is this correct?

justingrant commented 4 years ago

@sffc The local laws of Phoenix could change, and America/Phoenix would change accordingly.

Yep. Not only "could change", but did change:

image

For both past and future values, it's unwise for users to assume that geographic time zones and offset time zones will always be equivalent.
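
A quick way to see the divergence, as a sketch using the built-in Intl API rather than Temporal: the IANA tz database records that Phoenix observed DST in 1967, so America/Phoenix and a fixed -07:00 offset disagree for that summer. This assumes the runtime's Intl data includes pre-1970 history; `wallClockHour` is a hypothetical helper.

```javascript
// Hypothetical helper: the wall-clock hour (0-23) in `timeZone` at `instant`.
function wallClockHour(timeZone, instant) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hour: 'numeric',
    hourCycle: 'h23',
  }).formatToParts(instant);
  return Number(parts.find((part) => part.type === 'hour').value);
}

// Noon UTC on the same calendar date, 53 years apart.
const phoenix1967 = wallClockHour('America/Phoenix', new Date('1967-07-01T12:00:00Z'));
const phoenix2020 = wallClockHour('America/Phoenix', new Date('2020-07-01T12:00:00Z'));

// A constant -07:00 offset would give 05:00 in both years.
console.log(phoenix1967, phoenix2020);
```

In 2020 the zone matches -07:00 (05:00 local), but in July 1967 it was on -06:00 (06:00 local).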

pipobscure commented 4 years ago

Maybe it’s useful to think about the whole Time thing from a historical perspective.

At first there was no time counting as such. One would calculate in terms of dawn, noon and dusk. But once we start measuring time, noon becomes our point of reference (sun in zenith as observable). This happens at different absolute times in different places. So the true reference for any locale (in the western world) would have been the clock in the church tower. As such the timezone would have to be a fixed minute offset.

Only with the advent of trains is it useful to have a shared time across a wider area. At this point we are still not operating with legal timezones, but rather corporate timezones. (Royal Navy Time, Amtrak Time, ...) What these all share is that they are all just a fixed offset from UTC (usually dependent on celestial noon at the company headquarters).

Enter the greatest villain time has ever had: Benjamin Franklin and his invention of daylight savings time or at the very least inspiration of George Hudson the cretin who proposed modern DST.

All of a sudden, for the first time, a timezone can contain more than one offset. Up until this point a geographic location might change its offset, but that was more akin to changing its timezone rather than the offset changing within the timezone.

Example: there is an offset switch of a couple of minutes in London in the 1780s. But that’s because parliament changed the country to use Royal Navy Time (Greenwich east London) rather than Royal Palace Time (Richmond west London); so more a change of timezone rather than a change in timezone.

So up until around 1915 a timezone basically consisted of a single fixed offset from UTC. And to an extent this holds true even today. There is “Eastern Standard Time” which is a timezone. There is also “Eastern Daylight Savings Time” which is another timezone. So even now there is very little reason to claim that a timezone has more than a single offset. If the offset ever changes, then that’s something that only happens in rather long intervals (multiple years/decades).

So up until now timezones have always existed with a single offset, albeit they may only apply in a location for a part of the year. It would still be valid, however, to specify a time on December 1st in “Eastern Daylight Savings Time”; it just would not correlate to what the clocks say in New York.

Which is where Olson comes in (whose work is later taken over by IANA). The aim of that work is to enable mapping a location on earth to a timezone. This results in “Olson-Timezones” which are specified with a continent/city tuple. We have since taken to calling these things IANA Timezones, or Timezones for short, but they aren’t. They are the bastard children of timezones. Nevertheless they are very useful BECAUSE they map a place (city) to a specific timezone. In fact we love these so much that we decided to make them directly accessible as the main TimeZone in Temporal.

And still, ISO8601 and a lot of other prior art is NOT using IANA timezones as their first choice. And while I’m happy and eager to have IANA zones be the primus inter pares, that does not mean that it’s OK to break any workflows that do not use them. And that’s the long-winded explanation with history of why I’m so vehemently opposed to forcing LocalDateTime to reject strings with missing bracket IANA names.

Here is what would convince me otherwise:

  1. The name would have to be IANADateTime
  2. There would have to be an OffsetDateTime

the first would make the object much less accessible I fear, and the second would add nothing of value that a LocalDateTime without this urge to reject perfectly valid data doesn’t supply.

ptomato commented 4 years ago

There are four distinct concepts, each of which covers different use cases and will (at least in some cases) exhibit different behavior:

A. An offset - Allows translating one single DateTime into an Absolute, or vice versa. Might (or might not) be sufficient to accurately derive other values e.g. via .plus({days: 1, hours: 12}).

B. A time zone - rules for translating any DateTime into an Absolute, or vice versa. Always sufficient to accurately derive other values, e.g. via .plus({days: 1, hours: 12}).

C. An offset time zone - a time zone based on a constant offset (e.g. -07:00, or UTC)

D. A geographic time zone - a time zone based on the past, present, and future laws in a particular part of Earth, e.g. America/Phoenix

@ptomato - from your comments above, I expect that you'd disagree that (A) is different from (C), but otherwise you'd agree that each of the others are distinct concepts with distinct behavior and use cases. Is this correct?

In my opinion:

  1. A = C (at least, for ISO strings)
  2. C ⊂ B
  3. D ⊂ B

justingrant commented 4 years ago

Quick questions for @ptomato and @pipobscure: does this issue only apply to string parsing, or to objects too? Specifically:

  • Should the timeZone prop be optional if timeZoneOffsetNanoseconds is provided in the property-bag variant of from? I don't have a strong opinion about this one because I can't think of a use case that it helps or hurts.
  • Should passing timeZoneOffsetNanoseconds to with result in downgrading the time zone to that offset? My opinion: no, because this would break the case of changing the offset of an ambiguous time to the "other" offset while not changing the time zone.
  • If the time zone is an offset time zone, then should getFields omit the timeZone prop in its output? My opinion: no, because it means that the output of getFields would be inconsistent which may not be expected by code using the result.

justingrant commented 4 years ago

More questions: What advantage does LocalDateTime provide over just Absolute and/or DateTime for the use case where you don't know the IANA time zone?

If you do know the IANA time zone but stored it separately from the ISO string (as is common with DBMS storage), then when you're deserializing you probably shouldn't go through LDT anyways. Instead, you should probably use Temporal.Absolute.from(isoString).toLocalDateTime(tz).

So I'm trying to understand the Venn diagram intersection of use cases where the IANA zone isn't known but where you do benefit from LDT features beyond what you can get from DateTime and/or Absolute.

What are those use cases?

And are those use cases common enough to outweigh the risk of users accidentally ending up in a DST-unsafe state without an easy (aka throwing) way to detect that they're in a DST-unsafe state?

pipobscure commented 4 years ago

The issue I have with the analysis is that your first two points are not true.

plus, minus, difference, with, hoursInDay, isTimeZoneOffsetTransition, etc. all can return inaccurate results unless the underlying data is from a jurisdiction without DST and the developer doesn't care about past or future offset changes in that jurisdiction.

That is simply not true; the results will all be entirely accurate for the given data. Your claim is akin to claiming:

What do we do for numbers without a decimal point? The results for plus, minus, times and divide will all be inaccurate unless those numbers resulted from an underlying number system that dictates only whole numbers.

So as to that Venn diagram: It contains any situation where you want to deal with a real event that happened in the real world, where you want to do stuff with dates/times and not lose the timezone.

pipobscure commented 4 years ago

Quick questions for @ptomato and @pipobscure: does this issue only apply to string parsing, or to objects too? Specifically:

  • Should the timeZone prop be optional if timeZoneOffsetNanoseconds is provided in the property-bag variant of from? I don't have a strong opinion about this one because I can't think of a use case that it helps or hurts.
  • Should passing timeZoneOffsetNanoseconds to with result in downgrading the time zone to that offset? My opinion: no, because this would break the case of changing the offset of an ambiguous time to the "other" offset while not changing the time zone.
  • If the time zone is an offset time zone, then should getFields omit the timeZone prop in its output? My opinion: no, because it means that the output of getFields would be inconsistent which may not be expected by code using the result.

Agreed on all. To point 2: to change a timezone to one that only has an offset, you’d have to pass in the offset as a timezone. Also, if only timeZoneOffsetNanoseconds is passed in and the resulting thing does not exist, it should throw!

justingrant commented 4 years ago

@pipobscure I love your history writeup! Let's discuss the decimal point thing in real time later today. I keep thinking that we're actually agreeing about much of this but I'm not sure I understand yet where the disagreement is.

Maybe the core problem is that there are multiple valid meanings of "time zone"? If a time zone is simply an offset (like everyone thought before Olson, including the designers of all major relational databases) then the idea that time zones can change their offsets is crack-smoking crazy! Similarly, if one thinks that time zones exist mainly to simplify dealing with the scourge that is DST, then offset-only time zones seem inherently dangerous and should come with opt-in requirements and warning signs. I'm in the latter camp but that doesn't mean that the former camp is wrong. Temporal should support both. I do think we should try to help developers who get confused between those two types.

The other thing I'm stuck on is that the current LocalDateTime proposal makes a promise: if you have a LocalDateTime instance then you can always perform DST-safe arithmetic on it. This promise allows libraries and other arms-length code to accept LDT instances and DST-safely "immutate" them via arithmetic, with, etc. without worrying (or at least worrying less) about DST bugs.

Even if the time zone on the LDT instance is an offset time zone, in the current LDT proposal the only way to get that offset timezone into an LDT is to manually create it (e.g. Temporal.TimeZone.from('-07:00')) or to put the offset in brackets when parsing. Either of those is a clear opt-in signal that the developer is intentionally ignoring DST, which I think is compatible with the "promise" I mentioned above.

@ptomato, this is why I see a difference between an "offset" and an "offset time zone". I see the latter as an explicit developer opt-in that says "It's OK to use this offset to calculate the offsets of other related values", while the former implies that the developer's intent is not yet known.

Anyway, here's a few possible options we can discuss.

1. Emit brackets by default in toString, require brackets by default in LocalDateTime.from, and add a from option (e.g. requireIANAZone: true (default) | false) to make brackets optional. Passing the non-default from option value means to treat the offset as an offset time zone if there are no brackets. If needed, we could also add an opt-in toString option that omits brackets. This is my preferred option, although I think (2) and (3) are OK too.

2. Start with (1) but remove the requireIANAZone option. Instead, opting in to offset time zones is less ergonomic: Temporal.Absolute.from(isoString).toLocalDateTime(Temporal.TimeZone.from(isoString)).

3. Parse bracketless strings as a "null time zone". LDT methods that only need data from LDT slots should work (e.g. all DateTime fields, all conversion methods, formatting, the offset string/ns properties, etc.), but any methods that require the time zone should throw. The .timeZone property would return null, but it'd be easy to transform to a non-null offset time zone by opting in, e.g. Temporal.LocalDateTime.from(isoString).with({ timeZone: Temporal.TimeZone.from(isoString) }). @ptomato I know you're not a fan of this approach. (Although playing devil's advocate, aren't obvious exceptions better than silent DST bugs that only show up in production?)

4. Add a new Temporal.OffsetDateTime type. There's a high bar to add new types, so I'm not sure that this type would add enough value over DateTime or Absolute to justify the complexity of adding a new type. But I could be convinced.

5. Don't require brackets when parsing, don't emit brackets for offset time zones in toString(), and loudly warn developers in the documentation about DST problems that will result if you "immutate" an LDT that lacks an IANA zone. This option is my least favorite because the developers most likely to cause "accidental timezone" bugs are the same developers who won't carefully read the docs.

All 5 above have a common theme: to encourage developers in a lower-frequency (non-IANA) use case to do a little extra work to prove that they're intentionally ignoring DST instead of just accidentally ignoring it.
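
To make the opt-in idea concrete, here is a rough sketch of the parsing policy in option 1, written as plain JavaScript rather than against Temporal. `parseZoned` and the regular expression are hypothetical illustrations. Only the `requireIANAZone` option name comes from this thread, and nothing here is an actual or proposed Temporal API.

```javascript
// Hypothetical sketch of option 1's parsing rule; not a Temporal API.
// Matches e.g. "2020-06-25T15:20-04:00" with an optional "[...]" suffix.
const ISO_WITH_OFFSET =
  /^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(?::\d{2}(?:\.\d+)?)?)(Z|[+-]\d{2}:\d{2})(?:\[([^\]]+)\])?$/;

function parseZoned(isoString, { requireIANAZone = true } = {}) {
  const match = ISO_WITH_OFFSET.exec(isoString);
  if (!match) {
    throw new RangeError(`not an ISO date-time with an offset: ${isoString}`);
  }
  const [, dateTime, offset, bracket] = match;
  if (bracket === undefined && requireIANAZone) {
    // Default: a bare offset is not accepted as a time zone.
    throw new RangeError('bracketed time zone is required; pass requireIANAZone: false to opt out');
  }
  // After an explicit opt-out, the bare offset is used as an offset time zone.
  return { dateTime, offset, timeZone: bracket !== undefined ? bracket : offset };
}

const zoned = parseZoned('2020-06-25T15:20-04:00[America/New_York]');
const optedIn = parseZoned('2020-06-25T15:20-04:00', { requireIANAZone: false });

let rejected = false;
try {
  parseZoned('2020-06-25T15:20-04:00');
} catch (err) {
  rejected = err instanceof RangeError;
}
console.log(zoned.timeZone, optedIn.timeZone, rejected);
```

Under this sketch the lossy-storage string throws by default, while the ocean shipping case costs one extra option. Option 5 corresponds to always behaving as if requireIANAZone were false.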

pipobscure commented 4 years ago

I think the disagreement is on what a timezone is.

Let me try the following analogy:

Timezones are vehicles and we are discussing the building of roads.

Your claim is akin to “most roads are used by cars so we should be building roads for cars and if there is something that might inconvenience cars then that should not happen”

My claim is akin to “all vehicles should be enabled to use our roads: cars and bicycles and scooters.”

The topic of discussion is akin to “should we make this road only accessible by car, or should we also provide a bicycle lane and a foot-path?”

In this analogy your perspective is very “American” in that you say “most people want it to be easy to drive into this mall and allowing bicycles there will cause accidents” (“LocalDateTime is mostly used for DST and allowing non-IANA strings will cause accidents”)

I’m much more “European” in what I say: “Bicycles are valid vehicles and should be allowed to use our infrastructure equally. Accidents are caused by careless drivers not bicycles.” (“Offset-Only Timezones are equally valid timezones and they should have equal access to our infrastructure. Bugs are caused by careless developers not missing brackets.”)

I hope this helps in pinning down where the difference lies and doesn’t start more US/EU conflict 😀

pipobscure commented 4 years ago

And as you will have guessed by now I’m firmly in the camp of #5 to the extent that I think that the others (apart from maybe #4) are unacceptable. And even #4 has a strong whiff of “separate but equal” to it that leaves me with a bad taste.

On 28 Aug 2020, at 11:51, Justin Grant notifications@github.com wrote:

 @pipobscure I love your history writeup! Let's discuss the decimal point thing in real time later today. I keep thinking that we're actually agreeing about much of this but I'm not sure I understand yet where the disagreement is.

Maybe the core problem is that there are multiple valid meanings of "time zone"? If a time zone is simply an offset (like everyone thought before Olson, including the designers of all major relational databases) then the idea that time zones can change their offsets is crack-smoking crazy! Similarly, if one thinks that time zones exist mainly to simplify dealing with the scourge that is DST, then offset-only time zones seem inherently dangerous and should come with opt-in requirements and warning signs. I'm in the latter camp but that doesn't mean that the former camp is wrong. Temporal should support both. I do think we should try to help developers who get confused between those two types.

The other thing I'm stuck on is that the current LocalDateTime proposal makes a promise: if you have a LocalDateTime instance then you can always perform DST-safe arithmetic on it. This promise allows libraries and other arms-length code to accept LDT instances and DST-safely "immutate" them via arithmetic, with, etc. without worrying (or at least worrying less) about DST bugs.

Even if the time zone on the LDT instance is an offset time zone, in the current LDT proposal the only way to get that offset timezone into an LDT is to manually create it (e.g. Temporal.TimeZone.from('-07:00')) or to put the offset in brackets when parsing. Either of those is a clear opt-in signal that the developer is intentionally ignoring DST, which I think is compatible with the "promise" I mentioned above.

@ptomato, this is why I see a difference between an "offset" and an "offset time zone". I see the latter as an explicit developer opt-in that says "It's OK to use this offset to calculate the offsets of other related values", while the former implies that the developer's intent is not yet known.

Anyway, here's a few possible options we can discuss.

  1. Require brackets when parsing LDT, always emit brackets in toString, and add a from option (e.g. requireIANAZone: true (default) | false? ). If needed, we could also add an opt-in toString option that omits brackets. This is my preferred option, although I think (2) and (3) are OK too.

  2. Start with (1) but remove the requireIANAZone option. Instead, opt in to offset time zones is less ergonomic: Temporal.Absolute.from(isoString).toLocalDateTime(Temporal.TimeZone.from(isoString)).

  3. Parse bracketless strings as a "null time zone". LDT methods that only need data from LDT slots should work (e.g. all DateTime fields, all conversion methods, formatting, etc.), but any methods that require the time zone should throw. The .timeZone property would return null, but it'd be easy to transform to a non-null offset time zone by opting in, e.g. Temporal.LocalDateTime.from(isoString).with({timeZone: Temporal.TimeZone.from(isoString)}). @ptomato I know you're not a fan of this approach. (Although playing devil's advocate, aren't obvious exceptions better than silent DST bugs that only show up in production?)

  4. Add a new Temporal.OffsetDateTime type. There's a high bar to add new types, so I'm not sure that this type would add enough value over DateTime or Absolute to justify the complexity of adding a new type. But I could be convinced.

  5. Don't require brackets when parsing, don't emit brackets for offset time zones in toString(), and loudly warn developers in the documentation about DST problems that will result if you "immutate" an LDT that lacks an IANA zone. This option is my least favorite because the developers most likely to cause "accidental timezone" bugs are the same developers who won't carefully read the docs.

All 5 above have a common theme: to encourage developers in a lower-frequency (non-IANA) use case to do a little extra work to prove that they're intentionally ignoring DST instead of just accidentally ignoring it.


ptomato commented 4 years ago

Quick questions for @ptomato and @pipobscure: does this issue only apply to string parsing

Yes, for me it only applies to string parsing.

justingrant commented 4 years ago

There are four distinct concepts, each of which covers different use cases and will (at least in some cases) exhibit different behavior:

  A. An offset - Allows translating one single DateTime into an Absolute, or vice versa. Might (or might not) be sufficient to accurately derive other values e.g. via .plus({days: 1, hours: 12}).
  B. A time zone - rules for translating any DateTime into an Absolute, or vice versa. Always sufficient to accurately derive other values, e.g. via .plus({days: 1, hours: 12}).
  C. An offset time zone - a time zone based on a constant offset (e.g. -07:00, or UTC)
  D. A geographic time zone - a time zone based on the past, present, and future laws in a particular part of Earth, e.g. America/Phoenix
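To make the difference between a constant offset and a geographic time zone concrete, here is a minimal sketch in plain JavaScript. It does not use the Temporal API: `offsetZone`, `toyLosAngeles`, and `addWallClockDays` are hypothetical names, and the "geographic" zone is reduced to a single hard-coded transition (the 2020-03-08 spring-forward at 10:00Z), which is an assumption made only for illustration.

```javascript
// Toy model only: a "zone" is a function from epoch milliseconds to an
// offset in minutes. None of these names exist in Temporal.
const MS_PER_MINUTE = 60000;
const MS_PER_DAY = 24 * 60 * MS_PER_MINUTE;

// (C) An offset time zone: the same offset for every instant.
const offsetZone = (minutes) => () => minutes;

// (D) A geographic zone, simplified to one hard-coded DST transition.
const DST_START = Date.UTC(2020, 2, 8, 10); // 2020-03-08T10:00Z
const toyLosAngeles = (epochMs) => (epochMs < DST_START ? -480 : -420);

// Add whole days on the wall clock, then map back to an exact time using
// whatever offset the zone reports near the result.
function addWallClockDays(epochMs, days, zone) {
  const wallMs = epochMs + zone(epochMs) * MS_PER_MINUTE + days * MS_PER_DAY;
  // First guess uses the starting offset; refine with the offset at the guess.
  const guess = wallMs - zone(epochMs) * MS_PER_MINUTE;
  return wallMs - zone(guess) * MS_PER_MINUTE;
}

const start = Date.UTC(2020, 2, 7, 20); // 2020-03-07T12:00-08:00
const viaOffset = addWallClockDays(start, 1, offsetZone(-480));
const viaZone = addWallClockDays(start, 1, toyLosAngeles);

console.log(new Date(viaOffset).toISOString()); // 2020-03-08T20:00:00.000Z
console.log(new Date(viaZone).toISOString());   // 2020-03-08T19:00:00.000Z
```

Both calls start from the same instant and the same -08:00 offset, yet "one day later" lands an hour apart. With only the offset there is no way to tell which answer the caller wanted.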

@ptomato - from your comments above, I expect that you'd disagree that (A) is different from (C), but otherwise you'd agree that each of the others are distinct concepts with distinct behavior and use cases. Is this correct?

In my opinion:

  1. A = C (at least, for ISO strings)
  2. C ⊂ B
  3. D ⊂ B

I liked the road analogy so much, here's another one to try to explain why I think offsets and offset time zones are different.

In the US, major non-freeway roads often have both a route number and a street name. For example, a street near my house is both "Ashby Avenue" and "California Route 13". Confusingly, they don't always match. Drive a few miles uphill on California Route 13 and it turns into "Tunnel Road". Same route number, different street name. This is analogous to India or China where (for recent and near-future dates at least) the offset and the IANA zone always match.

image

But it gets worse. At least in the transition from Ashby Avenue to Tunnel Road, you keep driving straight so the transition is seamless. In other places, you'll often see cases where the route number doesn't go straight. For example, here's an intersection that my family drives through every summer on our way to a resort area:

image

If you want to stay on California Route 120 while driving East through Oakdale, you must turn left. If you keep going straight, you'll get lost. And after you make that left turn, you're now on a road called F Street which is also California Route 108 and California Route 120. These two route numbers split later down the road, and F Street stops being F Street at the city limits. This intersection is like a time zone with DST.

Now imagine you stop on the road to Oakdale at a roadside store to ask for directions to a location past Oakdale. The shopkeeper says "go South 20 miles on this road." But which road? The physical asphalt, regardless of its name or number? Route 120? Route 108? F Street? Unless you know what "this road" means, you may not find the destination.

In this analogy:

The key is that a time zone lets you give reliable directions (to add or subtract distance from your current position), while an offset only tells you where you are right now. It gives you no information about how to get anywhere else from here.

I think that this distinction is important.

sffc commented 4 years ago

Maybe instead of trying to convince one another to switch their point of view, we should consider everyone's mental models on their merits about which one most closely aligns with the goals of Temporal. If we want to prioritize bug-free code and nudging developers to do the right thing via API design, then we adopt @justingrant's preferred behavior. If we want to prioritize flexibility and an emphasis on developer education, then we adopt @pipobscure's preferred behavior.

justingrant commented 4 years ago

@pipobscure and I are going to chat later today, so let's see if the two of us can come to consensus. If not then @sffc's approach sounds good to me. I think you've accurately captured the essence of our respective POVs. I'm new here so don't have a good sense of which direction Temporal's goals point, but I'm content to follow the group either way.

One more gratuitous road analogy: do we want to require all bicyclists to wear a "helmet" like [] or {zone: 'fromOffset'} because many bike riders will crash if the road suddenly swerves 1m to the left or right? Or do we want to live with some crashes in return for an easier ride for experienced bicyclists who know what they're doing or who ride in places where the roads are always straight?

There are good arguments on both sides. IMHO it comes down to how easy it is to put on a helmet vs. the impact to the ecosystem of bike crashes. (I'm assuming that "cars" in this analogy won't crash thanks to the IANA RoadSmoother™ 😃)

justingrant commented 4 years ago

@pipobscure and I met this morning and hashed out our differences and came to consensus-- not a grudging consensus but full agreement! Thanks Philipp for taking the time to meet; it was worth it. Below is a concrete proposal representing what we agreed on. Feedback appreciated.

The core insight in our discussions was that a plain offset was the domain of Absolute, while an offset time zone is the domain of LocalDateTime. The way a developer expresses their intent to use an offset vs. an offset time zone is to choose the corresponding Temporal type for that intent. Once we agreed on that point, and once we realized that Java's parser accepted bracketed offsets, the rest fell into place.

1. LocalDateTime.prototype.toString will by default emit all time zone identifiers in brackets: IANA zones, offset zones, or custom time zone IDs.

2. LocalDateTime.prototype.toString will have an option to control whether brackets are emitted or omitted.

3. LocalDateTime.from will throw if provided a bracketless string, because a bracketless format is the domain of Absolute. Throwing makes it easier for new users to understand the correct Temporal types to use for particular use cases and string formats. If the caller does want to turn a standard ISO string into a LocalDateTime, the canonical code is this:

Absolute.from(isoString).toLocalDateTime(TimeZone.from(isoString))

4. In LocalDateTime.from, if the time zone in brackets is an offset then the offset in the ISO string must exactly match the offset in brackets. The current proposed default behavior of LocalDateTime.from will reject cases where the time zone in brackets doesn't match the offset in the ISO string, so this bullet point means that we'll treat offset time zones just like IANA time zones when parsing. Users can override this behavior via the offset option, as we discussed in the last meeting. If I choose {offset: 'use'} then the offset in the ISO string will be used. If I choose {offset: 'timeZone'} or {offset: 'prefer'} then the offset in brackets will be used. This also matches the current proposed behavior for IANA time zones.

5. Absolute.from will ignore brackets. This is already what we decided in the last meeting and @ptomato has a pending PR to make this change, but wanted to re-iterate it here to close the loop.

6. Absolute.prototype.toString will never emit brackets. If given a time zone parameter, it will emit an offset only (no brackets). If there's no time zone parameter, it will emit Z. @ptomato - does this match the behavior of the PR you prepared for this case?

7. Although it's out of scope for the Temporal proposal (and this GitHub issue!), we thought that eventually standardizing this extension to ISO 8601 and/or perhaps RFC 3339 would be a good idea. Nothing in the bullet points above should impede standardization. Java already supports the same string format, and if .NET does too, that may make it easier to standardize later for other consumers.
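Points 3 and 4 above can be sketched roughly as follows. This is plain JavaScript, not the Temporal implementation: `parseForLocalDateTime`, the `offsetUsed` field, and the `'reject'` default name are hypothetical, the regular expression only accepts numeric offsets (no `Z`), and IANA or custom IDs in brackets are passed through without the offset check that a real implementation would do against time zone data.

```javascript
// Hypothetical helper for illustration; not part of Temporal.
const EXTENDED_ISO =
  /^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(?::\d{2}(?:\.\d+)?)?)([+-]\d{2}:\d{2})(?:\[([^\]]+)\])?$/;
const OFFSET_ID = /^[+-]\d{2}:\d{2}$/;

function parseForLocalDateTime(isoString, { offset = 'reject' } = {}) {
  const match = EXTENDED_ISO.exec(isoString);
  if (match === null) throw new RangeError(`cannot parse: ${isoString}`);
  const [, dateTime, isoOffset, timeZone] = match;

  // Point 3: a bracketless string is the domain of Absolute, so throw.
  if (timeZone === undefined) {
    throw new RangeError('no bracketed time zone; parse as Absolute instead');
  }

  // Non-offset IDs are not validated in this sketch; matching offsets are fine.
  if (!OFFSET_ID.test(timeZone) || timeZone === isoOffset) {
    return { dateTime, timeZone, offsetUsed: isoOffset };
  }

  // Point 4: bracketed offset disagrees with the ISO offset.
  if (offset === 'use') return { dateTime, timeZone, offsetUsed: isoOffset };
  if (offset === 'prefer' || offset === 'timeZone') {
    return { dateTime, timeZone, offsetUsed: timeZone };
  }
  throw new RangeError(`offset ${isoOffset} does not match time zone [${timeZone}]`);
}

console.log(parseForLocalDateTime('2020-03-08T16:30+07:00[+07:00]'));
console.log(parseForLocalDateTime('2020-03-08T16:30+07:00[+08:00]', { offset: 'use' }));
```

The point of the sketch is that an offset in brackets goes through the same match-or-override path as an IANA name would, so the caller's opt-in is visible either in the string or in the options.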

ptomato commented 4 years ago

Congratulations! I'm glad we're able to remove this last obstacle. Here are some replies:

  • 1.1 This means that LDT instances with custom time zones are not round-trippable via LocalDateTime.from. The caller will have to deserialize via Absolute.from and the custom time zone constructor. Both offset and IANA time zones will be round-trippable by default.

We had already decided that deserializing custom time zones in Absolute.from (which will now no longer be possible) would not work by default, so that's OK. But we had also decided that you should be able to make your custom time zone globally available by monkeypatching Temporal.TimeZone.from, and if you do that, then I believe LocalDateTime.from should 'just work' for deserializing custom time zones as well.
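The monkeypatching idea can be illustrated with a stand-in object. Everything here is hypothetical: `ToyTimeZone` only imitates the shape of `Temporal.TimeZone.from`, it recognizes nothing but numeric offsets, and the custom ID `Custom/Example` is made up for the example.

```javascript
// Stand-in for Temporal.TimeZone; an assumption, not the real class.
const ToyTimeZone = {
  from(id) {
    if (/^[+-]\d{2}:\d{2}$/.test(id)) return { id, kind: 'offset' };
    throw new RangeError(`unknown time zone: ${id}`);
  },
};

// Wrap the original `from` so a custom ID resolves everywhere `from` is used.
const originalFrom = ToyTimeZone.from;
ToyTimeZone.from = function (id) {
  if (id === 'Custom/Example') return { id, kind: 'custom' };
  return originalFrom.call(this, id);
};

console.log(ToyTimeZone.from('+07:00').kind);         // offset
console.log(ToyTimeZone.from('Custom/Example').kind); // custom
```

If LocalDateTime.from resolves the bracketed ID through the same `from` lookup, a patched lookup like this would make custom zones round-trip without further changes.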

  • 2.3 Following Temporal's patterns elsewhere, we'll use an options property bag with a string-valued option (not a boolean).

I don't think disallowing booleans was a conscious decision, it's just that we never had any boolean options yet. Intl has boolean options as well. I would prefer a boolean option to a 'yes' | 'no' option in any case. (I can see why you might want to prefer string options for future extensibility, but 'yes' | 'no' impedes extensibility as well.)

5. Absolute.from will ignore brackets. This is already what I believe we decided in the last meeting but we wanted to make sure this was the current consensus. @ptomato - does this match the behavior of the PR you prepared for this case?

Yes.

6. Absolute.prototype.toString will never emit brackets. If given a time zone parameter, it will emit an offset only (no brackets). If there's no time zone parameter, it will emit Z. @ptomato - does this match the behavior of the PR you prepared for this case?

No, in #741 we decided to remove the parameter altogether, and the output will always contain Z. If you want to output a string with a time zone then you have to convert to LocalDateTime.

justingrant commented 4 years ago

  • 1.1 This means that LDT instances with custom time zones are not round-trippable via LocalDateTime.from. The caller will have to deserialize via Absolute.from and the custom time zone constructor. Both offset and IANA time zones will be round-trippable by default.

We had already decided that deserializing custom time zones in Absolute.from (which will now no longer be possible) would not work by default, so that's OK. But we had also decided that you should be able to make your custom time zone globally available by monkeypatching Temporal.TimeZone.from, and if you do that, then I believe LocalDateTime.from should 'just work' for deserializing custom time zones as well.

Sounds good. I'll edit 1.1 accordingly.

  • 2.3 Following Temporal's patterns elsewhere, we'll use an options property bag with a string-valued option (not a boolean).

I don't think disallowing booleans was a conscious decision, it's just that we never had any boolean options yet. Intl has boolean options as well. I would prefer a boolean option to a 'yes' | 'no' option in any case. (I can see why you might want to prefer string options for future extensibility, but 'yes' | 'no' impedes extensibility as well.)

Agreed. I updated 2.3 accordingly. Boolean seems much better than yes/no if we go with a bi-state option. Could we add bikeshedding on the name and shape of this option (see 2.5) to the agenda for this week's meetings?

6. Absolute.prototype.toString will never emit brackets. If given a time zone parameter, it will emit an offset only (no brackets). If there's no time zone parameter, it will emit Z. @ptomato - does this match the behavior of the PR you prepared for this case?

No, in #741 we decided to remove the parameter altogether, and the output will always contain Z. If you want to output a string with a time zone then you have to convert to LocalDateTime.

I think we should revisit this decision. @pipobscure's point (which I agree with) is that Absolute is the natural home for emitting both bracketless formats-- either Z or an offset. Let's discuss in meetings this week.

ptomato commented 4 years ago

My bikeshed preference: I have no strong preference but I like format, and I like the distinction of 'full' (everything needed to round trip) vs 'minimal' (no extensions to ISO format). Not sure about hideNonISOCalendar and hideTimeZone, they seem a bit unwieldy. Another synonym for 'minimal' could be 'strict'.

I don't think we need hideNonISOCalendar anyway, if you don't want the calendar in the output then you can either choose the minimal format or convert using withCalendar('iso8601').

justingrant commented 4 years ago

Given the discussion about arbitrary extensions to the string format in #293, should we make the option more generic to be able to apply to any extension? EDIT 2020-09-04: "more generic" is my current preference. For example:

ldt.toString({omit: ['timeZone']}); 
ldt.toString({omit: 'timeZone'}); // no array needed if only one value?
ldt.toString({omit: ['timeZone', 'u-ca', 'x-foo']}); // see #293 
ldt.toString({omit: ['offset']});  // if we want to allow this, per discussion in #869
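A rough sketch of how such an `omit` option could behave, assuming the string-or-array shape floated above. `formatExtended` and its `parts` argument are hypothetical and this is not the Temporal implementation; only the `'offset'` and `'timeZone'` keys are handled, and any other key is silently ignored.

```javascript
// Hypothetical formatter; `parts` holds pre-formatted string pieces.
function formatExtended({ dateTime, offset, timeZone }, { omit = [] } = {}) {
  // Accept either a single string or an array of strings.
  const omitted = new Set(typeof omit === 'string' ? [omit] : omit);
  let result = dateTime;
  if (!omitted.has('offset')) result += offset;
  if (!omitted.has('timeZone')) result += `[${timeZone}]`;
  return result;
}

const parts = { dateTime: '2020-03-08T16:30', offset: '+07:00', timeZone: '+07:00' };

console.log(formatExtended(parts));                       // 2020-03-08T16:30+07:00[+07:00]
console.log(formatExtended(parts, { omit: 'timeZone' })); // 2020-03-08T16:30+07:00
console.log(formatExtended(parts, { omit: ['offset'] })); // 2020-03-08T16:30[+07:00]
```

A list of names to omit leaves room for new extensions later, since adding another key does not require renaming an existing 'full' or 'minimal' mode.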

Another synonym for 'minimal' could be 'strict'.

I worry about this for the same reason as 'standard': our plan is to extend the standard, so in 10 years will it really be the "strict" choice?