tc39 / proposal-temporal

Provides standard objects and functions for working with dates and times.
https://tc39.es/proposal-temporal/docs/
Other
3.29k stars 149 forks source link

why isn't a datetime string like `2024-08-24T14:00:00-05` allowed to be parsed into a `ZonedDateTime`? #2930

Open BurntSushi opened 3 weeks ago

BurntSushi commented 3 weeks ago

In particular, one possible result of parsing 2024-08-24T14:00:00-05 would be as if it were written 2024-08-24T14:00:00-05[-05]?

Some users of Jiff are running into this restriction and I guess it seems somewhat limiting for the sake of being limiting in some respect. In particular, a ZonedDateTime isn't just limited to an instant in a time zone, but it can be an instant in a fixed offset as well. So it seems like it would be fine to accept just a 2024-08-24T14:00:00-05 as-is. Is this meant more as a gentle nudging toward using RFC 9557 everywhere? Or is there a footgun here that this restriction is preventing that I'm not seeing?

And if Jiff were to allow parsing of 2024-08-24T14:00:00-05 directly into its analog of ZonedDateTime, how bad of a deviation do you think that would be in your opinion?

nekevss commented 3 weeks ago

Huh, isn't 2024-08-24T14:00:00-05[-05] supported by the specification?

Temporal grammar lists UTCOffset as taking ASCIISign Hour as a valid, and the same goes for RFFC3339 and RFC9557 by extension. Although, I could definitely be missing something.

FWIW, I did test the below cases in ixdtf locally, and they seem to pass as well.

#[test]
fn test_hour_utc_offset() {
    let tz_test = "2024-08-24T14:00:00-05[-05]";
    let result = IxdtfParser::from_str(&tz_test).parse();
    assert!(result.is_ok());

    let tz_test = "2024-08-24T14:00:00-05";
    let result = IxdtfParser::from_str(&tz_test).parse();
    assert!(result.is_ok());
}

So at least from that perspective, there wouldn't be any deviation :smile:

BurntSushi commented 3 weeks ago

Huh, isn't 2024-08-24T14:00:00-05[-05] supported by the specification?

Just to be clear, do you mean 2024-08-24T14:00:00-05?

If you try to parse 2024-08-24T14:00:00-05 using Temporal.ZonedDateTime.from on the documentation page, it fails with an error indicating that a bracketed time zone is required. Is that version of Temporal out of sync with the spec?

BurntSushi commented 3 weeks ago

It's not the grammar I'm asking about here. It makes sense that the lower level parsing routine accepts it. It's about what ZonedDateTime can be parsed from specifically.

nekevss commented 3 weeks ago

Oh, sorry! :sweat_smile: The documentation isn't out of sync. ZonedDateTime in Temporal does require TimeZoneAnnotation according to the grammar.

But I'd 100% defer to the champions on this in general.

justingrant commented 2 weeks ago

is there a footgun here that this restriction is preventing that I'm not seeing?

Yep! There is definitely a footgun.

If you have a ZonedDateTime object, then callers expect that they can derive other ZDT objects from it, e.g. by adding a Duration, by using with, etc. In order for those derived objects to be accurate, the original ZDT needs to know how offsets will change for any Instant in the resulting derived object. In other words, the original ZDT needs to know its time zone in order for derived objects to be correct.

If Temporal allowed parsing RFC3339 strings like 2024-08-24T14:00:00-05:00 into a ZonedDateTime, then any code that received a ZDT object could no longer know whether the time zone in the ZDT was an actual time zone, or whether an ignorant or lazy developer just parsed a timezone-less timestamp string.

In other words, if a developer tries to parse a timezone-less string into a ZDT, it's almost certainly a programmer bug not an intentional bypassing of the ZDT guardrails. Temporal is pretty agressive across the API at making "likely bug" cases illegal. For example, you also can't parse a string like 2024-08-24T14:00:00Z into a PlainDateTime because per RFC9557 the Z means "I don't know the time zone" so it would be a mistake for developers to assume that the calendar date or wall-clock time of that string is meaningful.

Similarly, to prevent programmer bugs from timezone-less strings, we disallow parsing strings into a ZDT unless there's a time zone annotation. This ensures that developers won't accidentally parse a timezone-less string and then treat the resulting ZDT as if it had a real time zone. They can still fake it using an offset time zone, but they have to opt into extra work to do it, which makes is somewhat more likely that they actually are intending to break the rules and are aware of the consequences of doing so: that downstream consumers of that ZDT can't rely on its time zone to derive more ZDT objects.

Instead, users who have RFC3339 strings should be parsing those strings into Temporal.Instant objects, and then using instant.withTimeZone() to add the actual time zone of the timestamp to create a ZDT that can be used to derive other ZDT objects.

Out of curiosity, what are the use cases where people are wanting to parse these strings into a ZDT? Why isn't Instant good enough?

If the need is just to get the individual components (offset, year, month, day, hour, ...) from an RFC3339 string, then a developer can use this (intentionally less ergonomic than regular parsing in order to discourage accidental use) code like this:

function getOffsetOfRFC3339String(s) {
  return Temporal.Instant.from(s).toZonedDateTimeISO(s).offset;
}
getOffsetOfRFC3339String('2024-08-24T14:00:00-05:00')
// => '-05:00'

I'd strongly advise against passing ZDTs parsed in this way to any other code, because it's not really a time zone and will break downstream code.

justingrant commented 2 weeks ago

And if Jiff were to allow parsing of 2024-08-24T14:00:00-05 directly into its analog of ZonedDateTime, how bad of a deviation do you think that would be in your opinion?

Sorry, forgot to answer this question directly in my earlier comment: I think this would be bad! It would mean that Jiff's ZDT could not be trusted by downstream callers, because there'd be no way to know if the time zone of the ZDT is a real human time zone, or a programmer bug where they parsed a timezone-less string as if it had a real time zone.

BurntSushi commented 2 weeks ago

Thanks for the reply @justingrant! Much appreciated. I now understand the motivation on Temporal's side here much better.

If the need is just to get the individual components (offset, year, month, day, hour, ...) from an RFC3339 string, then a developer can use this (intentionally less ergonomic than regular parsing in order to discourage accidental use) code like this:

function getOffsetOfRFC3339String(s) {
  return Temporal.Instant.from(s).toZonedDateTimeISO(s).offset;
}
getOffsetOfRFC3339String('2024-08-24T14:00:00-05:00')
// => '-05:00'

Ah yeah this is interesting. The downside with this though is that it isn't just less ergonomic, but it's also more expensive since it appears to need to parse s twice. Jiff might have a different weighting of the trade-offs here than Temporal. With that said, you can achieve the equivalent with Jiff's strptime API.

I also didn't realize you could do that with the Temporal API. I'm not sure if the docs for Instant.toZonedDateTimeISO are out of sync or not, but I'm not sure that behavior is something I would have inferred from the current docs.

Out of curiosity, what are the use cases where people are wanting to parse these strings into a ZDT? Why isn't Instant good enough?

I am still trying to understand that myself. Here's one example. You'll notice that I phrase "I do not understand" in a few different ways. :-)

I filed this issue because I also wanted to make sure I understood things from the other side too here, even if the use cases end up not being compelling enough.

Temporal is pretty agressive across the API at making "likely bug" cases illegal.

Yeah this makes sense. It's a big part of what attracted me to Temporal. :-)

Sorry, forgot to answer this question directly in my earlier comment: I think this would be bad! It would mean that Jiff's ZDT could not be trusted by downstream callers, because there'd be no way to know if the time zone of the ZDT is a real human time zone, or a programmer bug where they parsed a timezone-less string as if it had a real time zone.

I think this is the crux of the matter. And in particular:

This ensures that developers won't accidentally parse a timezone-less string and then treat the resulting ZDT as if it had a real time zone.

The way I conceptualize this is that the [-05] in 2024-08-24T14:00:00-05[-05] is an extra assertion meant to convey, "yes, I know what I am doing, this is a timezone-less string but it's okay in this specific case."

My sense here is that it might be worth Jiff growing a opt-in option to parse a ZDT from a timezone-less RFC3339 instant (accepting a fixed offset, but not Zulu). I think this would match the spirit of "intentionally less ergonomic than regular parsing in order to discourage accidental use" but without requiring double-parsing. It would look something like this:

use jiff::fmt::temporal::DateTimeParser;

static PARSER: DateTimeParser = DateTimeParser::new().time_zone_annotation_requried(false);

let zdt = PARSER.parsed_zoned("2024-08-24T14:00:00-05").unwrap();
assert_eq!(zdt.offset(), jiff::tz::offset(-5));

This is a lot more verbose than the typical way to parse a ZDT in Jiff. That is, "2024-08-24T14:00:00-05".parse::<Zoned>() would still fail.

justingrant commented 2 weeks ago

I also didn't realize you could do that with the Temporal API. I'm not sure if the docs for Instant.toZonedDateTimeISO are out of sync or not, but I'm not sure that behavior is something I would have inferred from the current docs.

It's not specific to Instant.toZonedDateTimeISO](https://tc39.es/proposal-temporal/docs/instant.html#toZonedDateTimeISO). Anywhere time zones are accepted as a property or function argument, you can pass any of the following:

The way I conceptualize this is that the [-05] in 2024-08-24T14:00:00-05[-05] is an extra assertion meant to convey, "yes, I know what I am doing, this is a timezone-less string but it's okay in this specific case."

That's correct, and because the caller has gone through the extra effort to opt in, then ZDT can parse it.

I think this would match the spirit of "intentionally less ergonomic than regular parsing in order to discourage accidental use" but without requiring double-parsing.

I guess I would first ask: what use cases require a ZDT in this case as opposed to an Instant? What are callers trying to do with the returned ZDT that they can't do with an Instant?

If the issue is purely a parsing thing ("I want to know every individual field of an RFC 9557 string) then you may be better served by a parsing-only function instead of a function that returns a ZDT.

BurntSushi commented 2 weeks ago

I guess I would first ask: what use cases require a ZDT in this case as opposed to an Instant? What are callers trying to do with the returned ZDT that they can't do with an Instant?

Yes, I agree. The use cases need to be flushed out more.

If the issue is purely a parsing thing ("I want to know every individual field of an RFC 9557 string) then you may be better served by a parsing-only function instead of a function that returns a ZDT.

That's a possibility too. Jiff supports this via its BrokenDownTime via the strptime API. And one can craft strptime-format strings that resemble the Temporal ISO 8601 grammar, but not every corner case. So it might make sense to have a similar BrokenDownTime type that can be parsed from Jiff's implementation of Temporal's ISO 8601 grammar. I think my hang-up there is that I believe there are ambiguous cases between "parse something that contains a date" and "parse something that is only a time." But I'd have to go back and check that.

justingrant commented 2 weeks ago

Yes, I agree. The use cases need to be flushed out more.

Yep, that's the place to start. Let's better understand what users are trying to do, and why, and then we can figure out if ZDT is the right solution for those cases. Do you want to post a comment here when you learn more about the use cases behind these asks?

BurntSushi commented 2 weeks ago

Yes, will do.

francois08 commented 5 days ago

@justingrant I see that you are a contributor to RFC 9557. Question:

If I want to store a datetime in the future (e.g. for a future meeting), like "2035-11-08 14:00:00" in Belgium,

  1. why isn't it possible to write:
    2035-11-08T14:00:00[Europe/Brussels]

    Omitting the offset, specifying at this local time in Europe/Brussels, whatever the offset from UTC at that time (we don't know for sure; who knows, DST is probably abolished in EU by that time). Instead, to be compliant, it seems I would need to "guess" the offset (as DST rules might change):

    2035-11-08T14:00:00+01:00[Europe/Brussels]
  2. if we must write the offset, why can't we a specify a priority? E.g. in that same example:
    2035-11-08T14:00:00+01:00[Europe/Brussels]

    I know for sure the time zone identifier is correct, but not the offset. I want to mention: resolve the date time first by the provided time zone ID. We have the exclamation mark [!Europe/Brussels] but that only says in a vague way that the inconsistency must be resolved.

ptomato commented 5 days ago

@francois08 RFC 9557 is for exact time strings, and 2035-11-08T14:00:00[Europe/Brussels] isn't an exact time. You can write a parser that resolves the wall-clock time in this string to an exact time in whatever way you prefer (and we have this functionality in Temporal) but the string itself is just serialized data; it can't carry a meaning, it only has a meaning in the context of an application that parses it.

francois08 commented 5 days ago

@ptomato I assumed the point of this standard is to address the lack of time zone information in ISO 8601 and RFC 3339, where we lose daylight saving time details. But DST and exact times don't go well together as DST is uncertain/variable for future date times.

ptomato commented 5 days ago

@francois08 Nonetheless, 2035-11-08T14:00:00+01:00[Europe/Brussels] is still an exact time even if the time zone changes the offset: namely, 2035-11-08T13:00Z.