Kotlin / kotlinx-datetime

KotlinX multiplatform date/time library
Apache License 2.0
2.37k stars 97 forks source link

Add an OffsetDateTime type #90

Closed cromefire closed 6 months ago

cromefire commented 3 years ago

it would be quite good if there was a type (maybe let's call it OffsetDateTime to distinguish it from a possible ZonedDateTime) that can represent a LocalDateTime and TimeZone ZoneOffset together. This is especially useful for parsing RFC3339 compatible strings in common code (e.g. https://github.com/OpenAPITools/openapi-generator/pull/7353#discussion_r551024354).

It can easily be accomplished without any platform-specific code, I've already drafted a small example with should be enough for most use-cases (but uses some custom parsing logic, which will only work for RFC3339 compatible strings)

Example ```kotlin package com.example import kotlinx.datetime.* import kotlin.jvm.JvmStatic /** * Represents a [LocalDateTime] and the respective [ZoneOffset] of it. */ public class OffsetDateTime private constructor(public val dateTime: LocalDateTime, public val offset: ZoneOffset) { override fun toString(): String { return if (offset.totalSeconds == 0) { "${dateTime}Z" } else { "$dateTime$offset" } } /** * Converts the [OffsetDateTime] to an [Instant]. This looses the [ZoneOffset] information, because the date and time * is converted to UTC in the process. */ public fun toInstant(): Instant = dateTime.toInstant(offset) /** * Returns a new [OffsetDateTime] with the given [TimeZone]. */ public fun atZoneSameInstant(newTimeZone: TimeZone): OffsetDateTime { val instant = dateTime.toInstant(offset) val newDateTime = instant.toLocalDateTime(newTimeZone) return OffsetDateTime(newDateTime, newTimeZone.offsetAt(instant)) } public companion object { private val zoneRegex by lazy { Regex("""[+\-][0-9]{2}:[0-9]{2}""") } /** * Parses an [OffsetDateTime] from a RFC3339 compatible string. */ @JvmStatic public fun parse(string: String): OffsetDateTime = when { string.contains('Z') -> OffsetDateTime( LocalDateTime.parse(string.substringBefore('Z')), TimeZone.UTC.offsetAt(Instant.fromEpochMilliseconds(0)), ) string.contains('z') -> OffsetDateTime( LocalDateTime.parse(string.substringBefore('z')), TimeZone.UTC.offsetAt(Instant.fromEpochMilliseconds(0)), ) zoneRegex.matches(string) -> { val dateTime = LocalDateTime.parse(string.substring(0, string.length - 6)) val tz = TimeZone.of(string.substring(string.length - 6)) val instant = dateTime.toInstant(tz) val offset = tz.offsetAt(instant) OffsetDateTime( dateTime, offset, ) } else -> throw IllegalArgumentException("Date \"$string\" is not RFC3339 compatible") } /** * Creates an [OffsetDateTime] from an [Instant] in a given [TimeZone] ([TimeZone.UTC] by default). */ @JvmStatic public fun ofInstant(instant: Instant, offset: TimeZone = TimeZone.UTC): OffsetDateTime = OffsetDateTime( instant.toLocalDateTime(offset), offset.offsetAt(instant), ) /** * */ @JvmStatic public fun of(dateTime: LocalDateTime, offset: ZoneOffset): OffsetDateTime = OffsetDateTime(dateTime, offset) } } ```
cromefire commented 3 years ago

It would probably be useful to use an ZoneOffset for this, but it cannot be (directly) parsed currently

cromefire commented 3 years ago

conversion from and to java.time is also easy (using the example):

import kotlinx.datetime.toJavaLocalDateTime
import kotlinx.datetime.toJavaZoneOffset
import kotlinx.datetime.toKotlinLocalDateTime
import kotlinx.datetime.toKotlinZoneOffset

public fun java.time.OffsetDateTime.toKotlinOffsetDateTime(): OffsetDateTime {
    return OffsetDateTime.of(
        toLocalDateTime().toKotlinLocalDateTime(),
        toOffsetTime().offset.toKotlinZoneOffset()
    )
}

public fun OffsetDateTime.toJavaOffsetDateTime(): java.time.OffsetDateTime {
    return java.time.OffsetDateTime.of(dateTime.toJavaLocalDateTime(), offset.toJavaZoneOffset())
}
ilya-g commented 3 years ago

So did I get it right that you need such a type to represent deserialized values of the built-in date-time OpenAPI format? Why is it important to preserve the offset information in this case? What can happen if the Instant type is used instead?

cromefire commented 3 years ago

So not only specifically for OpenAPI, but in general to represent RFC3339 Datetimes (which is meant to be the Internet standard). You can of course use an Instant here, but that means loosing the offset information. I don't have a specific use-case at hand right now, but I'm sure there was a reason RFC3339 was designed this way and I can think of some cases where the offset could be useful (for example if the server already converts it to the user local time as configured in the server or this information could also represent the local time of the server). I don't know if it's smart to use this, but as in this specific case it's often about existing APIs, this could render APIs unusable. You might find some more use-cases in the sources of the RFC. Also as far as I'm aware Instant also can't parse RFC3339.

justasm commented 3 years ago

One specific use-case - our API deals with public transport departure times (bus, metro, etc) in various cities. Public transport departure times (like flights) are always displayed in the local city time. The API returns times conforming to RFC3339 such as 2020-07-31T09:16:15+02:00. If these are parsed and represented as an Instant, we need an additional redundant data point - the zone offset - to convert to and display LocalDateTimes to the user.

This does not imply that it is necessary to have a kotlinx.datetime.OffsetDateTime, but at the very least it would be useful to have a parsing function that returns a LocalDateTime and ZoneOffset from an RFC3339 time.

straight-shoota commented 3 years ago

FWIW: Crystal's equivalent of Instant always carries a location property to designate the time zone instance. Internal represenatation is normalized to UTC, so for example comparison between instances with different locations doesn't require conversion. The time zone information is only used for interaction with a human-readable datetime representation (parsing, stringfigication, calendrical arithmetics). This has proven to be very practical and enables use cases like the one described here without adding too much complexity, and actually simplifying the API because you don't need to pass a TimeZone instance to every method which operates time-zone aware.

harry248 commented 3 years ago

Came here searching for a multiplatform replacement for java.time.ZonedDateTime. It's really surprising that kotlinx-datetime isn't equally able to parse an ISO-8601 and/or a RFC-3339 date string that includes a non-zulu time-zone. Or am I getting it wrong?

hrach commented 3 years ago

Fixed in recent release: https://github.com/Kotlin/kotlinx-datetime/releases/tag/v0.2.0 Sorry, I mistaken this for another issue.

harry248 commented 3 years ago

Fixed in recent release: https://github.com/Kotlin/kotlinx-datetime/releases/tag/v0.2.0

But parsing it to an instant the time zone is getting lost isn't it?

dkhalanskyjb commented 3 years ago

The parsed time zone is not returned to the user, yes.

cromefire commented 3 years ago

Which does result in the loss of information

harry248 commented 3 years ago

Then unfortunately this is not a solution / fix for us, as we have to keep the time zone.

dkhalanskyjb commented 3 years ago

We do want to support returning the parsed offsets, but the exact implementation is unclear: maybe it's sufficient to just return a Pair, maybe something heavier, like ZonedDateTime, needs to be introduced, or something in-between would be the best choice.

To decide this, we need to consider the specific use cases. @harry248, could you please share yours? @justasm already mentioned that an API they're dealing with returns the instants with a time zone that also indicates how the instant should be displayed to the end-user—this is a good point, thanks!

harry248 commented 3 years ago

@dkhalanskyjb It's exactly the same scenario in our case.

i-walker commented 3 years ago

My use-case for OffsetDateTime is to parse data from a database. So serialization and deserialization are sufficient. But Ideally we can override them automatically thorugh kotlinx

fluidsonic commented 3 years ago

I've had one (rare & weak) use case for OffsetDateTime so far, just recently: I had to parse an ISO 8601 timestamp from a 3rd party API. While I could parse it as Instant I also needed to know the offset in that case because that API doesn't provide a proper time zone. However I've asked them to include it as that would make more sense and be more reliable than using offsets.

That use case is more or less https://github.com/Kotlin/kotlinx-datetime/issues/90#issuecomment-758736441 where also the API isn't providing an ideal data format. It should probably provide a local datetime + time zone instead of datetime merged with an offset.

In general over many different projects there hasn't been any instance where I've needed ZonedDateTime. In most cases where a TimeZone is needed it's stored and passed around separately anyway. For example for each insurance policy we store a TimeZone that's used for all LocalDate(Time) instances affecting that insurance. No need for ZonedDateTime here. TimeZone is typically part of a context (e.g. the insurance policy, a calendar entry, a geographical location) and thus not linked directly to individual Date(Time) instances. That was true for all projects so far.

An offset I've never needed except for the very rare use case mentioned initially.

wkornewald commented 2 years ago

Our use-case is to be able to losslessly serialize/deserialize and work with arbitrary future datetimes with zone and DST information like java.time.ZonedDateTime already does:

  1. offset only (e.g. +02:00)
  2. zone id only (e.g. Europe/Berlin)
  3. zone id with a flag for DST vs. standard (e.g. [Europe/Berlin][DST], [Europe/Berlin][ST] or maybe +02:00[Europe/Berlin])

Only (3) can correctly represent a future event/datetime if the zone's offset and DST handling gets changed unexpectedly. Using separate LocalDateTime and TimeZone and dst flags is unnecessarily complicated and error-prone.

Finally, with such a flexible ZonedDateTime maybe there is no need for a separate OffsetDateTime (I never had a need for that at least).

dkhalanskyjb commented 2 years ago

Hi @wkornewald!

Only (3) can correctly represent a future event/datetime if the zone's offset and DST handling gets changed unexpectedly.

Could you elaborate on this? I don't see why this is true. Consider two cases here:

Am I missing some case here where LocalDateTime + TimeZone + ZoneOffset are all important?

In general, IIRC, this is the first time someone asks that we handle the DST in a specific manner.

wkornewald commented 2 years ago

Note that point (3) is just an edge case to be able to represent a precise umanbiguous clock readings time. My main suggestion is to have a ZonedDateTime that at least fulfills (1) and (2), but additionally being able to losslessly represent serialized string-based dates produced by java.time would be nice.

Regarding (3): If you want to trigger an event at an exact point in clock readings time then you have two different interpretations. How do you distinguish the point in time before the DST change vs. the time after the change? 02:30 could mean two points in time.

dkhalanskyjb commented 2 years ago

How do you distinguish the point in time before the DST change vs. the time after the change? 02:30 could mean two points in time.

Tricky.

wkornewald commented 2 years ago

Imagine a „simple“ calendar app that wants to allow users to specify the DST flag when an event occurs at some ambiguous time. The exact day or time when the switch to DST occurs might change in the future and an Instant can’t deal with that, so the only option would be ZonedDateTime with a nullable DST Boolean (part of TimeZone?).

wkornewald commented 2 years ago

But you're right, this is pretty obscure and indeed maybe not worth it - though then we can't be fully compatible with everything representable by java.time then. Anyway, my main request was having a ZonedDateTime where the zone can be either an id or an offset and I just continued to think of obscure edge cases and then the DST question came up, so I compared how java.time handles that. We're discussing the DST point too much and let's better focus on the core ZonedDateTime aspect. 😄

dkhalanskyjb commented 2 years ago

Imagine a „simple“ calendar app that wants to allow users to specify the DST flag

Can't really imagine that. I'd expect the calendar app to provide a time picker, which, if the author of the calendar so chose, would have some times repeated.

we can't be fully compatible with everything representable by java.time then

Oh, I think it's not a viable goal at all, given how vast the Java Time API is. We want this library to be simple and discoverable, not all-encompassing.

let's better focus on the core ZonedDateTime aspect

Ok, sure. In your case, since, it looks like, you can choose the format in which to send data, right? Then, I think a good choice depending on the circumstance is either an Instant for representing specific moments (+ an optional TimeZone if the way the datetimes are displayed should also be controlled) or LocalDateTime + TimeZone for representing "the moment when the wall clock in the given time zone displays this". What do you think?

cromefire commented 2 years ago

I compared how java.time handles that. We're discussing the DST point too much and let's better focus on the core ZonedDateTime aspect.

I think aligning with java's behavior makes a lot of sense instead of rolling your own and dealing with a lot of inconsistent behavior between different services.

LocalDateTime + TimeZone for representing "the moment when the wall clock in the given time zone displays this".

This seems to basically be how Java implements it and is also how I work around it right now.

wkornewald commented 2 years ago

let's better focus on the core ZonedDateTime aspect

Ok, sure. In your case, since, it looks like, you can choose the format in which to send data, right? Then, I think a good choice depending on the circumstance is either an Instant for representing specific moments (+ an optional TimeZone if the way the datetimes are displayed should also be controlled) or LocalDateTime + TimeZone for representing "the moment when the wall clock in the given time zone displays this". What do you think?

I can't fully choose the format because we have an existing system and already pre-existing official specifications that we have to support.

Actively juggling with LocalDateTime and TimeZone as separate values is definitely not enough.

We had to build our own ZonedDateTime to deal with this and I'm pretty sure everyone else is doing the same thing because this is a very common use-case. Other libraries already provide it out of the box and I think this library should do it, too.

wkornewald commented 2 years ago

If you want to keep the API as small as possible then here’s some provocative food for thought: If you had ZonedDateTime instead of Instant maybe people wouldn’t be missing Instant because ZonedDateTime in UTC handles that use-case already.

The question is where you want to make distinctions explicit via types and maybe where you want to provide common base types to allow abstracting the distinction away.

cromefire commented 2 years ago

If you want to keep the API as small as possible then here’s some provocative food for thought: If you had ZonedDateTime instead of Instant maybe people wouldn’t be missing Instant because ZonedDateTime in UTC handles that use-case already.

The question is where you want to make distinctions explicit via types and maybe where you want to provide common base types to allow abstracting the distinction away.

The use-case for instant is different: It should be use to record past events that may not change (e.g. the Unix timestamp cannot change) while ZonedDateTimes can (e.g. if laws change), which is why they should be used to record things like future appointments, so maybe not impossible to use a common type, you have to be extra careful to not break those functionalities.

wkornewald commented 2 years ago

You mean that laws can change the meaning of the past for a ZonedDateTime (in UTC)?

cromefire commented 2 years ago

You mean that laws can change the meaning of the past for a ZonedDateTime (in UTC)?

Yes. For UTC this might not be a problem, but for other timezones it might (see the efforts on DST). Just remember to think about all the use cases before trying to implement something.

wkornewald commented 2 years ago

That’s the point I was trying to provoke. ZonedDateTime in UTC can handle the Instant use-case.

Personally, I’d only still want to have the distinction of both types so I can enforce the UTC timezone via types where that is critical.

cromefire commented 2 years ago

That’s the point I was trying to provoke. ZonedDateTime in UTC can handle the Instant use-case.

Personally, I’d only still want to have the distinction of both types so I can enforce the UTC timezone via types where that is critical.

Well I that case you can just use a LocalDateTime, because the Timezone is always UTC and they just define an interface that both implement.

cromefire commented 2 years ago

I've had a look at java and they both implement Temporal. It implements some changing of parts, addition, subtraction and time difference. Not having a compare function (and maybe even no common interface at all) might make sense so people have to explicitly have to convert to Instant so the dev has to understand that they are comparing two types for different purposes and basically has to opt-in to this conversion rather than have it happen implicitly. That way the type could be pretty light weight any basically only be a wrapper for LocalDateTime and TimeZone, with comparison, equals (maybe 2 functions because you'd probably want the default equals to check if the time zones and the date time is the same) and a difference calculation or so.

wkornewald commented 2 years ago

I've had a look at java and they both implement Temporal. It implements some changing of parts, addition, subtraction and time difference. Not having a compare function (and maybe even no common interface at all) might make sense so people have to explicitly have to convert to Instant so the dev has to understand that they are comparing two types for different purposes and basically has to opt-in to this conversion rather than have it happen implicitly. That way the type could be pretty light weight any basically only be a wrapper for LocalDateTime and TimeZone, with comparison, equals (maybe 2 functions because you'd probably want the default equals to check if the time zones and the date time is the same) and a difference calculation or so.

Our implementation is actually a rather lightweight wrapper (a few hundred lines) with support for parsing/formatting, handling offset vs. zone id, comparison, etc. And I also agree that for some operations it could be better to make the type difference explicit. That would need some more thought...

We have even more complicated use-cases because we work with medical data that can contain arbitrarily precise datetimes (year, year-month, year-month-day, full datetime without zone or with zone) and we had to abstract even over that, so you can e.g. sort by date easily or format to a human-friendly string etc. Anyway, that's a different topic.

The use-case for instant is different: It should be use to record past events that may not change (e.g. the Unix timestamp cannot change) while ZonedDateTimes can (e.g. if laws change), which is why they should be used to record things like future appointments, so maybe not impossible to use a common type, you have to be extra careful to not break those functionalities.

I've thought about this some more and would like to challenge several assumptions. Do you have any examples which show that it's not an extremely obscure edge case to have timezone details of the past get changed?

We can also look at it this way: If you create an event for a future ZonedDateTime and then two years later the event moves to the past, do you suddenly have to convert it to an Instant, so it still refers to the same actual point in time? I've never heard of something strange like that. People just keep the ZonedDateTime and happily represent the past without any problems.

Finally, there's always this implicit assumption in the whole discussion and the README of this project:

But the reality is more like

So the real question for deciding between ZonedDateTime and Instant is: Do you know the original timezone and do your storage schema and communication APIs support timezones? If the answer is no then you pick Instant. For everything else you pick ZonedDateTime.

In most user-facing apps I see practically no use for Instant. In fact, in some previous projects where people started with Instant it happened that we later found out that this was a mistake and ZonedDateTime would have allowed an important analysis (e.g. when the user wakes up, when he takes some medicine, etc.). DST changes happen and users travel between countries. In many cases it's useful to not just look at the data in UTC, but actually know the local time and the timezone of all events even in the past. Of course there are also cases for Instant like storing backend metrics. But I think this library is making a mistake by putting so much emphasis on Instant. ZonedDateTime should be your default choice for everything - past and future. Every use of Instant needs deliberate thought because you can't go back once you've lost the timezone information.

cromefire commented 2 years ago

Yes, I think you are right. What I was thinking of for Instants is something like a "last modified" in a database or something similar, where it's more important to have a (easily comparable and transferrable) timestamp, where you care about the distance and order (which is way easier without timezones) and not where it was recorded. Examples would be caches, logs if you log in UTC, distributed systems, etc.. when face the user you'd probably just convert those to Local datetime using the user's local time zone.

Edit: A thing where past time zones might change would be if you had a library for example that wasn't updated (in time). While this is probably rather rare and a bug, Instants (often serialized as unix time) just provide that security against that, for use cases, where it's required.

wkornewald commented 2 years ago

Right, so maybe the rule of thumb is

The opinions in this discussion probably just reflect what kind of data each person works with the most.

cromefire commented 2 years ago

Basically in the end we'd need both anyway and they probably shouldn't be mixed without explicit conversion (so no common API on the public API). Both should retain their properties and I guess for most users it's not important how it's implemented under the hood as long as it's fast and it works.

dkhalanskyjb commented 2 years ago

First of all, thank you both for the discussion! It will really be of huge help when/if we arrive at some conclusion.

LocalDateTime + TimeZone for representing "the moment when the wall clock in the given time zone displays this".

This seems to basically be how Java implements it and is also how I work around it right now.

Not really. ZonedDateTime is equivalent to Instant + TimeZone. LocalDateTime + TimeZone is not enough: if some LocalDateTime happens twice in a given time zone, LocalDateTime + TimeZone won't be enough to distinguish between them.

Of course, LocalDateTime + FixedOffsetTimeZone, for example, is unambiguous.

Actively juggling with LocalDateTime and TimeZone as separate values is definitely not enough.

How do you compare dates?

If you need to compare moments in time, compare Instant values. If you just want to see which clock reading goes first, compare LocalDateTime. The former is more useful.

Comparing ZonedDateTime is like comparing Instant values, but much more complex with no clear benefit. First, ZonedDateTimes are compared as Instant values. If they are equal, then they are compared as LocalDateTime values. If those are equal, the identifiers of the time zones are lexicographically compared.

I think what they were going for is to establish some order on the ZonedDateTime values, but is it really any more practical than just comparing Instant values? Why?

How do you parse and format dates?

That's a real issue. We're working on the parsing/formatting API. However, I don't see how ZonedDateTime vs separately using TimeZone relates to this.

How do you serialize/deserialize? This is almost always a single field in the serialized data (e.g. JSON).

I assume you mean a single primitive field. Could you explain where this requirement comes from? I don't see why "2021-12-14T09:48:54.486500+01:00[Europe/Berlin]" is better than {"moment":"2021-12-14T08:49:40.958189Z","zone":"Europe/Berlin"}. The string already has these components, they are just made explicit in the JSON structure.

If anything, ZonedDateTime is at a disadvantage here. Unless every participant is careful, the timezone databases between them may go out of sync, that is, they may have different opinions about the time zone rules. What should happen with "2021-12-14T09:48:54.486500+01:00[Europe/Berlin]" on deserialization if your system for some reason thinks that, on 2021-12-14, the offset in Berlin is +03:00? Who knows what will happen? On the other hand, Instant + TimeZone is completely transparent, and it's obvious what will happen on desync: if the TimeZone is used to create a LocalDateTime, that LocalDateTime will be different across the machines.

In fact, ZonedDateTime typically deals with invalid inputs the same way, by adjusting the LocalDateTime part.

How do you make sure no mistakes happen because people forget to take the TimeZone into account in the whole codebase with tons of developers of varying experience?

We don't provide LocalDateTime arithmetic, so there's not much you can do without it aside from parsing and printing. If one wants to, say, advance a datetime by two days, it is done with Instant. I don't think there's anything in our library that doesn't require a time zone when it needs to be taken into account.

If you want to keep the API as small as possible [...]

This is a non-goal. If this was a goal, then, on the extreme, we would have a mishmash omnipotent datetime class which, depending on the context, would represent a period, or a date, or a moment in time, as NSDateComponents does. Instead, we want something simple and discoverable; omnipotent objects are not known to be that.

[...] then here’s some provocative food for thought: If you had ZonedDateTime instead of Instant maybe people wouldn’t be missing Instant because ZonedDateTime in UTC handles that use-case already.

Given how several datetime libraries already did this, we, of course, thought about this and decided against it. Instant is simple, predictable, and portable; ZonedDateTime is neither of those, and it didn't look like it brought anything compared to a tandem of Instant + TimeZone aside from parsing/formatting.

The use-case for instant is different: It should be use to record past events that may not change (e.g. the Unix timestamp cannot change) while ZonedDateTimes can (e.g. if laws change)

Not the case. Instant is used for all events when their specific moment is known. For example, if I set the microwave to two minutes from now, the moment when it beeps will be represented as an Instant. If the paper submission deadline is "2021-11-19, 23:59:59 anywhere-on-earth", it's an Instant, because it is a specific moment, the meaning of which is obvious to everyone, regardless of their time zone.

If, however, a friend says "let's meet tomorrow at 2 p.m.", this is a LocalDateTime: regardless of what time zone it is and whether there's a time gap or a time overlap that morning, we'll still going to meet when the clock in our town shows 2 p.m. Unless that moment falls into a time gap, of course, but then we'll have to reschedule the meeting in any case, can't be helped.

If, however, a friend says "call me tomorrow at the time when it's 2 p.m. in New York", it's an Instant again, as "the next time it's 2 p.m. in New York" is a specific moment.

Well in that case you can just use a LocalDateTime because the Timezone is always UTC and they just define an interface that both implement.

Not really. In our library, LocalDateTime is not specified to be in UTC; it is in an unspecified time zone. If it were UTC, we would be able to implement arithmetic on it, as it would be well-defined. However, since people rarely use LocalDateTime to represent the clock readings in UTC, and instead it's almost always used to represent the clock readings in some unspecified time zone, providing arithmetic operations would lead to bugs when the implicit time zone does differ from UTC.

If you create an event for a future ZonedDateTime and then two years later the event moves to the past, do you suddenly have to convert it to an Instant, so it still refers to the same actual point in time? I've never heard of something strange like that.

Again, this depends on how the moment is defined. If you need something to happen on "2023-12-14, 10:48 MSK", then you need something to happen on "2023-12-14, 10:48" + "MSK", so LocalDateTime + TimeZone. If you need something to happen an exact amount of seconds into the future (which is unlikely in the stated case of two years into the future), it's an Instant.

Finally, there's always this implicit assumption in the whole discussion and the README of this project:

  • Instant: past only
  • LocalDateTime+TimeZone: future (only?)

Oh. Could you list the parts of the README that led to this assumption (preferably in a separate issue) so we could fix this? The intention is for Instant to represent moments in time, and for LocalDateTime to represent clock readings, regardless of whether it's the past or the future.

Of course, if the meeting was at 2 p.m. yesterday, we already know which time zone rules were in effect and can represent this as an Instant, but if the meeting is scheduled to happen at 2 p.m. exactly two years from now, we can't be sure that the time zone rules won't change, and so can't calculate the number of seconds until that event beforehand. So, there is a class of things that can be represented as an Instant in the past but not the future, but I'd argue this is not the defining properties of Instant and LocalDateTime.

So the real question for deciding between ZonedDateTime and Instant is: Do you know the original timezone and do your storage schema and communication APIs support timezones? If the answer is no then you pick Instant. For everything else you pick ZonedDateTime.

Since Instant + TimeZone is equivalent to ZonedDateTime, I don't see why this question would even arise.

In fact, in some previous projects where people started with Instant it happened that we later found out that this was a mistake and ZonedDateTime would have allowed an important analysis

This is very interesting. Do you know of some functionality of ZonedDateTime that can't be easily implemented with Instant + TimeZone? Please do share an example! This would make the argument for introducing ZonedDateTime much more strong.

DST changes happen and users travel between countries. In many cases, it's useful to not just look at the data in UTC, but actually know the local time and the timezone of all events even in the past.

Why do you need to know the timezone of all events in the past? Isn't it enough to know precisely when they happened? If you also need to know where someone was at that moment, this is a separate piece of data and has nothing to do with the time of the event.

Also, since users travel between countries, Instant becomes that much more useful. "The headache started ten hours ago; also, since the user has to take a pill every 48 hours, the next time they should do it is in 8 hours". These are statements that are nicely represented as Instant values, and to show them to the user, you can use the time zone of the place where the user currently resides. Storing a time zone alongside these would be error-prone: it would be easy to accidentally use the stored time zone and not the current one to display the data.

ZonedDateTime should be your default choice for everything - past and future. Every use of Instant needs deliberate thought because you can't go back once you've lost the timezone information.

Could you elaborate on when the time zone is an inherent trait of some moment in time and not just side information?

where you care about the distance and order (which is way easier without timezones) and not where it was recorded

When do you care in which time zone something was recorded? Time zones don't map neatly to geography, after all, so… what meaningful things can one do by knowing the time zone?

when face the user you'd probably just convert those to Local datetime using the user's local time zone.

Exactly! When would you not do that?

Right, so maybe the rule of thumb is

  • ZonedDateTime: user data
  • Instant: machine data (logs, metrics, cache expiry, etc)

Could you elaborate on this as well? Do users want to see "this happened at 2021-11-19, 23:59:59 in the MSK time zone"?

Both should retain their properties

The issue is, we don't have both yet, we only have Instant, and we're trying to decide what good, if any, could come out of also introducing a ZonedDateTime.

wkornewald commented 2 years ago

How do you serialize/deserialize? This is almost always a single field in the serialized data (e.g. JSON).

I assume you mean a single primitive field. Could you explain where this requirement comes from? I don't see why "2021-12-14T09:48:54.486500+01:00[Europe/Berlin]" is better than {"moment":"2021-12-14T08:49:40.958189Z","zone":"Europe/Berlin"}. The string already has these components, they are just made explicit in the JSON structure.

We have existing APIs, third-party APIs and official medical formats that we have to support. We can't tell third-parties and standards bodies to change their formats and we won't refactor a multi-million LOC project (which is using ZoneDateTime for all user data).

Also, 99% of the APIs in the world use a single field instead of splitting it up because it is in fact more convenient to work with a single object. The convenience of a primary use-case is the main point.

Also, you can do date arithmetic more easily on a combined object and e.g. add one month and get the same local time - adjusted for DST changes.

Most programming languages, frameworks, ORMs and even OpenAPI/Swagger work with a single value for ZonedDateTime. We won't change the official OpenAPI specification, the official Email format, the official HTTP header formats and instruct the whole world to switch to a different structure. Practically every standard uses ZonedDateTime, so it should be easy to work with that kind of data.

Splitting everything up into two values definitely is not the easiest solution imaginable.

DST changes happen and users travel between countries. In many cases, it's useful to not just look at the data in UTC, but actually know the local time and the timezone of all events even in the past.

Why do you need to know the timezone of all events in the past? Isn't it enough to know precisely when they happened? If you also need to know where someone was at that moment, this is a separate piece of data and has nothing to do with the time of the event.

Also, since users travel between countries, Instant becomes that much more useful. "The headache started ten hours ago; also, since the user has to take a pill every 48 hours, the next time they should do it is in 8 hours". These are statements that are nicely represented as Instant values, and to show them to the user, you can use the time zone of the place where the user currently resides. Storing a time zone alongside these would be error-prone: it would be easy to accidentally use the stored time zone and not the current one to display the data.

Depending on where you travel the "in 8 hours" could mean "while you're asleep". That definitely won't happen for people in the real world. If someone has to do something every time he wakes up or before going to sleep then these actions shift with the time zone. However, for analysis it's still important to additionally know the absolute time differences, so LocalDateTime alone is not enough and neither is Instant alone. We really need ZonedDateTime to get the full picture.

ZonedDateTime should be your default choice for everything - past and future. Every use of Instant needs deliberate thought because you can't go back once you've lost the timezone information.

Could you elaborate on when the time zone is an inherent trait of some moment in time and not just side information?

It's really good that the people who define specifications don't just say "use UTC for everything", but instead they usually add time zone information.

Good API design optimizes for the most common use-cases and makes them fool-proof and trivially easy. Apparently ZonedDateTime is a very common use-case (almost every official standard uses it), so the point of having a ZonedDateTime is to make this as easy as possible.

I don't know how else to explain it. Your suggestion sounds to me like having separate date + time (instead of Instant) because it's "just as easy". No it's not.

dkhalanskyjb commented 2 years ago

existing APIs

No argument here. It's obvious that we'll need to provide parsing/formatting that recognizes time zones. The question is, do we have to implement a whole ZonedDateTime? If ZonedDateTime is needed just for compatibility, we could instead parse/format a Pair<Instant, TimeZone>, for example. If there are some other use cases, maybe adding ZonedDateTime is the right answer. Or maybe there's some other data structure that's even better at representing the use cases.

Also, 99% of the APIs in the world use a single field instead of splitting it up because it is in fact more convenient to work with a single object.

If this were true, then everyone would send everything as a single object. The reason they don't is twofold:

ZonedDateTime ticks the "everyone has to support parsing/formatting it" box, but is the Instant + TimeZone more than just the sum of its parts?

Also, you can do date arithmetic more easily on a combined object and e.g. add one month and get the same local time - adjusted for DST changes.

In our library, we provide Instant arithmetic that also allows this.

LocalDateTime alone is not enough and neither is Instant alone. We really need ZonedDateTime to get the full picture.

This argument is not sufficient: maybe you actually need both LocalDateTime and Instant, but separately. For cleaner code, if nothing else. LocalDateTime would represent the user-visible time of the action, whereas Instant would be the actual moment in time when the action happened.

However, you later say that LocalDateTime can be also useful for analysis in your case. This does call for using ZonedDateTime, I agree: the user sees the LocalDateTime part, you see what the user saw, and also the Instant—sounds like it's convenient to bundle everything together.

It's really good that the people who define specifications don't just say "use UTC for everything", but instead they usually add time zone information.

It's not really viable to use UTC for everything. For example, you wouldn't be able to conveniently schedule a meeting using just UTC in a way that adapted to the time zone changes.

Good API design optimizes for the most common use-cases and makes them fool-proof and trivially easy.

Sure, this is what we are going for.

Apparently ZonedDateTime is a very common use-case (almost every official standard uses it)

Maybe it does indicate a common use case, maybe it only indicates a historical coincidence. In any case, the ease of representing something does not indicate the ease of using something.

so the point of having a ZonedDateTime is to make this as easy as possible.

Yeah, we will need to do something about compatibility. The question is, what exactly.

Your suggestion sounds to me like having separate date + time (instead of Instant) because it's "just as easy". No it's not.

I'm not claiming that it's just as easy. I'm claiming that it's not clear whether it's just as easy, more difficult, or easier. I can see arguments for why using ZonedDateTime is more difficult, and in this discussion, I'm trying to understand why using ZonedDateTime could be easier than using Instant + TimeZone.

Flight departure and arrival

I think those are represented naturally as "they happen at these moments; also, the time zones of the source and the destination are this and that".

Calendar invites often contain a time zone

This one is good, I agree that there's some model of the ground truth that invites the use of ZonedDateTime: if the meeting time is represented as a LocalDateTime, but should be accessed in other time zones as well, then, yeah, it makes sense to know the LocalDateTime and the time zone simultaneously.

We'll need to look at how calendar meetings are actually defined in real systems: whether they use LocalDateTime as the source of truth, and if so, whether they handle the situations when the time zone of the meeting or the time zone of the viewer changes their rules.

time zone information in an email

"Happened at a given moment; also, the sender was in that time zone; also, it was at that particular moment for the sender". I think there's a pattern emerging in your use cases: we want to bundle everything together if we need to know the local date-time for someone in a particular time zone, but are also interested in the real moment.

Thanks, this indeed looks like a common use case that's better covered by a ZonedDateTime than Instant + TimeZone! However, I don't see still any evidence that there's a need to do arithmetic on this, for example. It looks like the only use cases we've discussed are for sending, receiving, and changing the representation of some instants. Maybe we could implement it in some way that's less cumbersome than a full-fledged ZonedDateTime.

cromefire commented 2 years ago

No argument here. It's obvious that we'll need to provide parsing/formatting that recognizes time zones. The question is, do we have to implement a whole ZonedDateTime? If ZonedDateTime is needed just for compatibility, we could instead parse/format a Pair\u003CInstant, TimeZone>, for example. If there are some other use cases, maybe adding ZonedDateTime is the right answer. Or maybe there's some other data structure that's even better at representing the use cases.

I'd say that even for this case it'd be worth it to add at least a rather lightweight wrapper. I'd pre really confused to see an API that takes a Pair<LocalDateTime, TimeZone>. Everybody would ask what's this doing here? Of course everyone could implement their own wrapper object, but then we'd have potentially 10s of wrapper objects and we'd land in conversion hell, where we get one object from API 1 and need to convert it to a pair and then to another object from API 2. That's just a horrible experience.

Also Instant and TimeZone doesn't work, because if you'd schedule let's say a meeting in a year and now some rules may change. You expect it's always e.g. at 13:00, because that's what you set. Now you use an Instant and suddenly it's at 14:00. That... seems not like your user would expect it. They don't want to reschedule everything by an hour because of a change. For future events that are important that they happen in exactly 1 Year, so 365 days, x hours, n seconds, you'd need an instant, where the time zone isn't really part of the Date itself, but an additional indicator. And then you'd also need a wrapper for conversation to an instant for example, but what's important is that the conversation to Instant and TimeZone may not be lossless. If you'd pick a date now in 10 years, save it as RFC3339 and Instant + TimeZone, in a few years those two values would refer to a different time, because an Instant is absolute, but the ZonedDateTime should be relative to the calendar, otherwise it shouldn't be parsed from an RFC3339 (which is the source of this issue), because that could fundamentally change the underlying value. In practice of course this would only happen if you'd change rules while the code is running or serialize both components, but I'd argue that the altering of the meaning in this way should be an explicit conversion.

wkornewald commented 2 years ago

Calendar invites often contain a time zone

This one is good, I agree that there's some model of the ground truth that invites the use of ZonedDateTime: if the meeting time is represented as a LocalDateTime, but should be accessed in other time zones as well, then, yeah, it makes sense to know the LocalDateTime and the time zone simultaneously.

We'll need to look at how calendar meetings are actually defined in real systems: whether they use LocalDateTime as the source of truth, and if so, whether they handle the situations when the time zone of the meeting or the time zone of the viewer changes their rules.

My calendar software, for example, shows LocalDateTime by default, but it has a checkbox to define the time zone of the start date and time zone of the end date.

time zone information in an email

"Happened at a given moment; also, the sender was in that time zone; also, it was at that particular moment for the sender". I think there's a pattern emerging in your use cases: we want to bundle everything together if we need to know the local date-time for someone in a particular time zone, but are also interested in the real moment.

Yes, we store all end-user data like that, so we can later add new features that can take both pieces of information into account.

Thanks, this indeed looks like a common use case that's better covered by a ZonedDateTime than Instant + TimeZone! However, I don't see still any evidence that there's a need to do arithmetic on this, for example. It looks like the only use cases we've discussed are for sending, receiving, and changing the representation of some instants. Maybe we could implement it in some way that's less cumbersome than a full-fledged ZonedDateTime.

The arithmetic becomes necessary once you start to store all end-user data and other data with with ZonedDateTime. Then you need to calculate e.g. if a minimum number of days have passed or if the current date is within the validity range.

But even if it's not directly end-user data, here is an extract from the certlogic rules for validating the European Union's COVID vaccination certificates:

{"EngineVersion":"0.7.5","ValidTo":"2030-06-01T00:00:00Z","ValidFrom":"2021-08-19T00:00:00+02:00","Engine":"CERTLOGIC","Version":"1.1.0", ...

See a snapshot of the full file here: https://github.com/Digitaler-Impfnachweis/covpass-android/blob/main/covpass-sdk/src/main/assets/covpass-sdk/eu-rules.json

As you can see, the ValidTo and ValidFrom use different time zones. In other places we also do arithmetic over ZonedDateTime to check if a certain minimum amount of time has passed, etc. This project is just a little bit too small to show all the possible real-world complexities and unfortunately I can't show you any examples from the main projects. But it's pretty normal to have some object's createdOn in ZonedDateTime and then in the UI show the duration like "1 hour ago" which already requires arithmetic - and again we wouldn't want to manually convert to Instant to be able to do arithmetic.

BTW, in many cases I've seen offset-based ZonedDateTime, but I've also had zone id based ZonedDateTime. So, ZonedDateTime would need to handle both cases transparently (at least my own implementation of that class does).

A combined offset + zone id might be too much of an edge case, I don't know. Personally, I'd even add that because java.time surely has it for a reason and it's needed to handle DST correctly and e.g. auto-correct zone database changes (and possibly even missing updates which you talked about as an edge case in your previous message).

cromefire commented 2 years ago

and again we wouldn't want to manually convert to Instant to be able to do arithmetic.

You should. Because if not using an offset, but a full-fledged time, there's no guarantee that's after a server has been restarted for example the "in 2 hours", will still be true. There might be a change of rules (I'm aware this wouldn't be that close usually, but samples with huge numbers aren't easy to understand) and now it's "in 1 hour" or "in 3 hours". Yes this should be a rare occasion, but I think it absolutely makes sense to do, because edge cases in datetimes are an absolute PITA. It just doesn't make sense to compare an Instant (right now) to a ZonedDateTime (calendar date) directly. A calendar is an artificial conversation scheme where as an Instant is a (as long as the atomic clocks work fine) basically absolute point in time. Therefore to compare them you'd have to implicitly convert them and I don't think an implicit conversion would be a good idea. If you are dealing with past times it should be safe to convert to an Instant and a TimeZone which would be a separate class, which in turn would be safe to compare to an Instant because it's unambiguous unlike ZonedDateTime, maybe we'd need a different issue for that.

wkornewald commented 2 years ago

If I do the toInstant conversion by hand then how is that different from the ZonedDateTime class doing the conversion automatically? The problem stays the same.

Anyway, I wasn’t even talking about comparison/arithmetic over ZonedDateTime and Instant but between two ZonedDateTime objects.

wkornewald commented 2 years ago

Also, why are we even discussing extreme edge cases? Are you running servers for weeks/months without updates? Then missing time zone updates are the least of your problems and you should be caring about security updates (e.g. log4j) a lot more.

Also, time zone changes usually don’t happen over night. Time zone updates can be created months in advance, so there’s enough time to update all machines.

And if you have IoT devices operating without internet/updates for prolonged periods you can store ZonedDateTime with offset + zone ID and maybe even DST flag and auto-correct the data once the update is applied.

cromefire commented 2 years ago

Also, why are we even discussing extreme edge cases? Are you running servers for weeks/months without updates? Then missing time zone updates are the least of your problems and you should be caring about security updates (e.g. log4j) a lot more.

Also, time zone changes usually don’t happen over night. Time zone updates can be created months in advance, so there’s enough time to update all machines.

A) yes some machines run without app updates for a long time, offline machines and also for this to cause no pain at all everything has to be update instantly all at the same time. B) This also happens if you update in time, because there's not limit, that disallows you to create ZonedDateTimes months and years ahead. C) Edge cases like this (will) have caused real consequences two times now (see year 2000, less of a pain than we thought, but still caused headaches and some failures; upcoming is the Unix epoch debacle, we've know about this decades ahead, but there's probably still some systems that will fail), we should learn from that and maybe plan now an do it properly now and not do something again that "probably will work fine".

cromefire commented 2 years ago

If I do the toInstant conversion by hand then how is that different from the ZonedDateTime class doing the conversion automatically? The problem stays the same.

Yes but you opt-in to that conversion. Similar how you can convert an Instant to a LocalDateTime and discard some information.

Anyway, I wasn’t even talking about comparison/arithmetic over ZonedDateTime and Instant but between two ZonedDateTime objects.

Well ZonedDateTimes may not even refer to an unambiguous point of time. See the time change. If you need to refer to an exact point of time you need a Instant + TimeZone, which would be different from ZonedDateTime. I think we are kinda mixing some use-cases here, I'll do a bit of research and try to define and distinguish both use-cases, so we can split it into two issues with seperate discussions.

ilya-g commented 2 years ago

Most programming languages, frameworks, ORMs and even OpenAPI/Swagger work with a single value for ZonedDateTime. We won't change the official OpenAPI specification, the official Email format, the official HTTP header formats and instruct the whole world to switch to a different structure. Practically every standard uses ZonedDateTime, so it should be easy to work with that kind of data.

@wkornewald Are you sure these are ZonedDateTimes? AFAIK, Email format headers and HTTP headers do not store a timezone identifier, but rather only an offset from UTC. So this makes them OffsetDateTimes which is what was originally requested in this issue. Could you show examples of ZonedDateTime in OpenAPI, ORMs, frameworks?

cromefire commented 2 years ago

Here are the two use-cases, I also mixed those, so we should probably split it to clear it up:

OffsetDateTime(?) (this issue)

Standards

Characteristics

LocalDateTime (offset from UTC)/Instant with a fixed Offset, no time zone identifier allowed (which would be thing that could change, offsets can't).

Good at

Bad at

Problems

ZonedDateTime

Standards

Probably exist, I don't know any off my head

Characteristics

LocalDateTime with a named TimeZone (these can change) (or LocalDateTime (UTC)/Instant with a fixed Offset?).

Good at

Bad at

Problems

wkornewald commented 2 years ago

@ilya-g I always tried to explicitly say that the ZonedDateTime would handle offset and ID dynamically based on the amount of information available. And depending on how precise we want to represent time we could also still consider even a combination of both offset and ID at the same time.

Basically, what I’d like to see is an object that can precisely identify the local and global time at the same time and adapt to the precision that is available.

ilya-g commented 2 years ago

@cromefire To add to above, OffsetDateTime, bad at: