ariebovenberg / whenever

⏰ Modern datetime library for Python
https://whenever.rtfd.io
MIT License

Handle invalidation of zoned datetimes due to changes to timezone definition #39

Open ariebovenberg opened 9 months ago

ariebovenberg commented 9 months ago

How to handle this

For example, imagine you store ZonedDateTime(2030, 3, 31, hour=1, tz='America/New_York'), which we currently expect to exist. However, by the time 2030 rolls around, NYC has decided to start summer time at exactly that moment, making the datetime invalid. How should this be handled?
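To make the scenario concrete, here is a minimal sketch that uses only the standard library. The rule tables and the helpers `offset_at` and `valid_offsets` are hypothetical stand-ins for a real tz database, and the 2030 transition is invented purely for illustration.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
EST = timedelta(hours=-5)
EDT = timedelta(hours=-4)
BEGINNING = datetime.min.replace(tzinfo=UTC)

# Hypothetical rule tables: (UTC instant a rule starts, offset from then on),
# sorted by start time.
OLD_RULES = [(BEGINNING, EST)]
# Invented change: clocks jump from 01:00 to 02:00 local time on 2030-03-31.
NEW_RULES = [(BEGINNING, EST), (datetime(2030, 3, 31, 6, tzinfo=UTC), EDT)]


def offset_at(rules, instant):
    """Return the offset in effect at an aware UTC instant."""
    current = rules[0][1]
    for start, offset in rules:
        if instant >= start:
            current = offset
    return current


def valid_offsets(rules, local):
    """Return every offset under which the naive local time really occurs."""
    found = []
    for offset in sorted({o for _, o in rules}):
        instant = (local - offset).replace(tzinfo=UTC)
        if offset_at(rules, instant) == offset:
            found.append(offset)
    return found


stored = datetime(2030, 3, 31, 1)  # naive wall-clock time, as stored
print(valid_offsets(OLD_RULES, stored))  # one offset: the time exists
print(valid_offsets(NEW_RULES, stored))  # empty list: the time falls in a gap
```

An empty result means the stored wall-clock time no longer occurs under the updated rules, which is exactly the situation a library has to decide how to handle.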

Note that timezone changes during the runtime of the program will likely never be handled. This would be terrible to implement, and I doubt there is a use case for this.

However, unpickling and from_canonical_str() will be affected. Perhaps an approach similar to that of JS Temporal can be used.

bxparks commented 8 months ago

We don't even have to wait until 2030. Muslim countries observing Ramadan can change their UTC offsets 4 times a year, with the exact DST transition date/time depending on the phase of the moon and the judgement call of a human observer. The actual transition times can vary by a few days every year, at the last minute.

The problem is that most ZonedDateTime implementations do not capture the intent of the programmer. The programmer could have intended the date-time components to be the invariant, or they could have intended the epochSeconds to be the invariant. java.time, NodaTime, C++ chrono::date, etc. capture the date-time fields. Golang's time package captures the epochSeconds. Neither is correct all the time, although capturing the date-time fields along with the UTC offset allows the epochSeconds to be regenerated, so I think those libraries can be more correct than Golang.
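The difference between the two invariants can be sketched with the standard library alone. The +01:00 and +02:00 offsets and the variable names are assumptions chosen for illustration; no real rule change is implied.

```python
from datetime import datetime, timedelta, timezone

old_offset = timezone(timedelta(hours=1))  # offset known when the value was stored
new_offset = timezone(timedelta(hours=2))  # assumed offset after a rule change

# Representation A: date-time fields plus the offset at storage time.
fields = datetime(2030, 3, 31, 9, 0, tzinfo=old_offset)

# Representation B: epoch seconds only.
epoch_seconds = int(fields.timestamp())

# A can always regenerate B.
assert int(fields.timestamp()) == epoch_seconds

# A can also keep the wall-clock time under the new rules.
same_wall_clock = fields.replace(tzinfo=new_offset)

# B can only be re-rendered under the new rules: the wall clock shifts,
# and the originally intended 09:00 cannot be recovered from B alone.
same_instant = datetime.fromtimestamp(epoch_seconds, tz=new_offset)

print(same_wall_clock.isoformat())  # 2030-03-31T09:00:00+02:00
print(same_instant.isoformat())     # 2030-03-31T10:00:00+02:00
```

Storing the fields together with the offset keeps both options open, whereas storing only the epoch seconds commits to preserving the instant.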

Another implication is that most libraries try very hard to avoid constructing an invalid ZonedDateTime object, for example by throwing an exception. This issue shows that for future dates, a fully constructed ZonedDateTime instance can become invalid no matter how hard the library tries. Some libraries allow an "invalid" ZonedDateTime instance to be created, then rely on the programmer to call the equivalent of a ZonedDateTime.normalize() method to renormalize it. I guess that's what the Temporal.ZonedDateTime.from() method does. (I'm guessing that in Temporal all objects are immutable, so they must be recreated.) I think this approach is technically more correct, but potentially terrible for ergonomics, because most programmers will not remember to call normalize() or ZonedDateTime.from() when it can make a difference.

I don't have a perfect solution to offer here. Temporal's solution seems pretty good at first glance, with the caveat that it's another complex edge case that needs to be explained and that most end users will have trouble understanding.

ariebovenberg commented 8 months ago

It's indeed a tricky case in the category "the vast majority of users won't encounter this, but the API needs to be rock-solid in case it does happen".

My current thinking is:

concrete solution

Adding an offset_conflict= parameter with options "raise" (default) | "preserve_instant" | "preserve_local_time".
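As a rough illustration of how the three options might behave, here is a sketch built on the standard library. The function `resolve_offset_conflict` and the exception `OffsetConflict` are hypothetical and not part of whenever, and fixed offsets stand in for the old and new timezone definitions.

```python
from datetime import datetime, timedelta, timezone


class OffsetConflict(ValueError):
    """Hypothetical error: the stored offset no longer matches the zone."""


def resolve_offset_conflict(stored, current_offset, offset_conflict="raise"):
    """Reconcile an aware datetime with the offset the zone has now.

    `stored` carries the offset recorded when the value was saved;
    `current_offset` is what the (assumed) updated tz rules say.
    """
    if stored.utcoffset() == current_offset:
        return stored  # no conflict, nothing to do
    new_tz = timezone(current_offset)
    if offset_conflict == "preserve_instant":
        return stored.astimezone(new_tz)  # same moment, new wall clock
    if offset_conflict == "preserve_local_time":
        return stored.replace(tzinfo=new_tz)  # same wall clock, new moment
    if offset_conflict == "raise":
        raise OffsetConflict(f"{stored.isoformat()} expected offset {current_offset}")
    raise ValueError(f"unknown strategy: {offset_conflict!r}")


stored = datetime(2020, 1, 1, tzinfo=timezone(timedelta(hours=1)))
changed = timedelta(hours=2)  # assumed new offset, for illustration only
print(resolve_offset_conflict(stored, changed, "preserve_instant").isoformat())
# 2020-01-01T01:00:00+02:00
print(resolve_offset_conflict(stored, changed, "preserve_local_time").isoformat())
# 2020-01-01T00:00:00+02:00
```

A real implementation would look the offset up in the tz database, and that lookup differs between the two "preserve" strategies (by instant versus by local time); passing a fixed offset sidesteps that detail here.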

use-case 1: pickling

zdt = ZonedDateTime(2020, 1, 1, tz="Europe/Amsterdam", unpickle_offset_conflict="preserve_instant")
pkl = pickle.dumps(zdt)  # stores the tz offset behavior

# Amsterdam changes its tz here...

new_zdt = pickle.loads(pkl)  # handled according to `preserve_instant`, as explicitly configured

use-case 2: updating ZoneInfo cache

zdt = ZonedDateTime(2020, 1, 1, tz="Europe/Amsterdam")

# Amsterdam changes its tz here

zoneinfo.ZoneInfo.clear_cache()

# explicit update with clearly chosen strategy
zdt_updated = zdt.reload_zoneinfo(offset_conflict="preserve_local_time")

use-case 3: loading from text

zdt = ZonedDateTime.from_canonical_format(
    "2020-01-01 00:00:00+01:00 [Europe/Amsterdam]",
    offset_conflict="preserve_instant"
)

what do you think?