Closed msarrel closed 3 years ago
Hi @msarrel thanks for the bug report.
arrow has now implemented PEP 495 for all tzinfos that it uses. This allows us to work with ambiguous (same clock, different offset) datetimes.
With your example:
(arrow) chris@ThinkPad:~/arrow$ python
Python 3.8.3 (default, Jul 7 2020, 18:57:36)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import arrow
>>> arw=arrow.get("2001-10-28 01:00:00", "YYYY-MM-DD HH:mm:ss", tzinfo="US/Pacific")
>>> arw
<Arrow [2001-10-28T01:00:00-07:00]>
>>> arw2=arw.replace(fold=1)
>>> arw2
<Arrow [2001-10-28T01:00:00-08:00]>
>>> arw==arw2
True
>>> arw2.to("utc")
<Arrow [2001-10-28T09:00:00+00:00]>
So the result for your range method is correct (mostly; it depends on dateutil). However, it would be nice to be able to pass fold as a kwarg to arrow.get().
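For anyone following along without arrow installed, the same PEP 495 behaviour can be reproduced with the standard library alone. This is only a minimal sketch, not arrow's implementation. It assumes Python 3.9+ (for zoneinfo) with time zone data available on the system, and it uses America/Los_Angeles, the canonical name for the zone this thread calls US/Pacific.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: "America/Los_Angeles" stands in for the thread's "US/Pacific".
PACIFIC = ZoneInfo("America/Los_Angeles")

# 01:00 on 2001-10-28 occurs twice: first in PDT, then an hour later in PST.
first = datetime(2001, 10, 28, 1, 0, tzinfo=PACIFIC)  # fold=0, first occurrence
second = first.replace(fold=1)                         # second occurrence

print(first.isoformat())   # offset is -07:00
print(second.isoformat())  # offset is -08:00

# Both sides share the same tzinfo object, so == compares the wall-clock
# fields only and ignores fold, just like arw == arw2 above.
print(first == second)

# Converting to UTC is what tells the two instants apart.
first_utc = first.astimezone(timezone.utc)
second_utc = second.astimezone(timezone.utc)
print(first_utc.isoformat(), second_utc.isoformat())
```

The equality result is worth keeping in mind: two values that compare equal can still convert to different UTC instants.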
Thank you for telling me about the fold option. That is useful. But I'm still not completely convinced of the correctness of the range() result. I think that range() should return a list of 25 values for 2001-10-28.
import arrow

short = list(arrow.Arrow.range(
    "hour",
    arrow.get("2001-04-01 00:00:00", "YYYY-MM-DD HH:mm:ss", tzinfo="US/Pacific"),
    arrow.get("2001-04-01 23:00:00", "YYYY-MM-DD HH:mm:ss", tzinfo="US/Pacific")))
print("short (should be 23)")
print(len(short))

normal = list(arrow.Arrow.range(
    "hour",
    arrow.get("2001-04-02 00:00:00", "YYYY-MM-DD HH:mm:ss", tzinfo="US/Pacific"),
    arrow.get("2001-04-02 23:00:00", "YYYY-MM-DD HH:mm:ss", tzinfo="US/Pacific")))
print("normal (should be 24)")
print(len(normal))

long = list(arrow.Arrow.range(
    "hour",
    arrow.get("2001-10-28 00:00:00", "YYYY-MM-DD HH:mm:ss", tzinfo="US/Pacific"),
    arrow.get("2001-10-28 23:00:00", "YYYY-MM-DD HH:mm:ss", tzinfo="US/Pacific")))
print("long (should be 25)")
print(len(long))
Currently, range() works correctly in two of the three cases, namely the short (23-hour) and normal (24-hour) days. I'd like it to work correctly for long (25-hour) days as well.
short (should be 23)
23
normal (should be 24)
24
long (should be 25)
24
Another way to illustrate is this code:
long_utc = list(arrow.Arrow.range(
    "hour",
    arrow.get("2001-10-28 00:00:00", "YYYY-MM-DD HH:mm:ss", tzinfo="US/Pacific").to("utc"),
    arrow.get("2001-10-28 23:00:00", "YYYY-MM-DD HH:mm:ss", tzinfo="US/Pacific").to("utc")))
print("long_utc (should be 25)")
print(len(long_utc))
for t in long_utc:
    print(t.to("US/Pacific"), t)
It produces the expected 25-hour result by first converting to UTC, then performing the range, and then converting back.
long_utc (should be 25)
25
2001-10-28T00:00:00-07:00 2001-10-28T07:00:00+00:00
2001-10-28T01:00:00-07:00 2001-10-28T08:00:00+00:00
2001-10-28T01:00:00-08:00 2001-10-28T09:00:00+00:00
2001-10-28T02:00:00-08:00 2001-10-28T10:00:00+00:00
2001-10-28T03:00:00-08:00 2001-10-28T11:00:00+00:00
2001-10-28T04:00:00-08:00 2001-10-28T12:00:00+00:00
2001-10-28T05:00:00-08:00 2001-10-28T13:00:00+00:00
2001-10-28T06:00:00-08:00 2001-10-28T14:00:00+00:00
2001-10-28T07:00:00-08:00 2001-10-28T15:00:00+00:00
2001-10-28T08:00:00-08:00 2001-10-28T16:00:00+00:00
2001-10-28T09:00:00-08:00 2001-10-28T17:00:00+00:00
2001-10-28T10:00:00-08:00 2001-10-28T18:00:00+00:00
2001-10-28T11:00:00-08:00 2001-10-28T19:00:00+00:00
2001-10-28T12:00:00-08:00 2001-10-28T20:00:00+00:00
2001-10-28T13:00:00-08:00 2001-10-28T21:00:00+00:00
2001-10-28T14:00:00-08:00 2001-10-28T22:00:00+00:00
2001-10-28T15:00:00-08:00 2001-10-28T23:00:00+00:00
2001-10-28T16:00:00-08:00 2001-10-29T00:00:00+00:00
2001-10-28T17:00:00-08:00 2001-10-29T01:00:00+00:00
2001-10-28T18:00:00-08:00 2001-10-29T02:00:00+00:00
2001-10-28T19:00:00-08:00 2001-10-29T03:00:00+00:00
2001-10-28T20:00:00-08:00 2001-10-29T04:00:00+00:00
2001-10-28T21:00:00-08:00 2001-10-29T05:00:00+00:00
2001-10-28T22:00:00-08:00 2001-10-29T06:00:00+00:00
2001-10-28T23:00:00-08:00 2001-10-29T07:00:00+00:00
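The convert-to-UTC, step, convert-back idea can also be written as a small standalone helper. The sketch below uses only the standard library rather than arrow, so it illustrates the approach and is not a patch for range(). The names hourly_range_via_utc and day_hours are made up for this example, and America/Los_Angeles is assumed as the canonical name for US/Pacific (Python 3.9+ with system time zone data).

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Assumption: "America/Los_Angeles" stands in for the thread's "US/Pacific".
PACIFIC = ZoneInfo("America/Los_Angeles")


def hourly_range_via_utc(start, end):
    """Hypothetical helper: hourly steps from start to end (inclusive).

    Stepping happens in UTC, so every elapsed hour is visited exactly once,
    then each step is converted back to the zone of `start`.
    """
    tz = start.tzinfo
    current = start.astimezone(timezone.utc)
    stop = end.astimezone(timezone.utc)
    result = []
    while current <= stop:
        result.append(current.astimezone(tz))
        current += timedelta(hours=1)
    return result


def day_hours(year, month, day):
    """Hypothetical helper: all hourly steps from 00:00 to 23:00 local time."""
    start = datetime(year, month, day, 0, tzinfo=PACIFIC)
    end = datetime(year, month, day, 23, tzinfo=PACIFIC)
    return hourly_range_via_utc(start, end)


short_day = day_hours(2001, 4, 1)     # spring forward
normal_day = day_hours(2001, 4, 2)
long_day = day_hours(2001, 10, 28)    # fall back
print(len(short_day), len(normal_day), len(long_day))  # 23 24 25
```

On the long day, the second and third entries are both 01:00 local time, with offsets -07:00 and -08:00, which matches the listing above.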
And it shows why it's tricky to do these conversions. In my original comment, I should have written that 2001-10-28T09:00:00+00:00 corresponds to 2001-10-28T01:00:00-08:00 rather than 2001-10-28T02:00:00-07:00. That was my manual mistake.
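One way to avoid this kind of manual slip is to ask the time zone database which UTC instants a given wall-clock time can mean. The helper below is a hypothetical standard-library sketch (utc_candidates is not an arrow function), again assuming Python 3.9+ and America/Los_Angeles as the canonical name for US/Pacific.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: "America/Los_Angeles" stands in for the thread's "US/Pacific".
PACIFIC = ZoneInfo("America/Los_Angeles")


def utc_candidates(wall, tz):
    """Hypothetical helper: distinct UTC instants a naive wall time maps to.

    Tries both fold values; an unambiguous time yields one instant,
    a repeated (fall-back) time yields two.
    """
    candidates = []
    for fold in (0, 1):
        instant = wall.replace(tzinfo=tz, fold=fold).astimezone(timezone.utc)
        if instant not in candidates:
            candidates.append(instant)
    return candidates


for hour in (0, 1, 2):
    wall = datetime(2001, 10, 28, hour)
    found = [c.isoformat() for c in utc_candidates(wall, tz=PACIFIC)]
    print(wall.isoformat(), "->", found)
```

For 2001-10-28 it is 01:00, not 02:00, that maps to two instants (08:00 and 09:00 UTC); 02:00 maps only to 10:00 UTC. Note that this sketch does not treat the spring-forward gap specially: a wall time that never occurred would also come back with two candidates.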
Yes, there's plenty of room for slip-ups given how complex this stuff can get. Given how Python represents ambiguous datetimes, it's not easy or necessary to implement this change.
However, it's worth adding the fold kwarg to arrow.get().
Given that there's been no further discussion and we're not planning on making any changes here, I'll close this.
Issue Description
The range() method does not work properly on days with 25 hours in them, when we switch from daylight saving time to standard time. I would expect this fragment of code to produce a list that has twenty-five elements.
But the list contains only twenty-four entries. If we look at the first few entries, we can see that 2001-10-28T02:00:00-07:00, corresponding to 2001-10-28T09:00:00+00:00, is missing. This fragment of code shows the expected result.
And now we see 2001-10-28T02:00:00-07:00 and 2001-10-28T09:00:00+00:00.
My suggested solution would be to convert the beginning and end of the range to UTC, perform the range, and then convert the results back to the original time zone.
There is still a problem in that arrow.get("2001-10-28 02:00:00", "YYYY-MM-DD HH:mm:ss", tzinfo="US/Pacific").to("utc") always produces the result <Arrow [2001-10-28T10:00:00+00:00]>. It could just as legitimately produce <Arrow [2001-10-28T09:00:00+00:00]>. Not sure what to suggest, but it would be good to give the user control over the result in this sort of case. Perhaps range() could optionally return a tuple of times in this sort of case, or the user could specify which result is desired.
System Info