arrow-py / arrow

🏹 Better dates & times for Python
https://arrow.readthedocs.io
Apache License 2.0
8.64k stars 667 forks source link

`span_range` produces questionable output following DST time change #1097

Open mbartenschlag opened 2 years ago

mbartenschlag commented 2 years ago

Issue Description

I believe the span_range function produces erroneous output stemming from the time change to DST in the US.

In the example below at ndx=1, the upper bound for the span is <Arrow [2022-03-13T03:59:59.999999-04:00]> when I believe it should be <Arrow [2022-03-13T01:59:59.999999-05:00]>.

>>> for ndx, x in enumerate(arrow.Arrow.span_range(frame='hour', start=arrow.get('2022-03-13T00:00:00-05:00', tzinfo='US/Eastern'), end=arrow.get('2022-03-14T00:00:00-04:00', tzinfo='US/Eastern'), tz='US/Eastern')):
...   print(f"{ndx}: {x}")
...
0: (<Arrow [2022-03-13T00:00:00-05:00]>, <Arrow [2022-03-13T00:59:59.999999-05:00]>)
1: (<Arrow [2022-03-13T01:00:00-05:00]>, <Arrow [2022-03-13T03:59:59.999999-04:00]>)
2: (<Arrow [2022-03-13T03:00:00-04:00]>, <Arrow [2022-03-13T03:59:59.999999-04:00]>)
3: (<Arrow [2022-03-13T04:00:00-04:00]>, <Arrow [2022-03-13T04:59:59.999999-04:00]>)

System Info

mbartenschlag commented 2 years ago

I've simplified the test case. The assertion below fails:

import arrow
shift_back = arrow.get(2022, 3, 13, 3, 0, tzinfo='US/Eastern').shift(microseconds=-1)
shift_ahead = arrow.get(2022, 3, 13, 3, 0, tzinfo='US/Eastern').shift(microseconds=1)
assert shift_back < shift_ahead

Lines 1040-1043 in arrow/arrow.py reveals what's going on under the hood:

        current = self._datetime + relativedelta(**relative_kwargs)

        if not dateutil_tz.datetime_exists(current):
            current = dateutil_tz.resolve_imaginary(current)

For the shift_back case, the dateutil_tz.datetime_exists condition fails, so it gets overwritten with the result of dateutil_tz.resolve_imaginary . Reading up on the implementation of resolve_imaginary, and quoting @pganssle in the ticket where this method was introduced:

I'm fairly certain that an imaginary date you always want to "slide forward" - it will never be that you accidentally wanted to go back.

Unfortunately for us - we do want to slide back in this case. This bug and behavior can be reproduced using datetime + dateutil too - the assertion below will fail.

from datetime import datetime, timedelta
from dateutil import tz
US_EASTERN = tz.gettz('US/Eastern')
start = datetime(2022, 3, 13, 3, 0, tzinfo=US_EASTERN)
step_back = start - timedelta(seconds=1)
step_ahead = start + timedelta(seconds=1)
if not tz.datetime_exists(step_back):
  print(f"Resolving imaginary datetime: {step_back}.")
  step_back = tz.resolve_imaginary(step_back)

assert step_back < step_ahead

I'll stew on this a bit more, but I figured I could at least pass on the source of this bug as I understand it.

systemcatch commented 2 years ago

Hi @mbartenschlag thank you for the detailed bug report and diving into the code.

While this does give a slightly strange result I believe we are following datetime std library logic here.