gweis / isodate

ISO 8601 date/time parser
BSD 3-Clause "New" or "Revised" License
148 stars 58 forks source link

Allow control over return type of parse_duration #64

Closed Flix6x closed 2 years ago

Flix6x commented 3 years ago

My main suggested edit here is to allow control over the return type of parse_duration (implemented as a non-breaking revision). I explain below why I feel this is important. I also fixed a typo and changed an if-else block to its logical equivalent, so it is clear that it follows the same reasoning as another if-else block (line 218 onwards).

Why do I feel control over the return type of parse_duration is important?

The underlying issue at play here is the interpretation of the nominal duration "P1D" as an absolute duration datetime.timedelta(days=1). That is, a datetime.timedelta(days=1) is defined as exactly 24 hours, whereas ISO 8601 defines "P1D" as a calendar day, whose exact duration depends on its positioning in a calendar. For example, for the Europe/Amsterdam timezone (which happens to be my timezone), daylight saving time will start on March 28th 2021, which means March 28th in my calendar has 23 hours.

If you'd like, I can make a ticket for this, too, but I'd like to restrict this PR to merely allowing control over the return type. By setting as_timedelta_if_possible=False, the isodate.Duration could be used to determine the exact duration of nominal durations by their position in a calendar. For example, using pandas.DateOffset (which is based on dateutil.relativedelta.relativedelta):

import datetime
import isodate
import pandas as pd
import pytz

duration_str = "P1D"
timezone_str = "Europe/Amsterdam"
nominal_duration = isodate.parse_duration(duration_str, as_timedelta_if_possible=False)

def exact_duration_given_position_in_calendar(nominal_duration: isodate.Duration, start: datetime.datetime) -> datetime.timedelta:
    """Determine the exact duration of a nominal isodate.Duration object, given its starting position in a calendar."""
    years = nominal_duration.years
    months = nominal_duration.months
    days = nominal_duration.days
    seconds = nominal_duration.tdelta.seconds
    offset = pd.DateOffset(
        years=years, months=months, days=days, seconds=seconds
    )
    return (pd.Timestamp(start) + offset).to_pydatetime() - start

dates = [
    [2021, 3, 27],  # 1 day before
    [2021, 3, 28],  # day of transition to Daylight Saving Time
    [2021, 3, 29],  # 1 day after
]
for date in dates:
    start = pytz.timezone(timezone_str).localize(
        datetime.datetime(*date)
    )
    print(f"If start positioned at {start} in timezone {timezone_str}, P1D takes {exact_duration_given_position_in_calendar(nominal_duration, start)}.")

# If start positioned at 2021-03-27 00:00:00+01:00 in timezone Europe/Amsterdam, P1D takes 1 day, 0:00:00.
# If start positioned at 2021-03-28 00:00:00+01:00 in timezone Europe/Amsterdam, P1D takes 23:00:00.
# If start positioned at 2021-03-29 00:00:00+02:00 in timezone Europe/Amsterdam, P1D takes 1 day, 0:00:00.
coveralls commented 3 years ago

Coverage Status

Coverage remained the same at 94.433% when pulling 43b894e61ca5c48b26124d284296779f85e44873 on SeitaBV:master into 27cebc56a0f6a900eeb897512879d545095f410b on gweis:master.

gweis commented 2 years ago

thanks

Flix6x commented 8 months ago

Would you have time to create a release containing this feature? The latest release was on the same date as this PR was merged, but it didn't make the cut.