bear / parsedatetime

Parse human-readable date/time strings
Apache License 2.0

Support for quarters and financial terminology #206

Open AgDude opened 7 years ago

AgDude commented 7 years ago

We need to support financial periods in our application, so I am looking for parsing of the following types of strings:

beginning of quarter
beginning of last quarter
beginning of year
end of last quarter
end of this quarter
end of last month
beginning of month

Note to contributors: I would be more than willing to contribute a PR if you can offer me a little guidance on how you would go about this. I have skimmed the source but am not intimately familiar with it. If you would rather tell me this is out of scope, that is fine too.

idpaterson commented 7 years ago

There are some good challenges here. I think the most feasible right now are beginning of year and beginning of month. These could be handled in the locale's re_sources, which allows replacement of datetime components. Beginning is easy because it is always month 1 and/or day 1. End may be more complicated since the number of days in a month varies. Quarters are not much more complicated; there just isn't any precedent in the code to point you toward for implementing them.
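
To illustrate the difference between the two cases, here is a minimal sketch using only the standard library. It does not touch parsedatetime's re_sources; the helper names are hypothetical and only show why "beginning" is a fixed replacement while "end" needs a calendar lookup.

```python
import calendar
import datetime


def beginning_of_month(d):
    # The first day of any month is always day 1.
    return d.replace(day=1)


def beginning_of_year(d):
    # The first day of any year is always month 1, day 1.
    return d.replace(month=1, day=1)


def end_of_month(d):
    # monthrange() returns (weekday of the 1st, number of days in the month),
    # so the last day is looked up per month and per year (leap years included).
    last_day = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=last_day)


def end_of_year(d):
    # December always has 31 days, so the end of a year is fixed as well.
    return d.replace(month=12, day=31)
```

For example, `end_of_month(datetime.date(2016, 2, 10))` gives 29 February 2016, while the same call for 2017 gives 28 February.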

Please be sure to update with any other critical financial terminology; this seems like an important use case.

AgDude commented 7 years ago

Thanks @idpaterson for the quick response.

For the time being I decided to implement this outside of pdt. I would like to incorporate it at some point and hope to find the time. In case anyone else wants to take it on, here is what I am using:

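import datetime
import math
import re

from dateutil.relativedelta import relativedelta
# Assumption: PdtConstants refers to parsedatetime's Constants class.
from parsedatetime import Constants as PdtConstants

# local_date() is used below but is not shown in this comment; it presumably
# converts sourceTime (a date or datetime) to a date.

re_financial_date = r'\b(?P<day>\w+) +of +\b(?P<modifier>\w+)? *\b(?P<unit>\w+)'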

def beginning_of_quarter(date):
    month = int(math.ceil(date.month/3.0) * 3 - 2)
    return datetime.date(date.year, month, 1)

def parse_financial_date(dateString, sourceTime):
    """
    https://github.com/bear/parsedatetime/issues/206
    :param dateString:
    :param sourceTime: date or datetime object
    :return: date object
    """
    inner_modifiers = {
        'begin': 1,
        'start': 1,
        'end': lambda x: x, # will get inner_length
    }

    pdt_constants = PdtConstants()

    re_inner = r'^(\d+|{0}).*'.format('|'.join(inner_modifiers.keys()))

    dateString = dateString.lower()
    groups = re.search(re_financial_date, dateString).groupdict()

    inner_modifier_match = re.search(re_inner, groups.get('day') or '')
    if inner_modifier_match:
        try:
            inner_offset = inner_modifiers[inner_modifier_match.groups()[0]]
        except KeyError:
            inner_offset = int(inner_modifier_match.groups()[0])
    else:
        inner_offset = 0

    outer_offset = pdt_constants.Modifiers.get(groups['modifier'], 0)

    unit = groups.get('unit')

    today = local_date(sourceTime)

    if unit in pdt_constants.units['months']:
        outer_length = 1 # months
        base = today.replace(day=1) + relativedelta(months=outer_offset * outer_length)
        inner_length = pdt_constants.daysInMonth(base.month, base.year)

    elif unit == 'quarter':
        outer_length = 3
        base = beginning_of_quarter(today) + relativedelta(months=outer_offset * outer_length)
        inner_length = 0
        for m in range(3):
            inner_length += pdt_constants.daysInMonth(base.month + m, base.year)
    elif unit in pdt_constants.units['years']:
        outer_length = 12
        base = today.replace(day=1, month=1) + relativedelta(months=outer_offset * outer_length)
        inner_length = 365
        if base.year in pdt_constants._leapYears:
            inner_length += 1
    else:
        raise ValueError('Could not derive relative date from "%s"' % dateString)

    if callable(inner_offset):
        inner_offset = inner_offset(inner_length)

    return base + datetime.timedelta(days=inner_offset - 1)

AgDude commented 7 years ago

I thought of a couple more strings I would like to support. The function above already supports #205. I would also like to support things like:

Second month of last quarter
10th of third month of second quarter next year
Two quarters ago
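
As a rough sketch of the date arithmetic these strings would need (not the parsing itself), quarters can be indexed as integers so that shifting by any number of quarters is plain arithmetic. The function names are hypothetical, and reading "two quarters ago" as "the start of the quarter two quarters back" is only one possible interpretation.

```python
import datetime


def quarter_start(d, offset=0):
    """First day of the quarter `offset` quarters away from d's quarter."""
    # Number quarters consecutively (year * 4 + quarter index 0-3) so that
    # moving across year boundaries is handled by divmod.
    index = d.year * 4 + (d.month - 1) // 3 + offset
    year, quarter = divmod(index, 4)
    return datetime.date(year, quarter * 3 + 1, 1)


def month_of_quarter(d, nth_month, offset=0):
    """First day of the nth month (1-3) of the quarter `offset` quarters away."""
    if nth_month not in (1, 2, 3):
        raise ValueError("nth_month must be 1, 2 or 3")
    start = quarter_start(d, offset)
    # A quarter never crosses a year boundary, so replacing the month is safe.
    return start.replace(month=start.month + nth_month - 1)
```

With a source date of 15 February 2017, `quarter_start(d, -2)` gives 1 July 2016 for "two quarters ago", and `month_of_quarter(d, 2, -1)` gives 1 November 2016 for "second month of last quarter".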