lebrice / SimpleParsing

Simple, Elegant, Typed Argument Parsing with argparse
MIT License

date, datetime support #83

Closed mixilchenko closed 2 years ago

mixilchenko commented 2 years ago

How to use dates with simple parsing

Since the Python builtins dt.date and dt.datetime do not accept a string in their constructors, the only way to use dates right now is the pd.Timestamp class:

from dataclasses import dataclass
import pandas as pd
from simple_parsing import ArgumentParser

@dataclass
class A:
    date: pd.Timestamp

parser = ArgumentParser()
parser.add_arguments(A, 'a')
assert parser.parse_args(['--date', '20210924']).a == A(pd.Timestamp('20210924'))

But not everyone wants to install pandas as a dependency, so built-in date support would be helpful.

Datetimes can be specified in different formats:

Some problems and possible solutions

  1. dt.date has no strptime method

The postprocessor should check that the type is not a dt.datetime subclass and do something like

dt.datetime.strptime(raw_data, metadata.get('fmt', '%Y-%m-%d')).date()

  2. Datetimes are often used with timezones, and strptime doesn't handle them correctly. Timezones also have several implementations, so it is unclear which one to use during postprocessing

    >>> import datetime as dt
    >>> import zoneinfo  # Python 3.9+, or the backports.zoneinfo package
    >>> dt.datetime(2021, 9, 24, 5, 0, 0, tzinfo=zoneinfo.ZoneInfo('Europe/Moscow')).isoformat()
    '2021-09-24T05:00:00+03:00'
    >>> dt.datetime.fromisoformat('2021-09-24T05:00:00+03:00')
    datetime.datetime(2021, 9, 24, 5, 0, tzinfo=datetime.timezone(datetime.timedelta(seconds=10800)))

    You can see that a named timezone and a fixed-offset timezone are not the same thing, since many regions observe daylight saving time. This problem can be solved with a custom postprocessor (see below)

  3. There are lots of dt.date and dt.datetime subclasses (e.g. in pendulum and arrow)

The pendulum and arrow libraries have their own parse and get functions, respectively, for parsing strings. We could add an optional field argument, postprocessor, of type Optional[Callable[[str], T]]. This would open up a wide range of possibilities for simple_parsing users. Basic usage could be

@dataclass
class A:
    date: dt.date = field(postprocessor=lambda x: dt.datetime.strptime(x, '%Y%m%d').date())
    pdatetime: pendulum.DateTime = field(postprocessor=pendulum.parse)
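
The check described in problem 1 above can be sketched with the standard library alone. This is only an illustration under stated assumptions: `parse_temporal` and its `fmt` default are hypothetical names and are not part of simple_parsing.

```python
import datetime as dt


def parse_temporal(raw: str, target: type, fmt: str = "%Y-%m-%d"):
    """Parse `raw` with strptime; return a dt.datetime or a dt.date.

    Hypothetical helper: dt.date has no strptime, so plain dates are
    parsed as datetimes first and then truncated with .date().
    """
    parsed = dt.datetime.strptime(raw, fmt)
    if issubclass(target, dt.datetime):
        # dt.datetime is itself a dt.date subclass, so test it first.
        return parsed
    if issubclass(target, dt.date):
        return parsed.date()
    raise TypeError(f"unsupported target type: {target!r}")


print(parse_temporal("2021-09-24", dt.date))
print(parse_temporal("20210924 0500", dt.datetime, fmt="%Y%m%d %H%M"))
```

Note that this sketch always returns the builtin types; third-party subclasses such as pendulum.DateTime would still need their own parser.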

mixilchenko commented 2 years ago

@lebrice could you please take a look. What do you think about custom postprocessors?

lebrice commented 2 years ago

Hey @mixilchenko, why not just use the type argument to field? Doesn't that already do what you want?

import datetime as dt
from dataclasses import dataclass

from simple_parsing.helpers import field

def date_from_string(v: str) -> dt.date:
    ...  # whatever conversion you want to do

@dataclass
class Foo:
    before: dt.date = field(type=date_from_string)
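
One possible way to fill in the body of `date_from_string` is to try a short list of formats in turn. The sketch below uses plain argparse, which simple_parsing builds on, so that it runs without extra dependencies. The `DATE_FORMATS` tuple and the error message are assumptions made for illustration.

```python
import argparse
import datetime as dt

# Assumed set of accepted formats; adjust as needed.
DATE_FORMATS = ("%Y-%m-%d", "%Y%m%d")


def date_from_string(v: str) -> dt.date:
    """Return the first successful strptime parse of `v` as a date."""
    for fmt in DATE_FORMATS:
        try:
            return dt.datetime.strptime(v, fmt).date()
        except ValueError:
            continue
    # argparse turns ArgumentTypeError into a usage error for the user.
    raise argparse.ArgumentTypeError(f"not a recognised date: {v!r}")


parser = argparse.ArgumentParser()
parser.add_argument("--before", type=date_from_string)
args = parser.parse_args(["--before", "20210924"])
print(args.before)
```

The same callable should be usable as the `type` argument to `field` in the snippet above.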

mixilchenko commented 2 years ago

Thanks @lebrice! I hadn't found this. I think it covers some of my needs.

But I've found at least one downside to this approach. The --help output shows

--before date_from_string

or

--before <lambda>

instead of the real type

--before datetime.date

lebrice commented 2 years ago

@mixilchenko then you can pass in the metavar you want, the same way!
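
As a rough illustration of the metavar suggestion, here is the equivalent in plain argparse, which simple_parsing builds on. In simple_parsing the keyword would presumably be passed to `field` alongside `type`, as described above; that part is not shown here. The `demo` program name and the ISO-only converter are assumptions.

```python
import argparse
import datetime as dt


def date_from_string(v: str) -> dt.date:
    # Assumes ISO format (YYYY-MM-DD); date.fromisoformat needs Python 3.7+.
    return dt.date.fromisoformat(v)


parser = argparse.ArgumentParser(prog="demo")
action = parser.add_argument(
    "--before",
    type=date_from_string,
    metavar="datetime.date",  # label shown in --help for this option's value
)
help_text = parser.format_help()
print(help_text)
```

With the metavar set, the help text labels the option's value as datetime.date rather than a name derived from the converter or the option.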