InseeFr / Trevas

Transformation engine and validator for statistics.
MIT License

Implementation of temporal type and operators #316

Open hadrienk opened 3 months ago

hadrienk commented 3 months ago

Type system:

Notes: The manual points out that time is the root type and represents a time interval (start, end). It should support shifting, mutation of the start and end values, splitting, and so on.

The date is defined as a time with start = end.

The time period is a "non-overlapping time interval" with a "regular duration". The regular duration is only relevant for months (varying number of days) and days (timeshifts). It seems the intent is to represent duration in time, such as 2018W1 (first week of 2018) or 2020Q1 (first quarter of 2020).
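To make the notes above concrete, here is a minimal sketch of time as an interval, with date as the degenerate case start = end. The class `TimeInterval` and its method names are hypothetical and exist only for illustration; they come from neither Trevas nor the VTL manual, and only `java.time` is used.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration type; not part of Trevas or the VTL specification.
final class TimeInterval {
    final Instant start;
    final Instant end;

    TimeInterval(Instant start, Instant end) {
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end must not be before start");
        }
        this.start = start;
        this.end = end;
    }

    // A date modelled as an interval whose start equals its end.
    static TimeInterval ofDate(Instant instant) {
        return new TimeInterval(instant, instant);
    }

    boolean isDate() {
        return start.equals(end);
    }

    // Shift both bounds by the same amount; the length is preserved.
    TimeInterval shift(Duration amount) {
        return new TimeInterval(start.plus(amount), end.plus(amount));
    }

    // "Mutation" of a bound, expressed as a copy with a new end.
    TimeInterval withEnd(Instant newEnd) {
        return new TimeInterval(start, newEnd);
    }

    // Split at an instant inside the interval into two adjacent intervals.
    TimeInterval[] splitAt(Instant at) {
        if (at.isBefore(start) || at.isAfter(end)) {
            throw new IllegalArgumentException("split point is outside the interval");
        }
        return new TimeInterval[] {new TimeInterval(start, at), new TimeInterval(at, end)};
    }
}
```

The values are immutable, so shifting or changing a bound returns a new interval. Calendar-aware shifts (months, years) are deliberately left out of this sketch.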

hadrienk commented 3 months ago

Using threeten-extra (additional date-time classes that complement those in Java SE 8), the mapping can be as follows:

date -> java.time.Instant
time_period -> Interval
duration -> PeriodDuration

This simplifies parsing of the ISO 8601 format.

hadrienk commented 3 months ago

Test case:

d1 := cast("P1Y2M10DT2H30M", duration);
d2 := cast("P1Y15M2DT086401S", duration);
p1 := cast("2015-03-03T09:30:45Z/2018-04-05T12:30:15Z", time_period);
p2 := cast("2007-03-01T13:00:00Z/P1Y2M10DT2H30M", time_period);
p3 := cast("P1Y2M10DT2H30M/2008-05-11T15:30:00Z", time_period);

The truncated representation of periods is left unimplemented, as it seems to be deprecated.
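As a rough illustration of what the casts in the test case have to do, the sketch below parses the three interval forms (start/end, start/duration, duration/end) with plain `java.time`. The class `Iso8601Sketch` and its helpers are hypothetical; this is not the Trevas implementation, and it does not use threeten-extra. It assumes unsigned durations and performs calendar arithmetic in UTC.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.Period;
import java.time.ZoneOffset;

// Hypothetical helper for illustration; not Trevas or threeten-extra code.
final class Iso8601Sketch {

    // Holds the date-based part (years, months, days) and the time-based part separately.
    static final class Combined {
        final Period period;
        final Duration duration;

        Combined(Period period, Duration duration) {
            this.period = period;
            this.duration = duration;
        }
    }

    // Parses a combined duration such as "P1Y2M10DT2H30M" (unsigned forms only).
    static Combined parseDuration(String text) {
        int t = text.indexOf('T');
        if (t < 0) {
            return new Combined(Period.parse(text), Duration.ZERO);
        }
        String datePart = text.substring(0, t);
        Period period = datePart.equals("P") ? Period.ZERO : Period.parse(datePart);
        Duration duration = Duration.parse("P" + text.substring(t));
        return new Combined(period, duration);
    }

    // Parses "start/end", "start/duration" or "duration/end" into {start, end}.
    static Instant[] parseInterval(String text) {
        String[] parts = text.split("/");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected exactly one '/': " + text);
        }
        if (parts[0].startsWith("P")) {
            // duration/end: walk back from the end (assumption: calendar math in UTC).
            Combined d = parseDuration(parts[0]);
            Instant end = Instant.parse(parts[1]);
            Instant start = end.atZone(ZoneOffset.UTC).minus(d.period).minus(d.duration).toInstant();
            return new Instant[] {start, end};
        }
        Instant start = Instant.parse(parts[0]);
        if (parts[1].startsWith("P")) {
            // start/duration: walk forward from the start.
            Combined d = parseDuration(parts[1]);
            Instant end = start.atZone(ZoneOffset.UTC).plus(d.period).plus(d.duration).toInstant();
            return new Instant[] {start, end};
        }
        return new Instant[] {start, Instant.parse(parts[1])};
    }
}
```

Under these assumptions, p2 and p3 from the test case resolve to the same interval, 2007-03-01T13:00:00Z to 2008-05-11T15:30:00Z.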

hadrienk commented 2 months ago

The temporal functions are defined in the grammar as follows:

timeOperators:
    PERIOD_INDICATOR LPAREN expr? RPAREN                                                                                                # periodAtom
    | FILL_TIME_SERIES LPAREN expr (COMMA (SINGLE|ALL))? RPAREN                                                                         # fillTimeAtom
    | op=(FLOW_TO_STOCK | STOCK_TO_FLOW) LPAREN expr RPAREN                                                                             # flowAtom
    | TIMESHIFT LPAREN expr COMMA signedInteger RPAREN                                                                                  # timeShiftAtom
    | TIME_AGG LPAREN periodIndTo=STRING_CONSTANT (COMMA periodIndFrom=(STRING_CONSTANT| OPTIONAL ))? (COMMA op=optionalExpr)? (COMMA (FIRST|LAST))? RPAREN     # timeAggAtom
    | CURRENT_DATE LPAREN RPAREN

This turns what would otherwise be normal function invocations into special cases with untyped parameters.

hadrienk commented 2 months ago

In order to handle time_agg, the group all clause needs to be implemented. The group all clause only supports one expression. This is surprising, as one might want to use other expressions to adjust the groups (concat, split, math, etc.).

Even worse, what should be the name of the new column that the group all clause uses? Using a convention here would require evaluating the expression itself.
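For context, grouping by a time_agg expression amounts to mapping each date to the period that contains it and aggregating per period. The sketch below is only an illustration: the class `TimeAggSketch`, the supported indicators ("A", "Q", "M"), and the label format (for example "2020Q1") are assumptions, not Trevas behaviour or something taken from the VTL manual.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of grouping by a derived period; not Trevas code.
final class TimeAggSketch {

    // Maps a date to the label of the period containing it (label format is an assumption).
    static String periodKey(LocalDate date, String periodIndTo) {
        int month = date.getMonthValue();
        switch (periodIndTo) {
            case "A":
                return String.valueOf(date.getYear());
            case "Q":
                return date.getYear() + "Q" + ((month - 1) / 3 + 1);
            case "M":
                return date.getYear() + "M" + (month < 10 ? "0" : "") + month;
            default:
                throw new IllegalArgumentException("unsupported period indicator: " + periodIndTo);
        }
    }

    // Sums the observations that fall into the same period.
    static Map<String, Double> aggregate(Map<LocalDate, Double> observations, String periodIndTo) {
        Map<String, Double> totals = new TreeMap<>();
        for (Map.Entry<LocalDate, Double> entry : observations.entrySet()) {
            totals.merge(periodKey(entry.getKey(), periodIndTo), entry.getValue(), Double::sum);
        }
        return totals;
    }
}
```

The derived key here has no column name at all, which is exactly the naming question raised above: the name has to come from somewhere other than the expression.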