substrait-io / substrait

A cross platform way to express data transformation, relational algebra, standardized record expression and plans.
https://substrait.io
Apache License 2.0

Window bound offset field is too restrictive for RANGE mode #617

Open vbarua opened 3 months ago

vbarua commented 3 months ago

The current definitions of Preceding and Following Window Bounds both declare the offset as

int64 offset = 1;

with a note

A strictly positive integer specifying the number of records that the window extends back from the current record. Required.

This is true for the ROWS and GROUPS modes (the latter of which is not yet supported in Substrait). However, RANGE mode allows more than just integer offsets. As described in the Postgres docs:

In RANGE mode, these options require that the ORDER BY clause specify exactly one column. The offset specifies the maximum difference between the value of that column in the current row and its value in preceding or following rows of the frame. The data type of the offset expression varies depending on the data type of the ordering column. For numeric ordering columns it is typically of the same type as the ordering column, but for datetime ordering columns it is an interval.
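To make the quoted behaviour concrete, here is a minimal Python sketch of how a RANGE frame could be computed. It is not Substrait or Postgres code: `range_frame` and the sample data are hypothetical names chosen for illustration, and the sketch assumes a single ORDER BY column sorted ascending with no NULLs. The point is that the offset has to be subtracted from or added to the ordering value, so its type follows the ordering column (a float here, a `timedelta` for timestamps) rather than being a row count.

```python
from datetime import datetime, timedelta


def range_frame(values, current_index, preceding, following):
    """Return the indices of the rows in the RANGE frame of one row.

    `values` is the single ORDER BY column (assumed ascending, no NULLs).
    A row belongs to the frame when its value lies in
    [current - preceding, current + following].
    """
    current = values[current_index]
    low = current - preceding
    high = current + following
    return [i for i, v in enumerate(values) if low <= v <= high]


# Numeric ordering column: the offset is a non-integer number.
prices = [1.0, 1.5, 2.25, 4.0]
numeric_frame = range_frame(prices, 2, preceding=0.75, following=0.0)

# Datetime ordering column: the offset is an interval.
timestamps = [
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 1, 0, 30),
    datetime(2024, 1, 1, 2, 0),
]
interval_frame = range_frame(
    timestamps, 1, preceding=timedelta(hours=1), following=timedelta(0)
)

print(numeric_frame)   # rows whose price is within 0.75 below the current price
print(interval_frame)  # rows within the hour before the current timestamp
```

Neither `0.75` nor a one-hour interval can be represented by an `int64` offset, which is the gap this issue describes.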

The offset field should be reworked so that bounds in RANGE mode can be fully described, which in practice means supporting all numeric types plus intervals.

Sources: