partiql / partiql-tests

Test data for checking conformance to the PartiQL specification.
Apache License 2.0
7 stars 1 forks source link

Clarify `EXTRACT` semantics in tests #83

Open alancai98 opened 1 year ago

alancai98 commented 1 year ago

Need to clarify some of the EXTRACT tests ported from partiql-lang-kotlin related to:

  1. Output type for extracted date/time part values.
    • partiql-lang-kotlin uses decimals
  2. Default values for extracting date parts from time values (e.g. EXTRACT(YEAR FROM TIME '12:34')) -- NULL, MISSING, or some default
    • partiql-lang-kotlin defaults to erroring in legacy mode, missing in permissive
  3. Default values for extracting time parts from date parts (e.g. EXTRACT(HOUR FROM DATE '2020-01-01')) -- NULL, MISSING, or some default
    • partiql-lang-kotlin defaults to 0. in both typing modes

This may also warrant a spec issue.

alancai98 commented 1 year ago

Regarding 1., Output type for extracted date/time part values the SQL specs have varied descriptions for what type is to be used for extracted values.


SQL:2011 states:

The declared type of <extract expression> is an implementation-defined exact numeric type. If <primary
datetime field> specifies SECOND, then the scale is implementation-defined; otherwise, the scale is
0 (zero).

where an exact numeric type is defined as follows:

The data types NUMERIC, DECIMAL, SMALLINT, INTEGER, and BIGINT are collectively referred
to as exact numeric types.

So from SQL:2011, it appears to be implementation-defined.


SQL:1992 states:

<extract expression> (see Subclause 4.5.3, "Operations involving
            datetimes and intervals") operates on a datetime or interval
            argument and returns an integer.

but later states:

<extract expression> operates on a datetime or interval and returns
         an exact numeric value representing the value of one component of
         the datetime or interval.

and

If <extract expression> is specified, then

            Case:

            a) If <datetime field> does not specify SECOND, then the data
              type of the result is exact numeric with implementation-
              defined precision and scale 0.

            b) Otherwise, the data type of the result is exact numeric
              with implementation-defined precision and scale. The
              implementation-defined scale shall not be less than the spec-
              ified or implied <time fractional seconds precision> or <in-
              terval fractional seconds precision>, as appropriate, of the
              SECOND <datetime field> of the <extract source>.

I'm not sure if PartiQL should define the output type on its own or define it to be an implementation detail. To get around this, we can define the EXTRACT tests to assert on the result in the statement using the PartiQL = operator.

For example a test as follows:

{
  ...
  statement:"EXTRACT(YEAR FROM DATE '2023-04-13')",
  assert:{
    ...
    output:2023
  }
},

could be rewritten to:

{
  ...
  statement:"EXTRACT(YEAR FROM DATE '2023-04-13') = 2023",
  assert:{
    ...
    output:true
  }
},

This is test rewrite is captured in: https://github.com/partiql/partiql-tests/pull/84.

alancai98 commented 1 year ago

Regarding 2, the SQL specs don't discuss EXTRACT with a date part on a time value. Intuitively it may make sense to give an error in strict mode and MISSING in permissive mode. The strict mode behavior is what other databases seem to do (e.g. postgresql, mysql). For now, these test will be included in the conformance tests.

Regarding 3, the SQL specs don't discuss EXTRACT with a time part on a date value. We could give a default time (or implicitly cast the date to a timestamp). This behavior doesn't seems as clear as some databases give an error while others assume the default. For now, these tests will be omitted from the conformance tests.