partiql / partiql-lang

The PartiQL language specification
https://partiql.org/partiql-lang
Other
10 stars 1 forks source link

Aggregation functions in permissive mode #17

Open vgapeyev opened 1 year ago

vgapeyev commented 1 year ago

Aggregation functions behavior on mistyped cases in permissive mode -- needs to be verified or even specified.

See, e.g. EvaluatingCompiler.builtinAggregates.checkIsNumberType -- currently, in SUM and AVG, a mistyped element in an aggregated collection always causes an error, even in the permissive mode.

alancai98 commented 1 year ago

Followup from discussion with @almann.

Strict typing mode behavior should still error for the SQL aggregates. Permissive typing mode should never error but coerce to missing in the case of data type mismatches.

The COLL_* functions should follow the similar behavior in the case of mistyped elements.

This behavior aligns with the arithmetic operators (+, *, /, etc.) for both typing modes.

alancai98 commented 1 year ago

From this conformance test comment https://github.com/partiql/partiql-tests/pull/80#pullrequestreview-1393035547, we clarified the behavior further for COLL_ functions which implies the SQL aggregation function:

To recap our in-person discussions, the rules we settled on for computing COLL_X( c ) are like this:

  • COLL_X(NULL) is NULL, by the general rule of NULL propagating over a function call.
  • COLL_X(MISSING) is MISSING, by the general rule of MISSING propagating over a function call.
  • COLL_X expects a collection, so if c is not a collection, the result is a type error in strict mode and MISSING in permissive mode.
  • Otherwise, when c is a collection, let c' be its residual with all NULL and MISSING elements removed. Then, c' must contain only elements that are of the appropriate type for function X. (When X is SUM or AVG the appropriate types are numeric, when X is SOME, ANY, EVERY, the type is boolean.) So,

    • If c' contains an element of an inappropriate type, the result is a type error in strict mode and MISSING in permissive mode.
    • If c' only contains elements of appropriate types, the result is computed as expected for X.