FirebirdSQL / firebird

Firebird server, client and tools
https://www.firebirdsql.org/
1.23k stars 212 forks source link

Support ordered-set (WITHIN GROUP) aggregates. #7632

Open sim1984 opened 1 year ago

sim1984 commented 1 year ago

Need to add support for ordered-set and hypothetical-set aggregate functions, as well as implementations of the instances defined in SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(), percent_rank(), cume_dist()). It is good to added mode() though it is not in the spec. In addition, you can gracefully solve the problem of element unordering in the LIST aggregate function (in the SQL standard LISTAGG).

Ordered-Set Aggregate Functions

MODE() WITHIN GROUP ( ORDER BY <order_expr>)
PERCENTILE_CONT( <expr> ) WITHIN GROUP ( ORDER BY <order_expr> )
PERCENTILE_DISC( <expr> ) WITHIN GROUP ( ORDER BY <order_expr>)
LIST(<expr>, <separator>) WITHIN GROUP(ORDER BY <order_expr>)

It is also possible to introduce the LISTAGG function, and make LIST an alias for compatibility with the standard.

Hypothetical-Set Aggregate Functions

RANK( <args> ) WITHIN GROUP ( ORDER BY <order_exprs> )
DENSE_RANK( <args> ) WITHIN GROUP ( ORDER BY <order_exprs> )
PERCENT_RANK( <args> ) WITHIN GROUP ( ORDER BY <order_exprs> )
CUME_DIST( <expr> ) WITHIN GROUP ( ORDER BY <order_exprs> )

You can read more about how this is done in PostgreSQL here. https://www.postgresql.org/docs/current/functions-aggregate.html https://paquier.xyz/postgresql-2/postgres-9-4-feature-highlight-within-group/

dyemanov commented 8 months ago

If LISTAGG is added for standard compatibility, should legacy LIST also support ordering? If yes, should it follow the standard syntax (so that LIST and LISTAGG just become synonyms) or, being non-standard, may use its own syntax for ordering, e.g. LIST(FIELD1 ORDER BY FIELD2)?

sim1984 commented 8 months ago

I think it's enough to make it just a synonym. But if they make their own version to simplify, then do not object.

dyemanov commented 8 months ago

The standard LISTAGG also has the separator string being mandatory while our LIST implementation has it optional, with a comma implied if omitted. And the standard LISTAGG defines the separator as a literal only, while we allow arbitrary value expressions. Would it be OK to preserve these differences for the LISTAGG too?

sim1984 commented 8 months ago

Limiting the delimiter to only literal values is not very correct in my opinion. But sorting for LISTAGG must be made mandatory. In principle, the separator can be made mandatory (there are doubts here). Does anyone have other opinions?

dyemanov commented 8 months ago

Sorting may be mandatory in LISTAGG (as per SQL standard), but it must be optional (or missing completely) in LIST due to backward compatibility.

dyemanov commented 8 months ago

Oracle supports LISTAGG in mostly standard way (without <listagg overflow clause>), but it has optional separator. So maybe we should preserve this syntax.

sim1984 commented 8 months ago

Agree. In this case, we have full compatibility with the current LIST function, with the exception of the mandatory sorting. For LIST, sorting can be left optional for backward compatibility reasons. Please note in the documentation that in new projects it is recommended to use the standard LISTAGG.