apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

[Epic]: Improved Support for order aware aggregates #8583

Open alamb opened 8 months ago

alamb commented 8 months ago

Is your feature request related to a problem or challenge?

Order-aware aggregates are important in streaming use cases. The goal is to support multiple order-aware normal (as opposed to window) aggregates, such as various N'th value functions, rank functions, and time series functions. This applies both to built-in functions and to user-defined aggregates.

At the time of writing, DataFusion supports three aggregate functions that can be "order aware": ARRAY_AGG, FIRST_VALUE and LAST_VALUE. This means that you can supply an ORDER BY clause inside their argument list, for example FIRST_VALUE(x ORDER BY time).
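
To make the intended semantics concrete, here is a rough sketch in plain Python of what FIRST_VALUE(x ORDER BY time), LAST_VALUE(x ORDER BY time) and ARRAY_AGG(x ORDER BY time) compute per group. This is not how DataFusion implements them; the function name `order_aware_aggregates`, the `(group, time, x)` row layout and the sample data are assumptions made up for illustration, and NULL handling is ignored.

```python
from collections import defaultdict


def order_aware_aggregates(rows):
    """Emulate order-aware aggregates over (group, time, x) rows.

    Hypothetical helper for illustration only; it is not part of DataFusion.
    """
    # Collect the (time, x) pairs that belong to each group.
    groups = defaultdict(list)
    for group, time, x in rows:
        groups[group].append((time, x))

    result = {}
    for group, pairs in groups.items():
        # Apply the ORDER BY inside the aggregate: sort by time only.
        # sorted() is stable, so rows with equal times keep their input order.
        ordered = [x for _, x in sorted(pairs, key=lambda pair: pair[0])]
        result[group] = {
            "first_value": ordered[0],   # FIRST_VALUE(x ORDER BY time)
            "last_value": ordered[-1],   # LAST_VALUE(x ORDER BY time)
            "array_agg": ordered,        # ARRAY_AGG(x ORDER BY time)
        }
    return result


# Input rows arrive out of order, as they might from an unsorted source.
rows = [
    ("a", 3, 30),
    ("a", 1, 10),
    ("a", 2, 20),
    ("b", 5, 50),
    ("b", 4, 40),
]
result = order_aware_aggregates(rows)
```

The point of the sketch is that the result depends on the ordering requested inside the aggregate, not on the order in which rows happen to arrive. A real engine can avoid the per-group sort entirely when the input is already known to be ordered on time.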

This Epic captures plans related to improving support for order aware aggregates.

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

alamb commented 7 months ago

Filed https://github.com/apache/arrow-datafusion/issues/8984 for user-defined order-aware aggregates.