apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0
6.16k stars 1.16k forks source link

SQL Logical Plan Drops ordering within UDAFs #7531

Open jacksonrnewhouse opened 1 year ago

jacksonrnewhouse commented 1 year ago

Describe the bug

SQL supports ordering within an aggregate function, e.g. SUM(a order by b desc). DataFusion's logical planner does correctly identify this for built-in aggregates, e.g. sum(), max(), min(). However, for UDAFs, it seems to be dropped.

I've written up an example here: https://github.com/ArroyoSystems/arrow-datafusion/blob/92e92b9e971bc407b39de2ce8c9bc793355168f7/datafusion-examples/examples/simple_udaf.rs#L174.

The output is

built-in aggregate Logical plan, has ORDER BY:
Projection: SUM(t.a) ORDER BY [t.a DESC NULLS FIRST]
  Aggregate: groupBy=[[]], aggr=[[SUM(t.a) ORDER BY [t.a DESC NULLS FIRST]]]
    TableScan: t
UDAF Logical plan, missing ORDER BY:
Projection: geo_mean(t.a)
  Aggregate: groupBy=[[]], aggr=[[geo_mean(t.a)]]
    TableScan: t

To Reproduce

Register a UDAF, write a query with ordering inside it, and inspect the logical plan.

Expected behavior

The UDAF expression should have a non-empty order_by clause.

Additional context

No response

alamb commented 1 year ago

Thank you for the report @jacksonrnewhouse -- I hope to investigate more fully at some point. Any additional debugging insight you have would be most helpful

jacksonrnewhouse commented 1 year ago

Okay, I think I've found where this is happening and come up with a patch at https://github.com/ArroyoSystems/arrow-datafusion/pull/new/bugfix_7531. Where should I be writing a test for this?

alamb commented 1 year ago

Nice find @jacksonrnewhouse !

Where should I be writing a test for this?

I would suggest somewhere in https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/tests/user_defined/user_defined_aggregates.rs

alamb commented 2 months ago

I think we may be close to / done fixing this as part of https://github.com/apache/datafusion/issues/8708