Open alexwilcoxson-rel opened 1 year ago
I think it makes sense to let UDAFs support multiple column inputs. There are already built-in aggregate functions, such as correlation and covariance, that support multi-column input.
I believe this was fixed in https://github.com/apache/arrow-datafusion/pull/7096
Can you confirm @alexwilcoxson-rel ?
@alamb this fixes our initial use case of just needing to provide multiple inputs. There still looks to be an issue with the latest code on main where you can't create a struct and pass it as a single argument to a UDAF, e.g. SELECT my_udaf(struct(col_A, col_B))
This was just our workaround though and is more of an edge case IMO.
So perhaps just keep it with the other "improve struct" issues.
Yes, that makes sense to me -- it is probably worth making a new ticket for just that use case (especially since this ticket has such a nice reproducer) ❤️
Describe the bug
We have a use case to provide multiple column values to a UDAF. UDAFs support one column input (unless I'm mistaken, I'm looking at this supporting one input data type). This has been resolved by #7096.
To work around this we tried packing the columns into a struct column and passing that as input into the UDAF, but we're seeing an error with both the SQL API struct() builtin and the Expr API BuiltInScalarFunction::Struct.
To Reproduce
Run the tests below and see the following output.
Failures
Table (cargo test -- --nocapture shows the table)
Tests
Expected behavior
We are able to create a struct with multiple fields using the SQL API struct() builtin or the Expr API's BuiltInScalarFunction::Struct and provide that as input to a UDAF.
Additional context
The UDAF here is very simple, just as an example.
Is there a limitation with UDAF or could we open an enhancement request to support multiple input columns?
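For readers unfamiliar with what a multi-input aggregate needs, here is a minimal, dependency-free sketch of a two-column accumulator in the style of the covariance aggregate mentioned in the thread. It is not DataFusion's Accumulator trait or the UDAF from this issue: the type CovarianceAccumulator and its methods update_batch, merge, and evaluate are hypothetical names chosen for illustration, and plain f64 slices stand in for Arrow arrays.

```rust
/// Hypothetical stand-in for a two-input accumulator.
/// This is NOT DataFusion's `Accumulator` trait; names are illustrative only.
#[derive(Debug, Default, Clone)]
struct CovarianceAccumulator {
    count: u64,
    mean_x: f64,
    mean_y: f64,
    /// Running sum of (x - mean_x) * (y - mean_y).
    co_moment: f64,
}

impl CovarianceAccumulator {
    /// Consume one batch: two equal-length column slices, one per argument.
    fn update_batch(&mut self, xs: &[f64], ys: &[f64]) {
        assert_eq!(xs.len(), ys.len(), "input columns must have equal length");
        for (&x, &y) in xs.iter().zip(ys) {
            self.count += 1;
            let n = self.count as f64;
            // Online update: dx uses the old mean_x, the product uses the new mean_y.
            let dx = x - self.mean_x;
            self.mean_x += dx / n;
            self.mean_y += (y - self.mean_y) / n;
            self.co_moment += dx * (y - self.mean_y);
        }
    }

    /// Combine partial state computed on another partition.
    fn merge(&mut self, other: &Self) {
        if other.count == 0 {
            return;
        }
        if self.count == 0 {
            *self = other.clone();
            return;
        }
        let n1 = self.count as f64;
        let n2 = other.count as f64;
        let n = n1 + n2;
        let dx = other.mean_x - self.mean_x;
        let dy = other.mean_y - self.mean_y;
        self.co_moment += other.co_moment + dx * dy * n1 * n2 / n;
        self.mean_x += dx * n2 / n;
        self.mean_y += dy * n2 / n;
        self.count += other.count;
    }

    /// Sample covariance; `None` when fewer than two rows were seen.
    fn evaluate(&self) -> Option<f64> {
        if self.count < 2 {
            None
        } else {
            Some(self.co_moment / (self.count as f64 - 1.0))
        }
    }
}
```

The point of the sketch is that each batch delivers one slice per argument, so the accumulator never needs the columns packed into a single struct value. The struct workaround described above would instead hand the accumulator one column of (x, y) rows.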