substrait-io / substrait-java

Apache License 2.0

Take `sorts` into account inside aggregate functions #279

Open gabotechs opened 3 months ago

gabotechs commented 3 months ago

Aggregate functions can take an inner ORDER BY clause that sorts the underlying data before the aggregation, for example:

SELECT string_agg("foo", ', ' ORDER BY "foo" DESC) FROM "tbl"

In Substrait, this is represented as a `sorts` field inside an aggregate function's measure. This PR adds support for loading that field in ProtoTypeConverter.
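
To make the semantics concrete, here is a minimal sketch in plain Java of what the inner ORDER BY means for the query above: the values are sorted first and only then aggregated. It does not use any substrait-java API; the class name `OrderedStringAgg` and the method `stringAggDesc` are made up for illustration, and null handling is ignored.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: hypothetical names, not part of substrait-java.
public class OrderedStringAgg {

    // Sorts the input values in descending order and then joins them,
    // mirroring string_agg("foo", ', ' ORDER BY "foo" DESC).
    public static String stringAggDesc(List<String> values, String separator) {
        return values.stream()
                .sorted(Comparator.reverseOrder())
                .collect(Collectors.joining(separator));
    }

    public static void main(String[] args) {
        List<String> foo = Arrays.asList("b", "c", "a");
        System.out.println(stringAggDesc(foo, ", ")); // prints "c, b, a"
    }
}
```

If the sort information is dropped while loading a plan, the aggregate would still run, but the order of the concatenated values would no longer be guaranteed.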

CLAassistant commented 3 months ago

CLA assistant check
All committers have signed the CLA.

gabotechs commented 3 months ago

I'm unsure how to contribute tests for this. Any guidance there is welcome.

vbarua commented 3 months ago

For testing this, I would suggest adding a sort to the aggregate roundtrip test in https://github.com/substrait-io/substrait-java/blob/3e553eee981feb11a64b6c2fef6daf1fe377945a/core/src/test/java/io/substrait/type/proto/AggregateRoundtripTest.java to make sure that sorts can be read from protos and output to protos.
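
As a rough sketch of the property such a roundtrip test checks, the example below converts a measure from a wire form to an in-memory form and back, then verifies that the sorts survive. Every type and method here (`WireMeasure`, `ModelMeasure`, `fromWire`, `toWire`, `roundtrips`) is a hypothetical stand-in written for this illustration; none of them is the actual substrait-java API or the contents of AggregateRoundtripTest.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustration only: all types below are hypothetical stand-ins.
public class SortRoundtripSketch {

    // Stand-in for the serialized (proto-like) form of a measure.
    public static final class WireMeasure {
        public final String function;
        public final List<String> sorts; // e.g. "foo DESC"

        public WireMeasure(String function, List<String> sorts) {
            this.function = function;
            this.sorts = Collections.unmodifiableList(new ArrayList<>(sorts));
        }
    }

    // Stand-in for the in-memory form of a measure.
    public static final class ModelMeasure {
        public final String function;
        public final List<String> sorts;

        public ModelMeasure(String function, List<String> sorts) {
            this.function = function;
            this.sorts = Collections.unmodifiableList(new ArrayList<>(sorts));
        }
    }

    // Reads the wire form; a converter that ignored sorts would pass an empty list here.
    public static ModelMeasure fromWire(WireMeasure wire) {
        return new ModelMeasure(wire.function, wire.sorts);
    }

    // Writes the in-memory form back to the wire form.
    public static WireMeasure toWire(ModelMeasure model) {
        return new WireMeasure(model.function, model.sorts);
    }

    // True when both the function name and the sorts survive wire -> model -> wire.
    public static boolean roundtrips(WireMeasure original) {
        WireMeasure back = toWire(fromWire(original));
        return back.function.equals(original.function) && back.sorts.equals(original.sorts);
    }

    public static void main(String[] args) {
        WireMeasure measure = new WireMeasure("string_agg", Arrays.asList("foo DESC"));
        System.out.println(roundtrips(measure)); // prints "true"
    }
}
```

The check compares the sorts explicitly, so a reader that silently dropped them would make `roundtrips` return false for any measure with a non-empty sort list.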