ArroyoSystems / arroyo

Distributed stream processing engine in Rust
https://arroyo.dev
Apache License 2.0

Re-add support for joins on structs by rewriting expression #664

Closed by mwylde 5 months ago

mwylde commented 5 months ago

This PR re-adds support for joining on struct fields (for example on a window), working around https://github.com/apache/datafusion/issues/9254.

Ultimately, this is the responsibility of arrow-ord, which should be able to compare structs as it does other datatypes. Rather than solving the whole problem there, I've opted to work around the issue by rewriting each struct equality expression into an AND of equality expressions on the individual struct fields.

So for example,

ON A.window = B.window

gets rewritten into

ON A.window.start = B.window.start AND A.window.end = B.window.end

Currently, this rewrite is limited to structs with a single level of nesting; unsupported expressions produce a plan-time error.
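
To make the shape of the rewrite concrete, here is a minimal, self-contained sketch in plain Rust. It is not the code from this PR and does not use DataFusion's types: `DataType`, `Expr`, `col`, `field`, `window_type`, and `rewrite_struct_eq` are all hypothetical stand-ins invented for illustration. It assumes only column-to-column comparisons and rejects nested structs, mirroring the single-level limitation described above.

```rust
use std::fmt;

// Hypothetical toy type system: a value is either a scalar or a struct of named fields.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Scalar,
    Struct(Vec<(String, DataType)>),
}

// Hypothetical toy expression tree (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column { name: String, data_type: DataType },
    Field { base: Box<Expr>, field: String },
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

// Render expressions in a SQL-like form so the rewrite is easy to inspect.
impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Column { name, .. } => write!(f, "{}", name),
            Expr::Field { base, field } => write!(f, "{}.{}", base, field),
            Expr::Eq(l, r) => write!(f, "{} = {}", l, r),
            Expr::And(l, r) => write!(f, "{} AND {}", l, r),
        }
    }
}

fn col(name: &str, data_type: DataType) -> Expr {
    Expr::Column { name: name.to_string(), data_type }
}

fn field(base: &Expr, name: &str) -> Expr {
    Expr::Field { base: Box::new(base.clone()), field: name.to_string() }
}

// A window-like struct with two scalar fields, `start` and `end`.
fn window_type() -> DataType {
    DataType::Struct(vec![
        ("start".to_string(), DataType::Scalar),
        ("end".to_string(), DataType::Scalar),
    ])
}

// Rewrite `left = right` into an AND of per-field equalities when both sides
// are single-level structs; return an error (the "plan-time error") otherwise.
fn rewrite_struct_eq(left: &Expr, right: &Expr) -> Result<Expr, String> {
    let (l_ty, r_ty) = match (left, right) {
        (Expr::Column { data_type: l, .. }, Expr::Column { data_type: r, .. }) => (l, r),
        _ => return Err("only column = column comparisons are supported".to_string()),
    };
    match (l_ty, r_ty) {
        // Scalars need no rewriting.
        (DataType::Scalar, DataType::Scalar) => {
            Ok(Expr::Eq(Box::new(left.clone()), Box::new(right.clone())))
        }
        (DataType::Struct(l_fields), DataType::Struct(r_fields)) => {
            if l_fields != r_fields {
                return Err("struct types differ".to_string());
            }
            let mut conjuncts = Vec::new();
            for (name, ty) in l_fields {
                // Only a single level of nesting is handled.
                if *ty != DataType::Scalar {
                    return Err(format!("nested struct field `{}` is not supported", name));
                }
                conjuncts.push(Expr::Eq(
                    Box::new(field(left, name)),
                    Box::new(field(right, name)),
                ));
            }
            // Fold the per-field equalities into a left-nested chain of ANDs.
            conjuncts
                .into_iter()
                .reduce(|acc, e| Expr::And(Box::new(acc), Box::new(e)))
                .ok_or_else(|| "cannot compare empty structs".to_string())
        }
        _ => Err("cannot compare a struct with a scalar".to_string()),
    }
}
```

Calling `rewrite_struct_eq(&col("A.window", window_type()), &col("B.window", window_type()))` in this sketch yields an expression that displays as the rewritten join condition shown earlier, while a struct containing another struct returns an `Err`.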