apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Implement group join #13243

Open Dandandan opened 4 days ago

Dandandan commented 4 days ago

Is your feature request related to a problem or challenge?

From https://www.vldb.org/pvldb/vol4/p843-moerkotte.pdf

A group join executes a join followed by a group by on the same columns using a single hash table (the one built for the hash join), instead of building one hash table for the join and a second one for the aggregation.

An example query from the paper's introduction:

select a, count(*)
from R1 left outer join R2 on R1.a = R2.b
where R1.c = 5
group by a
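
To make the idea concrete, here is a minimal Python sketch of how the example query could be evaluated with one hash table. It is only an illustration under my own assumptions, not the algorithm as written in the paper and not DataFusion code: the function name `group_join_count`, the row layout (R1 as `(a, c)` tuples, R2 as a list of `b` values), and the use of `None` for SQL NULL are all hypothetical.

```python
def group_join_count(r1, r2):
    """Evaluate the example query with a single hash table (illustrative only).

    r1: iterable of (a, c) tuples standing in for table R1 (assumed layout).
    r2: iterable of b values standing in for table R2 (assumed layout).
    Returns a dict mapping each group key a to count(*).
    """
    # Build phase: one table keyed on R1.a, filled after applying R1.c = 5.
    # Each entry holds [number of R1 rows with this key, number of R2 matches].
    table = {}
    for a, c in r1:
        if c == 5:
            table.setdefault(a, [0, 0])[0] += 1

    # Probe phase: every R2 row updates the aggregate state of its group in
    # place, so no joined rows are materialized. A NULL b never matches.
    for b in r2:
        if b is not None and b in table:
            table[b][1] += 1

    # Left outer join semantics: an R1 row without a match still produces one
    # output row, so each group contributes left_rows * max(matches, 1).
    return {a: left * max(matched, 1) for a, (left, matched) in table.items()}


r1 = [(1, 5), (1, 5), (2, 5), (3, 4), (4, 5), (None, 5)]
r2 = [1, 1, 1, 2, None, 7]
result = group_join_count(r1, r2)
```

With this input, key 1 has two R1 rows and three R2 matches, so its count is 6. Key 2 has one match. Key 4 and the NULL key have no match and still count 1 each because of the outer join. The row with c = 4 is filtered out before the build.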

Describe the solution you'd like

Implement group join in execution and in the (logical) planner.

Describe alternatives you've considered

No response

Additional context

No response

Dandandan commented 4 days ago

@Lordworms