apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Implement group join #13243

Open Dandandan opened 2 weeks ago

Dandandan commented 2 weeks ago

Is your feature request related to a problem or challenge?

From https://www.vldb.org/pvldb/vol4/p843-moerkotte.pdf

A group join uses a single hash table to execute a join followed by a group by on the same columns, instead of building one hash table for the join and a second one for the aggregation.

An example query from the paper's introduction:

select a,count(*)
from R1 left outer join R2 on R1.a = R2.b
where R1.c=5
group by a
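
To make the idea concrete, here is a minimal sketch of how the query above could be evaluated with one hash table. It is not DataFusion code: `GroupState` and `group_join_count` are hypothetical names, rows are plain integer tuples, and NULL keys are ignored. The table is built on the filtered R1 rows, probed with R2, and the count is derived from the per-group state at the end.

```rust
use std::collections::HashMap;

/// Per-group state stored in the single shared hash table (hypothetical type).
#[derive(Default)]
struct GroupState {
    /// Number of R1 rows with this key that passed the filter `c = 5`.
    left_rows: u64,
    /// Number of R2 rows whose `b` equals this key.
    right_matches: u64,
}

/// Evaluates the example query with one hash table.
/// R1 rows are `(a, c)` pairs; R2 rows are just the `b` values.
fn group_join_count(r1: &[(i64, i64)], r2: &[i64]) -> Vec<(i64, u64)> {
    let mut table: HashMap<i64, GroupState> = HashMap::new();

    // Build phase: one entry per distinct `a` among the filtered R1 rows.
    for &(a, c) in r1 {
        if c == 5 {
            table.entry(a).or_default().left_rows += 1;
        }
    }

    // Probe phase: R2 rows only update aggregate state, no join output is materialized.
    for b in r2 {
        if let Some(state) = table.get_mut(b) {
            state.right_matches += 1;
        }
    }

    // Finalize: each R1 row yields one output row per match, or exactly one
    // NULL-padded row when there is no match (left outer join semantics).
    let mut out: Vec<(i64, u64)> = table
        .into_iter()
        .map(|(a, s)| (a, s.left_rows * s.right_matches.max(1)))
        .collect();
    out.sort();
    out
}
```

For example, with R1 = `[(1, 5), (1, 5), (2, 5), (3, 7)]` and R2 = `[1, 1, 1, 4]`, key 1 has two R1 rows and three matches (count 6), key 2 has no match (count 1), and key 3 is removed by the filter.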

Describe the solution you'd like

Implement group join in execution and in the (logical) planner.
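
On the planner side, one possible shape for the rewrite is a rule that matches an aggregate directly on top of a join and fuses the two when the group-by columns equal the join keys. The sketch below uses a toy `Plan` enum that is purely hypothetical and not DataFusion's `LogicalPlan`; a real rule would likely need further checks (join type, which side the aggregates read from, duplicate handling) that are omitted here.

```rust
/// Toy logical plan, a hypothetical stand-in for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Join {
        left: Box<Plan>,
        right: Box<Plan>,
        left_keys: Vec<String>,
        right_keys: Vec<String>,
    },
    Aggregate {
        input: Box<Plan>,
        group_by: Vec<String>,
        aggs: Vec<String>,
    },
    /// Fused node: join and aggregation share one hash table.
    GroupJoin {
        left: Box<Plan>,
        right: Box<Plan>,
        left_keys: Vec<String>,
        right_keys: Vec<String>,
        aggs: Vec<String>,
    },
}

/// Rewrites `Aggregate(Join(..))` into `GroupJoin` when the group-by columns
/// are exactly the left join keys; every other plan is returned unchanged.
fn rewrite_to_group_join(plan: Plan) -> Plan {
    match plan {
        Plan::Aggregate { input, group_by, aggs } => match *input {
            Plan::Join { left, right, left_keys, right_keys } if group_by == left_keys => {
                Plan::GroupJoin { left, right, left_keys, right_keys, aggs }
            }
            // Not a matching join: rebuild the aggregate as it was.
            other => Plan::Aggregate { input: Box::new(other), group_by, aggs },
        },
        other => other,
    }
}
```

Applied to the example query, the aggregate groups by `a` and the join's left key is `a`, so the rule would fire; grouping by any other column leaves the plan untouched.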

Describe alternatives you've considered

No response

Additional context

No response

Dandandan commented 2 weeks ago

@Lordworms

maruschin commented 1 week ago

Is anyone working on this task? I want to take it.

maruschin commented 1 week ago

@Lordworms can I take this?

Lordworms commented 1 week ago

> @Lordworms can I take this?

Sure, feel free to go.