apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

EPIC: Implement/investigate other join types #13181

Open Dandandan opened 3 weeks ago

Dandandan commented 3 weeks ago

### Is your feature request related to a problem or challenge?

From http://btw2017.informatik.uni-stuttgart.de/slidesandpapers/F1-10-37/paper_web.pdf

There is this useful overview of join types used by HyPer:

[Image: overview of join types used by HyPer]

We can investigate the following:

### Describe the solution you'd like

Implement these join types and use them in the planner to improve performance on TPC-H and TPC-DS queries.

### Describe alternatives you've considered

No response

### Additional context

No response

comphead commented 3 weeks ago

Thanks @Dandandan. Now I understand how exotic join types (RightSemi, RightAnti) come into play.

Lordworms commented 3 weeks ago

I would like to try the group join.

ngli-me commented 1 week ago

Hi, do you mind if I take the single join and open an issue for it? I see a description (p. 5) along with some pseudocode (p. 13); hopefully that is accurate. http://btw2017.informatik.uni-stuttgart.de/slidesandpapers/F1-10-37/paper_web.pdf

I think this, along with your in-progress code, is a reasonable reference for me to start from.
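
To make the intended semantics concrete, here is a minimal sketch of a single join as I read the paper: it behaves like a left outer join, except that it raises an error when a left row finds more than one join partner (the scalar-subquery case). The names `single_join` and `MultipleMatches` and the tuple row layout are hypothetical, chosen only for illustration; none of this is DataFusion API, and the plain hash build/probe is an assumption rather than the paper's exact pseudocode.

```rust
use std::collections::HashMap;

/// Hypothetical error: a left row matched more than one right row.
#[derive(Debug, PartialEq)]
pub struct MultipleMatches {
    pub key: i64,
}

/// Hypothetical helper: single join of `left` with `right` on an integer key.
/// Every left row yields exactly one output row; unmatched rows get `None`.
pub fn single_join(
    left: &[(i64, &str)],
    right: &[(i64, &str)],
) -> Result<Vec<(i64, String, Option<String>)>, MultipleMatches> {
    // Build side: hash the right input by key, keeping all payloads per key.
    let mut build: HashMap<i64, Vec<&str>> = HashMap::new();
    for &(key, payload) in right {
        build.entry(key).or_default().push(payload);
    }

    // Probe side: look up each left row and check the number of partners.
    let mut out = Vec::with_capacity(left.len());
    for &(key, payload) in left {
        match build.get(&key).map(Vec::as_slice) {
            // No partner: behave like a left outer join and pad with NULL.
            None => out.push((key, payload.to_string(), None)),
            // Exactly one partner: emit the joined row.
            Some([only]) => out.push((key, payload.to_string(), Some(only.to_string()))),
            // More than one partner: this is what distinguishes a single join.
            Some(_) => return Err(MultipleMatches { key }),
        }
    }
    Ok(out)
}
```

Calling `single_join` with a right side that contains a key twice returns `Err(MultipleMatches { .. })` as soon as a left row with that key is probed; with unique right keys the result matches a left outer join.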