datafusion-contrib / datafusion-federation

Allow DataFusion to resolve queries across remote query engines while pushing down as much compute as possible.
Apache License 2.0

How can a Federated Node include "Predicate pushdown"? #23

Open oikomi opened 4 months ago

oikomi commented 4 months ago

Hello all,

I tested a SQL query like SELECT t.TrackId, t.Name AS TrackName, a.Title AS AlbumTitle, ar.Name AS ArtistName FROM Track t JOIN Album a ON t.AlbumId = a.AlbumId JOIN Artist ar ON a.ArtistId = ar.ArtistId WHERE t.TrackId > 375 ORDER BY t.TrackId LIMIT 10

and got this logical plan:

[image: screenshot of the logical plan]

The WHERE t.TrackId > 375 filter does not seem to be pushed down into the federated node?

How can this optimization be done?

backkem commented 4 months ago

You are right. To improve on these cases, the plan will have to be rewritten before it is cut into sub-plans for federation. I haven't looked into this yet.
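As a rough illustration of what rewriting the plan before cutting it into sub-plans could look like, here is a minimal, self-contained sketch. It does not use DataFusion's or datafusion-federation's types: the `Plan` enum, `reads_table`, `push_down`, `scan`, and `example_plan` are all hypothetical names made up for this example, and predicates are plain strings tied to a single table. It only shows the idea of moving a single-table filter below the joins so that it sits directly above the scan it refers to.

```rust
// Toy logical plan. NOT DataFusion's LogicalPlan: every name here is hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String },
    // A filter whose predicate only references columns of `table`.
    Filter { table: String, predicate: String, input: Box<Plan> },
    Join { left: Box<Plan>, right: Box<Plan> },
}

// Returns true if the sub-plan reads from the named table.
fn reads_table(plan: &Plan, name: &str) -> bool {
    match plan {
        Plan::Scan { table } => table == name,
        Plan::Filter { input, .. } => reads_table(input, name),
        Plan::Join { left, right } => reads_table(left, name) || reads_table(right, name),
    }
}

// Pushes single-table filters below joins, toward the scan they reference.
fn push_down(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { table, predicate, input } => match push_down(*input) {
            Plan::Join { left, right } => {
                if reads_table(&left, &table) {
                    let filtered = Plan::Filter { table, predicate, input: left };
                    Plan::Join { left: Box::new(push_down(filtered)), right }
                } else if reads_table(&right, &table) {
                    let filtered = Plan::Filter { table, predicate, input: right };
                    Plan::Join { left, right: Box::new(push_down(filtered)) }
                } else {
                    // Neither side reads the table: leave the filter above the join.
                    let join = Plan::Join { left, right };
                    Plan::Filter { table, predicate, input: Box::new(join) }
                }
            }
            // Scan or another filter below: the filter is already as low as it can go.
            other => Plan::Filter { table, predicate, input: Box::new(other) },
        },
        Plan::Join { left, right } => Plan::Join {
            left: Box::new(push_down(*left)),
            right: Box::new(push_down(*right)),
        },
        leaf => leaf,
    }
}

fn scan(name: &str) -> Plan {
    Plan::Scan { table: name.to_string() }
}

// Shape of the query above: a filter on Track sitting on top of both joins.
fn example_plan() -> Plan {
    let joins = Plan::Join {
        left: Box::new(Plan::Join {
            left: Box::new(scan("Track")),
            right: Box::new(scan("Album")),
        }),
        right: Box::new(scan("Artist")),
    };
    Plan::Filter {
        table: "Track".to_string(),
        predicate: "TrackId > 375".to_string(),
        input: Box::new(joins),
    }
}

fn main() {
    println!("before: {:?}", example_plan());
    println!("after:  {:?}", push_down(example_plan()));
}
```

In this toy model, the filter ends up inside the sub-tree that scans Track, so a federation step that cuts the plan per remote table would hand the predicate to that source together with the scan. A real rewrite would also have to deal with predicates that reference several tables, outer joins, and column aliases, none of which are covered here.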

To avoid full table reads on both sides of the join we'd need apache/arrow-datafusion#7955.