pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Full support for TPCH 22 SQL Queries #16659

Open djouallah opened 1 month ago

djouallah commented 1 month ago

Description

It would be nice if Polars supported running all 22 TPCH SQL queries unmodified. So far only queries 1 and 6 seem to work: https://colab.research.google.com/drive/17vaCF-3QSe0bGv0YOCeCWyq7yGpKxRmV#scrollTo=kFQ0KCKYfMCx

alexander-beedie commented 1 month ago

That actually sounds like a good driver for feature prioritisation on the SQL interface (which is by no means complete yet - lots done, lots to do ;) 👌

Also, can I check which version of Polars you tested on? The latest (0.20.31) has quite a number of updates that may help.

Looks like adding support for implicit JOIN syntax would also immediately raise the number of supported queries; I'll investigate.

djouallah commented 1 month ago

Actually, you are right: I changed the syntax to use explicit joins, and now 8 of the 22 queries work.

https://colab.research.google.com/drive/10GUzHfrOSu1GHXoY7stQ22_xITNkmnVs#scrollTo=LjOA0sqjFYRc
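For readers unfamiliar with the distinction, here is a minimal sketch of the rewrite being discussed: the same query written with an implicit join (comma-separated tables plus a WHERE condition) and with an explicit JOIN ... ON. It uses Python's built-in sqlite3 purely as a neutral engine to show that the two forms return the same rows; it is not Polars code. The two tiny tables are an assumed toy cut-down that borrows TPCH-style column names, not the real schema or data.

```python
import sqlite3

# Toy stand-in for two TPCH-style tables (assumed data, not the benchmark's).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (c_custkey INTEGER, c_name TEXT);
    CREATE TABLE orders (o_orderkey INTEGER, o_custkey INTEGER);
    INSERT INTO customer VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
""")

# Implicit join: tables listed with commas, join condition in WHERE.
implicit = """
    SELECT c_name, o_orderkey
    FROM customer, orders
    WHERE c_custkey = o_custkey
    ORDER BY o_orderkey
"""

# Explicit join: the same condition moved into JOIN ... ON.
explicit = """
    SELECT c_name, o_orderkey
    FROM customer
    JOIN orders ON c_custkey = o_custkey
    ORDER BY o_orderkey
"""

implicit_rows = conn.execute(implicit).fetchall()
explicit_rows = conn.execute(explicit).fetchall()
print(implicit_rows == explicit_rows)
```

The rewrite is mechanical: each pair of tables in the FROM list becomes a JOIN, and the equality predicate that links them moves from WHERE into ON.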

djouallah commented 1 month ago

I am not too worried about the SQL syntax; I am sure it will be added eventually. What would be really nice is support for ctx.sql("select 42").show(), so I don't have to write a separate function just for Polars.
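In the meantime, a thin wrapper along these lines might cover the gap. This is only a sketch: `SqlRunner`, `Result`, and `polars_runner` are hypothetical names invented here, not part of Polars. The only Polars calls assumed are `pl.SQLContext(frames=...)` and `ctx.execute(query, eager=True)`.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Result:
    """Holds whatever the engine returned and adds a show() method."""

    data: Any

    def show(self) -> None:
        # Printing relies on the engine's own repr (a DataFrame prints as a table).
        print(self.data)


class SqlRunner:
    """Hypothetical adapter exposing .sql(query).show() over any execute callable."""

    def __init__(self, execute: Callable[[str], Any]) -> None:
        self._execute = execute

    def sql(self, query: str) -> Result:
        return Result(self._execute(query))


def polars_runner(**frames: Any) -> SqlRunner:
    """Build a runner backed by polars.SQLContext (imported lazily)."""
    import polars as pl

    ctx = pl.SQLContext(frames=frames)
    # eager=True makes execute() return a DataFrame rather than a LazyFrame.
    return SqlRunner(lambda query: ctx.execute(query, eager=True))
```

With a DataFrame `df` at hand, usage would look like `polars_runner(lineitem=df).sql("SELECT COUNT(*) FROM lineitem").show()`. Whether a query with no FROM clause, such as `select 42`, runs depends on the Polars version, so that part is untested here.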