ploomber / jupysql

Better SQL in Jupyter. 📊
https://jupysql.ploomber.io
Apache License 2.0

vendor agnostic ggplot code #278

Closed edublancas closed 9 months ago

edublancas commented 1 year ago

We're closer to adding the new ggplot plotting API (https://github.com/ploomber/jupysql/pull/164); however, the SQL queries are hardcoded and might not work with particular SQL dialects. In #164 we incorporated sqlglot, which allows us to write vendor-agnostic SQL queries, so the next iteration for the ggplot feature is to use sqlglot to generate the SQL queries.

Note that this is blocked by https://github.com/ploomber/jupysql/issues/277, since we're missing documentation on how to use sqlglot.

yafimvo commented 1 year ago

I'm not sure I completely understood

The ggplot API draws charts with the histogram and boxplot functions.

These functions run SQL, and every query is executed through this function:

conn.execute(query, with_)

this function runs _prepare_query and transpiles the query using sqlglot.
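For reference, here is a minimal sketch of transpiling with a fallback. `sqlglot.transpile` is sqlglot's real entry point; the helper name `transpile_or_passthrough` and the fallback behavior are assumptions made for illustration, not JupySQL's actual `_prepare_query`. The import is guarded only so the snippet still runs where sqlglot isn't installed.

```python
try:
    import sqlglot
except ImportError:  # assumption: treat sqlglot as optional in this sketch
    sqlglot = None


def transpile_or_passthrough(query, write_dialect, read_dialect=None):
    """Hypothetical helper: return `query` rewritten for `write_dialect`.

    Falls back to the original query when sqlglot is unavailable or
    cannot handle the query or the dialect name.
    """
    if sqlglot is None:
        return query
    try:
        # transpile returns a list with one SQL string per input statement
        return sqlglot.transpile(query, read=read_dialect, write=write_dialect)[0]
    except (sqlglot.errors.SqlglotError, ValueError):
        return query


transpiled = transpile_or_passthrough("SELECT 1", "duckdb")
print(transpiled)
```

Returning the original query on failure is one possible design choice; raising instead would surface unsupported dialects earlier.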

@edublancas

edublancas commented 1 year ago

you're right, we're already using transpile.

the first step is to understand for which databases the boxplot and histogram work and for which ones they don't. this is because transpiling isn't a perfect process, so we might need some manual fixes: e.g., skipping the transpiling step and writing a separate query for that database. another reason could be that a database doesn't support a function that we need for creating the plot, e.g., computing percentiles. in that case, there's nothing we can do.
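a rough sketch of how we could probe whether a database supports a function we need. it uses the standard library's `sqlite3` only so it runs anywhere; the helper name `supports_expression`, the `_probe` table, and the percentile expression are assumptions for illustration, not queries taken from the plotting code.

```python
import sqlite3


def supports_expression(conn, expression):
    """Return True if the database can evaluate `expression` over a tiny table."""
    cur = conn.cursor()
    try:
        # Hypothetical scratch table with a single numeric column
        cur.execute("CREATE TEMP TABLE IF NOT EXISTS _probe (x REAL)")
        cur.execute(f"SELECT {expression} FROM _probe")
        cur.fetchall()
        return True
    except sqlite3.Error:
        # Unknown function or unsupported syntax for this database
        return False
    finally:
        cur.close()


conn = sqlite3.connect(":memory:")
has_avg = supports_expression(conn, "avg(x)")
has_percentile = supports_expression(
    conn, "percentile_disc(0.5) WITHIN GROUP (ORDER BY x)"
)
print({"avg": has_avg, "percentile": has_percentile})
```

whether the percentile probe succeeds depends on how the database engine was built, which is exactly the kind of difference we'd want the integration tests to surface.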

I'm assuming our integration testing suite is missing some tests, so first we need to add them and find out where we're failing (once we know, we can mark those tests with xfail)

then we can decide which failures to fix and which to leave. e.g., we should prioritize databases that are heavily used, and we're ok not supporting some functionality for some databases.

we should also update the compatibility table: https://jupysql.ploomber.io/en/latest/integrations/compatibility.html#postgresql