carbonfact / lea

🏃‍♀️ Minimalist alternative to dbt
Apache License 2.0

Malloy views #18

Open MaxHalford opened 9 months ago

MaxHalford commented 9 months ago

Malloy is an expressive query language for data processing. It would be so cool to have it working alongside regular SQL views and Python views. Technically, this would only be a new MalloyView class in the views module. Then, each client could execute the SQL compiled by Malloy.
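A rough sketch of what such a view class could look like. Everything here is hypothetical: `MalloyView`, `Client`, `RecordingClient`, and `fake_compiler` are illustrative names, not lea's actual API, and the Malloy-to-SQL step is stubbed out rather than calling Malloy.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class Client(Protocol):
    """Hypothetical stand-in for a lea client; it only needs to run SQL."""

    def execute(self, sql: str) -> None: ...


@dataclass
class MalloyView:
    """Hypothetical view type: holds Malloy source and compiles it to SQL on demand."""

    name: str
    malloy_source: str
    # Assumption: some callable turns Malloy text into SQL. How Malloy's own
    # tooling would provide this is not settled here.
    compile_malloy: Callable[[str], str]

    def to_sql(self) -> str:
        return self.compile_malloy(self.malloy_source)

    def materialize(self, client: Client) -> None:
        # The client never sees Malloy, only the compiled SQL.
        client.execute(f"CREATE OR REPLACE VIEW {self.name} AS {self.to_sql()}")


class RecordingClient:
    """Toy client that records the SQL it is asked to run."""

    def __init__(self) -> None:
        self.statements: list[str] = []

    def execute(self, sql: str) -> None:
        self.statements.append(sql)


def fake_compiler(malloy_source: str) -> str:
    # Placeholder only: returns fixed SQL instead of calling Malloy.
    return "SELECT 1 AS n"


view = MalloyView("core.example", "<malloy query text>", fake_compiler)
client = RecordingClient()
view.materialize(client)
print(client.statements[0])
```

Injecting the compiler as a callable keeps the view independent of how the translation is actually obtained, so the class itself would not need to change if that mechanism does.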

MaxHalford commented 9 months ago

PRQL looks interesting too.

LeonardoNatale commented 9 months ago

"Malloy currently works with SQL databases BigQuery, Postgres, and querying Parquet and CSV via DuckDB." (source)

Meaning we wouldn't be able to interact with a local database file such as jaffle_shop.db.

MaxHalford commented 9 months ago

But wouldn't it work if we just used Malloy as a translation layer to generate SQL?

LeonardoNatale commented 9 months ago

> But wouldn't it work if we just used Malloy as a translation layer to generate SQL?

After playing a bit with their Python package, I couldn't manage to just translate from Malloy to SQL without having to load a specific source at the time of the conversion.

To be more specific, the method runtime.get_sql works only with an open connection to either BigQuery, Postgres or a Parquet / CSV file. I couldn't just translate the syntax with something like get_sql(query=malloy_query, dialect="duckdb")

But I might be missing something.

MaxHalford commented 9 months ago

Right, I see. Difficult to know if this is by design on Malloy's side or not.

Is this blocking? Don't we always have a client available/open when we parse views?

LeonardoNatale commented 9 months ago

It might not be blocking for BigQuery. But it's blocking for DuckDB and MotherDuck, as Malloy only supports CSV and Parquet files through DuckDB.
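To make the constraint concrete, here is a small sketch that encodes only what this thread says about Malloy's supported sources. The table and the function name `malloy_can_read` are made up for illustration; nothing here is read from Malloy itself, and the labels `"table"` and `"database_file"` are assumed shorthand.

```python
# Sources Malloy is said to support in the quote earlier in this thread.
# Assumption: this table only mirrors that quote; it is not read from Malloy.
MALLOY_SOURCES = {
    "bigquery": {"table"},
    "postgres": {"table"},
    "duckdb": {"parquet", "csv"},
}


def malloy_can_read(backend: str, source_kind: str) -> bool:
    """Return True if the thread's description says Malloy can read this source."""
    return source_kind in MALLOY_SOURCES.get(backend, set())


print(malloy_can_read("bigquery", "table"))        # True
print(malloy_can_read("duckdb", "parquet"))        # True
print(malloy_can_read("duckdb", "database_file"))  # False, e.g. jaffle_shop.db
print(malloy_can_read("motherduck", "table"))      # False
```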