quarylabs / quary

Open-source BI for engineers
https://www.quary.dev
Apache License 2.0
2.17k stars, 49 forks

PRQL-based models #173

Open benfdking opened 4 months ago

benfdking commented 4 months ago

Background

PRQL is a modern language for transforming data. It overcomes many of SQL's shortcomings for pipeline design.

The pipeline nature of the language would make it a lovely fit for refactoring models, because you can essentially take a subsection of a pipeline and move it into its own model.

Proposal

Questions still needing to be answered:

Major:

References of the form from q.<model> could be replaced just as we do for SQL. We could either do the replacement up front on the PRQL source or, to keep things more uniform, do it after the conversion to SQL.
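A minimal sketch of the "up front" option, assuming references look like q.<model> and are rewritten textually to a target schema before the PRQL is compiled. The function names and the `analytics` schema are hypothetical, and this is not how quary resolves references today; it also ignores string literals and comments, which a real implementation would need to handle.

```rust
/// Returns true for characters allowed in a model identifier.
fn is_ident_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// Rewrites every `q.<model>` reference to `<schema>.<model>`.
/// Returns the rewritten text and the referenced model names in order of
/// appearance, which could feed a dependency graph.
fn replace_model_refs(source: &str, schema: &str) -> (String, Vec<String>) {
    let mut out = String::with_capacity(source.len());
    let mut models = Vec::new();
    let mut prev: Option<char> = None;
    let mut rest = source;

    while let Some(c) = rest.chars().next() {
        // Only match `q.` when it starts a new identifier, so `freq.x` is left alone.
        let at_boundary = prev.map_or(true, |p| !is_ident_char(p) && p != '.');
        if at_boundary && rest.starts_with("q.") {
            let name: String = rest[2..]
                .chars()
                .take_while(|ch| is_ident_char(*ch))
                .collect();
            if !name.is_empty() {
                out.push_str(schema);
                out.push('.');
                out.push_str(&name);
                prev = name.chars().last();
                // Identifier characters are ASCII, so byte length equals char count.
                rest = &rest[2 + name.len()..];
                models.push(name);
                continue;
            }
        }
        out.push(c);
        prev = Some(c);
        rest = &rest[c.len_utf8()..];
    }
    (out, models)
}
```

Calling `replace_model_refs("from q.orders", "analytics")` would yield the text `from analytics.orders` together with the single dependency `orders`. Doing the same replacement after conversion to SQL would instead reuse whatever logic already exists for SQL models.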

We might simply have to do without Jinja-like templating.

An initial proposal could use the .prql file extension. We would read these files just like any other model, and you could keep the definition in the YAML file.

Convert to SQL, then run the result through the inference engine.
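As a rough illustration of picking up .prql files alongside existing models, the sketch below maps a file path to a model language by its extension. The `ModelLanguage` enum and `model_language` function are hypothetical names, not part of quary's codebase.

```rust
use std::path::Path;

/// Languages a model file could be written in (hypothetical type).
#[derive(Debug, PartialEq)]
enum ModelLanguage {
    Sql,
    Prql,
}

/// Decides how a model file should be handled based on its extension.
/// Returns None for files that are not models, such as YAML definitions.
fn model_language(path: &Path) -> Option<ModelLanguage> {
    match path.extension()?.to_str()? {
        "sql" => Some(ModelLanguage::Sql),
        "prql" => Some(ModelLanguage::Prql),
        _ => None,
    }
}
```

With this shape, a file such as `models/orders.prql` would be routed to the PRQL-to-SQL conversion step, while `models/orders.sql` would follow the existing path unchanged.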

Just don't use that.

Minor:

Nit:

Abandoned Ideas

Sections

Implementation

UX

kflk commented 4 months ago

PRQL is Rust, so you could maybe pull it in as a library? See the work done for the PRQL dbt integration here: https://github.com/prql/dbt-prql

max-sixty commented 4 months ago

Hi from the PRQL team!

> PRQL is Rust, so you could maybe pull it in as a library?

Yes, this would indeed be quite a simple addition, given that quary is written in Rust.

(dbt-prql isn't the best example, since it does a lot of hackery in Python because dbt doesn't have a proper plugin system. The integration in quary would be much easier: just call the compile function from prqlc...)

benfdking commented 4 months ago

Hey @max-sixty,

Yep, we see no reason why it would be difficult to integrate. It should all compile nicely into our Rust binaries/WASM. We're slightly more concerned about the user experience, complexity, maintainability burden, and other "non-technical" aspects.

I've added a few questions to the initial issue, and I'll try to answer them over time.

max-sixty commented 4 months ago

Great, the questions look very reasonable. I won't attempt to address them all but a couple of thoughts:

> How would this interact with Jinja-like templating?

You could run the PRQL through a Jinja template before passing it to the compiler.

Though if you prefer to get the SQL with the templating, PRQL should pass through Jinja templates, because we originally implemented this for dbt. It's not really used, and there have been suggestions to remove it, but we could keep it if there were demand. (I'd probably want to confirm it's still running well before depending on it; let us know.)
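The first option above, rendering the template before the text reaches the compiler, could be sketched as follows. The tiny `{{ name }}` substitution here is only a stand-in for a real Jinja engine, and `render_placeholders` is a hypothetical helper; the rendered string is what would then be handed to the PRQL compiler.

```rust
use std::collections::HashMap;

/// Replaces each `{{ name }}` placeholder with its value from `vars`.
/// A stand-in for a real template engine: no filters, loops, or conditionals.
fn render_placeholders(
    template: &str,
    vars: &HashMap<&str, &str>,
) -> Result<String, String> {
    let mut out = String::new();
    let mut rest = template;

    while let Some(start) = rest.find("{{") {
        // Copy everything before the placeholder unchanged.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = after.find("}}").ok_or("unclosed placeholder")?;
        let name = after[..end].trim();
        let value = vars
            .get(name)
            .ok_or_else(|| format!("unknown variable: {name}"))?;
        out.push_str(value);
        rest = &after[end + 2..];
    }
    out.push_str(rest);
    Ok(out)
}
```

Because rendering happens first, the compiler only ever sees plain PRQL, so this ordering does not depend on the compiler passing template syntax through.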

> What can we do for autocomplete and all of those niceties?

We don't have an LSP at the moment. It's probably the project after our next big one, which is rewriting the resolver. I'm quite keen to do this, since one of the original benefits of PRQL was that autocomplete could be much better than with SQL.

> Complexity of versioning systems?

I would think it's fine to just be on the latest version.

> Why this rather than Malloy or another language?

We're big fans of Malloy too! One thing to consider is that a big part of Malloy's innovation is to manage the full project, plausibly as a replacement for something like dbt, whereas PRQL's innovation is predominantly around the relational language, and it currently needs a "project manager" like dbt / quary to be run across a large project.

eitsupi commented 4 months ago

> Though if you prefer to get the SQL with the templating, PRQL should pass through Jinja templates

IIUC, that feature has been removed. The CLI includes minijinja now, but I think the text interpolation is executed before compilation. (PRQL/prql#1722, PRQL/prql#2104)

max-sixty commented 4 months ago

> IIUC, that feature has been removed.

OK thanks, sounds like we did remove it. If this is important for quary, we could come up with something...