PRQL / prql

PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
https://prql-lang.org
Apache License 2.0

Write a language specification (EBNF syntax) #285

Open dlescos opened 2 years ago

dlescos commented 2 years ago

Writing a language specification document in Extended Backus-Naur Form (see the Go document for an example) may help us:

max-sixty commented 2 years ago

I'm definitely open to this.

As we continue iterating on the language, we wouldn't want this to block changes. But if there were a willing contributor, we could create one at a point-in-time, and refresh it asynchronously.

Once the language is more stable, then possibly we could support it and provide a guarantee that it's current.

Does that make sense @dlescos ?

dlescos commented 2 years ago

Sure! And I agree the document should follow the code and not make things too rigid.

qharlie commented 2 years ago

We use Lark for PyPrql, which is like EBNF but with a few extras: https://github.com/qorrect/PyPrql/blob/main/pyprql/lang/prql.lark

The extras are that [x] means (x)?, and a .N after the rule name (where N is a number) assigns a priority to the rule if there is a conflict.
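To make the two extras concrete, here is a minimal Python sketch that rewrites them into plain EBNF-style notation. It is not part of PyPrql or Lark: the function name `lark_to_plain_ebnf` and the sample rule are made up for illustration, and it assumes the grammar has no square brackets inside string or regex literals. Priorities have no direct EBNF counterpart, so the sketch simply drops them.

```python
import re


def lark_to_plain_ebnf(grammar: str) -> str:
    """Rewrite the two Lark extras into plain EBNF-style notation.

    Assumption: no '[' or ']' appears inside string or regex literals;
    a real converter would have to tokenize those first.
    """
    # Drop a ".N" priority suffix that directly follows a rule name.
    grammar = re.sub(
        r"(?m)^(\s*[?!]?[A-Za-z_]\w*)\.\d+(?=\s*:)", r"\1", grammar
    )
    # Rewrite innermost "[...]" groups to "(...)?" until none are left,
    # so nested optional groups are handled from the inside out.
    optional = re.compile(r"\[([^\[\]]*)\]")
    while optional.search(grammar):
        grammar = optional.sub(r"(\1)?", grammar)
    return grammar


if __name__ == "__main__":
    # Hypothetical rule, not taken from prql.lark.
    print(lark_to_plain_ebnf('filter.2: "filter" expr [NEWLINE]'))
```

Running it on the hypothetical rule above yields the same rule with the priority removed and the bracketed part written as a parenthesized group followed by a question mark.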

chris-pikul commented 2 years ago

In working on a Go implementation I actually started taking notes in the form of a rough RFC-ish markdown document. I would like to convert the syntax to Extended Backus-Naur Form and actually will start doing that now.

My point in bringing this up is that it may serve as a starting point for a more in-depth official specification for PRQL. I found the "book" to be a little vague on the finer points of how the compiler/parser actually works, or what syntax behaviors are expected.

Anyways, let me know if this is a good direction and something that is wanted officially. If it is, I can work it out as a more robust version for the language in general.

github.com/chris-pikul/go-prql/SYNTAX-NOTES.md

max-sixty commented 2 years ago

@chris-pikul that looks very impressive, thanks for writing it up.

I'm open to adding this to our docs. I think for the moment the canonical source should be the actual implementation, but over time the canonical source could be this document. In particular, proposed changes (e.g. how inline pipelines should work) could go in this doc, rather than in a random issue or a prototype implementation.

It all looks quite accurate. Couple of tiny things on first glance:

vanillajonathan commented 1 year ago

I wrote something of a draft at https://github.com/PRQL/prql/issues/1810#issuecomment-1430215695