Open jimbaker opened 1 year ago
What other ideas might you have for this, beyond the kinds of regular DSL points made in the tutorial and PEP?
What other ideas might you have for this, beyond the kinds of regular DSL points made in the tutorial and PEP?
In the tutorial, we have this specific match:
case getvalue, raw, _, _:
match value := getvalue():
case SQL() | Identifier():
parts.append(value)
case _:
parts.append(Param(raw, value))
which results in writing statements like so:
cur.execute(*sql'create table {Identifier(table_name)} (name, first_appeared)')
because we cannot disambiguate what table_name
is without an AST parser, given the complexity of SQL.
Having said, there are counterexamples where it's still necessary, such as the where clause on an expression - is it an identifier or a value? One could do something like
where "{identifier}" = {value}
(remembering the supposed distinction between double quotes and single quotes in SQL, but see https://sqlite.org/quirks.html; SQL Server does something different, of course).
In this case an explicit Identifier(expr)
can be useful.
Compare this to HTML, where there's perhaps a more natural mapping, say a dict
here means to have this interpolation in style={some_value}
. But even there, likely gotchas to chose the correct interpolation.
Interestingly the popular ETL tool dbt uses Jinja to templatize SQL. It also manages one's SQL queries and corresponding models, and allows them to be composed together more easily. All managed with git, at least with an add-on.
One take: https://codingisforlosers.com/wtf-is-dbt/ (amusing name for a web site, still some good content on usage).
There also seems to be a fair amount of overlap between the dbt approach and what we have discussing for FDOM.
While I had heard of dbt before, I had not actually looked it until today, after scrolling through my X/Twitter feed and seeing both templated SQL and tool support for it: https://twitter.com/imrobertyi/status/1691822118411907480
That line bring a software-engineering-style discipline to your data analysis work
really, really hit the mark for me. What a great phrasing.
https://github.com/tobymao/sqlglot supports parsing SQL, in various dialects, into a usable AST:
So this should suffice for capturing placeholders (note
$xNx
, this allows us to keep track of the placeholder number N), then provide a correct interpretation as we are currently doing with marker types in https://github.com/jimbaker/tagstr/blob/main/src/tagstr/sql.py, but with a compiler. This means being able to writeand the parser identifies that
{table_name}
is an identifier, after the placeholder substitution: