Closed phillipleblanc closed 2 months ago
This is used in the SQLite Table Provider to perform a non-trivial rewrite of interval expressions to use the SQLite native date()
or datetime()
functions, since SQLite doesn't natively support intervals.
Apologies for commenting in this old thread, but I'm working on learning about DataFusion's unparse
sql generation support and datafusion-federation
. Is there a reason (other than project logistics) that this kind of interval rewrite logic for sqllite should be done in an ast_analyzer
step here in datafusion-table-providers
rather than in the unparse
logic in DataFusion itself?
I'm exploring replacing the sql-generation system we have in VegaFusion with unparse
and datafusion-federation
, so just trying to get a feel for where dialect customization should happen.
Hello @jonmmease, There was a discussion about this in issue #61
@jonmmease maybe there are already table providers in datafusion-contrib/datafusion-table-providers that meet your needs? These also implement federation. That are used if you register the federation optimizer.
Thanks @backkem, yes these table providers look great, and I'm excited to learn more about how federation works!
I'm thinking about eventually adding support for some additional dialects (like Clickhouse, Spark, MSSQL, etc) and wondering if this should primarily be contributed upstream in the unparse
logic in the datafusion-sql
crate, or if I should customize the generic dialect there with the proper quote style and then do semantic transformations (like different names for functions) in an ast_analyzer
implementation.
RE where to land things: it's a good question. Right now I think we have roughly:
Things can evolve but this is the balance that emerged so far. I'm happy to host more plan re-writing tools/logic that is federation related in this repo.
Thanks for that context @backkem, I appreciate it and it's very helpful. As I've thought about it, it does make sense for the semantic customization of dialects to live alongside the table provider implementations for the sake of testing the generated SQL.
Yeah, I still think its not 100% clear where the responsibility of the Unparser dialect ends and where something like the AST rewriter in the specific implementation begins. Like @backkem, I tend to think that the core Unparser should mostly be focused on syntactical differences between the dialects whereas more complex differences in how, e.g., functions work can be implemented by the AST rewriter (as we did for handling intervals in SQLite: https://github.com/datafusion-contrib/datafusion-table-providers/pull/85)
Adds an AST analyzer to the
SQLExecutor
trait to allow modifying the query AST before execution.