Open jc-harrison opened 5 years ago
A legit reason to use **kwargs? 😲
I like this idea but I think it would break the hashing of custom query, so we'd need a different solution there.
Expanding on the above - custom query hashing is dependent on being able to normalise the input SQL, which we do by parsing it against the PostgreSQL parser. Obviously that's going to fail if we give it a template string instead of valid SQL, but waiting on the string to be generated is a blocking op, which we don't want to do in the __init__
- at that point, you might as well just generate the sql and pass it in to the existing class.
Maybe one could fill the gaps for hashing purposes with valid-but-meaningless SQL? e.g. sub in (select 1) as another_custom_query
?
We could perhaps fill the gaps with select * from {another_custom_query.fully_qualified_table_name}
for hashing (i.e. for hashing purposes, assume all sub-queries are already cached)?
There have been discussions recently regarding the desire for users who prefer to write their own SQL queries to be able to make more use of FlowMachine's caching.
It seems to me that we could go some way towards supporting this use case by making it possible to pass query objects as sub-queries of a
CustomQuery
.We could do this by adding a new, optional argument to
CustomQuery
(e.g.dependencies
) which accepts a dict of Query objects, which could be matched to placeholders in the SQL string.Something like:
The dependencies could then be included in
query.dependencies()
, andquery._make_query
would return something along the lines ofwhich would then make use of the cached dependencies where they exist.
It woul be good to hear from FlowMachine users / potential users as to whether this is something that would be useful in practice.