Open dirkgroenen opened 3 years ago
Reflecting on this, my current thought is that we should go for a strategy to strictly separate the interpretation of the keys. I think that will lead to better error messages and simpler documentation.
sql_metrics
-> always expect a list of dicts
sql_metrics_includes
-> expects a list of file paths (with optional wild cards)
Similarly
sql_metrics.sql_file
-> file path
This would imply we may have to revisit the named tests as we currently also allow for 2 flavours on 1 key tests
. We could change that to
tests
-> list of unnamed test expressions
named_tests
-> dictionary of tests
Other thoughts?
I remember your suggested flavor to have been discussed in the engineering meeting leading up to the proposal as in this issue's OP, but there were some motivations from the team against it which were mostly related to verbosity, maintenance and reduced flexibility.
Some arguments for using include/read
functions which I can remember:
include
and require
for pretty much every key of you like, which thus' allows you to separate for example sql_metrics, or tests, or just the sql
in a sql_metric. I can't remember all arguments from the discussion, so don't pin me on it.
My personal vote goes for the "function" approach given it's a universal solution which only has to be documented and explained once, where as the verbose xxx_list
, xxx_file
, xxx_include
requires (imo) more documentation.
So for sql_metrics you reproposing to go for offering different options like this?
sql_metrics: include(./path/*.yml)
sql_metrics:
- include(./path/*.yml)
- read(./anotherpath/someother.yml)
and
sql_metrics:
- sql: ...
tests: ...
- sql: ...
tests: ...
Where the last 2 could actually be combined.
And also
sql_metrics:
- sql: ...
tests: ...
- sql: read(./anotherpath/someother.sql)
tests: ...
I personally do like that flavor, yes
Note how read
could also be used to resolve https://github.com/sodadata/soda-sql/issues/166.
We should make sure sodadata/soda-sql#267 this is fixed as we implement this feature ([bug] sql_file is recongnized as valid sql_metric key sodadata/soda-sql#267)
As discussed earlier last weekend's code restructure should allow for more freedom when defining a soda-sql project directory structure. To accomplish this we no longer make 'directory' assumptions and instead allow developers to inline their sql_metrics in the scan-yml file.
In order to prevent scan-yml files from getting too large we should also support 'macro functions' which allow developers to
include
multiple files orread
the contents of a file. As proposed we should consider adding support for either one, or both, of the following functions:include(./path/*.yml)
Thisinclude()
function can be used in combination withsql_metrics:
to include the contents of a list of files. The first argument can be a path containing wildcards or a directory which will then read and parse allyml
files.read(./path/query.sql)
Theread()
function allows to read the contents of a given file and use it as the properties value. This can for example be used in combination with thesql_metrics[].sql
property when referring to ametric.sql
query file.An example of a scan-YML file which uses
read()
could then look like:An example using
include()
would look like:Up for discussion:
read()
andinclude()
or do we want to merge them into one?My two cents: merging into one might create more
automagical behavior
and can therefore be less preferred.