Dataset filters are very useful, but they are not supported when using "for each dataset".
To give some context on why and where this would be useful, consider periodic checks (weekly, daily, hourly) as the simplest example.
We do not want to run checks on the entire table every day or hour; we only want to check data from the last 24 hours (plus some other filters).
So a filter like where cloud = 'ci' and dt > '2023-02-23' should be applied to each table in the "for each" section.
Dataset filters would be useful in the same way in reference and cross checks. Sometimes we want to compare datasets using the same filter (day, deployment, start_ts...), either within the same data source (reference) or across different data sources such as PostgreSQL and Databricks_SQL (cross).
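
To make the request more concrete, here is a rough Python sketch of the behaviour we have in mind: one filter is built once and then applied to every table in the list. This is only an illustration, not how the tool works today. The function names (build_filter, filtered_queries), the table names, and the generated COUNT(*) query are all hypothetical and chosen just for this example.

```python
from datetime import datetime, timedelta


def build_filter(cloud: str, now: datetime,
                 window: timedelta = timedelta(hours=24)) -> str:
    """Build a WHERE-clause body for one cloud and a recent time window.

    Hypothetical helper: the column names `cloud` and `dt` come from the
    example filter in this issue.
    """
    cutoff = (now - window).strftime("%Y-%m-%d")
    return f"cloud = '{cloud}' and dt > '{cutoff}'"


def filtered_queries(tables: list[str], filter_sql: str) -> dict[str, str]:
    """Apply the same filter to every table, as a "for each" section would.

    Plain string formatting is used only for illustration; it is not safe
    for untrusted input.
    """
    return {t: f"SELECT COUNT(*) FROM {t} WHERE {filter_sql}" for t in tables}


# Hypothetical table names standing in for the "for each" dataset list.
tables = ["orders", "customers"]

# A run at midnight on 2023-02-24 looks back 24 hours, to 2023-02-23.
filter_sql = build_filter("ci", datetime(2023, 2, 24))
queries = filtered_queries(tables, filter_sql)

for table, sql in queries.items():
    print(table, "->", sql)
```

The same filter_sql string could equally be passed to both sides of a reference or cross check, so that the two datasets are compared over the same slice of data.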