Open aranke opened 1 month ago
@aranke Thanks for raising this issue and linking those two other places where this has come up!
If the model is really that simple, what would be the value of a data test here? Wouldn't the counts be guaranteed to be equal? I don't see how the count of rows could be anything but exactly the same in this case.
Hi adding my 2 cents here and I raised this with dbt support too. It's useful in our case when the unioned model is incremental.
i.e. we're selecting from stg_my_source_a
where timestamp > now - 2 days for example. And want to see if the unioned model still has all the rows in the individual staging models.
Hope that makes sense.
Thanks for that info @edmei1 !
Could you share an example model that shows more detail? Something like this:
models/int_my_source.sql
{{
config(
materialized='incremental'
)
}}
with unioned_models as (
select * from {{ ref('stg_my_source_a') }}
union all
select * from {{ ref('stg_my_source_b') }}
union all
select * from {{ ref('stg_my_source_c') }}
)
select *
from unioned_models
{% if is_incremental() %}
where event_time >= (select coalesce(max(event_time),'1900-01-01') from {{ this }} )
{% endif %}
Describe the feature
There are situations where a model simply is a
union all
of various other models:In this example, it would be useful to have a 1:many test to verify that:
Describe alternatives you've considered
equal_rowcount
is the closest thing indbt-utils
, but only does 1:1 relationsAdditional context
I don't think this feature is database-specific.
Who will this benefit?
This feature has been requested several times over the years (dbt internal Slack, previous GitHub issue); this feature will benefit those users.
Are you interested in contributing to this feature?
Sure, happy to help.