calogica / dbt-expectations

Port(ish) of Great Expectations to dbt test macros
https://calogica.github.io/dbt-expectations/
Apache License 2.0
989 stars 120 forks source link

Fix sum_total in expect_multicolumn_sum_to_equal to accept column names #291

Closed VDFaller closed 8 months ago

VDFaller commented 8 months ago

Issue this PR Addresses/Closes

Closes #290

Summary of Changes

Just wrapped sum_total in Sum()

Why Do We Need These Changes

This will allow a column name or a literal number to be used.

Reviewers

@clausherther

clausherther commented 8 months ago

Great catch, thanks for the PR! I think this is one of those holdovers from Great Expectations where tests are built against static values vs totals of other columns. Probably worth looking other tests as well.

Would you mind also adding a test that covers this new use case of testing against the sum of another column? We currently only have a static test, i.e.

      - dbt_expectations.expect_multicolumn_sum_to_equal:
          column_list: ["col_numeric_a", "col_numeric_b"]
          sum_total: 4

so we'll need something like

      - dbt_expectations.expect_multicolumn_sum_to_equal:
          column_list: ["col_numeric_a", "col_numeric_b"]
          sum_total: new_totals_column
VDFaller commented 8 months ago

Hopefully this is what you're looking for. If not let me know.

clausherther commented 8 months ago

Hopefully this is what you're looking for. If not let me know.

I'll check it out! FYI, you probably figured this out since, but sum(4) against the table return 16 b/c it does this for every one of the 4 rows. This may actually mean this is a breaking change. Any config that checks for 4 would now fail. Any thoughts?

clausherther commented 8 months ago

FYI, updated the CI config on this PR to get Spark CI to pass. For some reason our CI configs breaks in dbt 1.7.x, but I'll fix that separately. CI is passing for you now 👍