ccao-data / data-architecture

Codebase for CCAO data infrastructure construction and management
https://ccao-data.github.io/data-architecture/

Explicitly differentiate dbt unit tests from dbt data tests #382

Open dfsnow opened 1 month ago

dfsnow commented 1 month ago

dbt will soon include unit tests for complex SQL logic and queries. We should absolutely use these and should also attempt to differentiate tests on raw data (test_qc_ tests) from tests related to logic (tests for joins). We could do this via tagging.
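
A minimal sketch of the classification rule such a tagging scheme might encode, assuming that the `test_qc_` prefix reliably marks tests on raw data. The tag names `data_test` and `logic_test` and the example test names are hypothetical, not existing conventions in this repo:

```python
# Hypothetical tag names; the issue does not settle on any.
DATA_TAG = "data_test"
LOGIC_TAG = "logic_test"


def tag_for_test(test_name: str) -> str:
    """Return the tag for a test, keyed off the test_qc_ name prefix."""
    # Assumption: every raw-data QC test starts with test_qc_,
    # and everything else tests transformation logic (e.g. joins).
    if test_name.startswith("test_qc_"):
        return DATA_TAG
    return LOGIC_TAG


# Hypothetical test names, for illustration only.
tags = {name: tag_for_test(name) for name in ["test_qc_example", "test_join_example"]}
```

With tags like these in place, each group could be selected separately at run time.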

We should also revisit our testing docs to clarify the ways our unit tests are written and run once we have a better idea of what they'll look like. See this comment for some ideas: https://github.com/ccao-data/data-architecture/pull/432#discussion_r1592724380

jeancochrane commented 1 month ago

Note that we will also need to wait for dbt-athena to cut a release that supports dbt 1.8 before we can use unit tests. Luckily it seems like they plan to do this immediately, and the release is currently targeted for May 9.

jeancochrane commented 1 week ago

Our old friend https://github.com/ccao-data/data-architecture/issues/238 bites us again: Unit tests use CTEs to template their fixtures into the compiled test query, meaning that until models with dots in their names are supported by dbt Core, we can't unit test any model that references another model ☹️

@dfsnow Do you think this is enough of a motivation to boost the priority of https://github.com/ccao-data/data-architecture/issues/238, or should we backburner unit tests for now?
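
To make the failure mode concrete, here is a rough sketch of why a dotted model name breaks once it is used as a CTE name. The `__dbt__cte__` prefix, the simplified identifier rule, and the model names are assumptions for illustration, not the exact dbt or Athena behavior:

```python
import re

# Assumed prefix for the CTEs that hold unit test fixtures.
CTE_PREFIX = "__dbt__cte__"

# Simplified rule for an unquoted SQL identifier: letters, digits,
# and underscores only. Real engines have more nuanced rules.
UNQUOTED_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def cte_name(model_name: str) -> str:
    """Build the CTE name that a fixture for this model would get."""
    return CTE_PREFIX + model_name


def is_valid_unquoted_identifier(name: str) -> bool:
    """Check a name against the simplified identifier rule above."""
    return bool(UNQUOTED_IDENTIFIER.match(name))


# Hypothetical model names: one without a dot, one with.
plain_ok = is_valid_unquoted_identifier(cte_name("example_model"))
dotted_ok = is_valid_unquoted_identifier(cte_name("schema.example_model"))
```

Under these assumptions `plain_ok` is true and `dotted_ok` is false, which is the situation described above: any fixture for a dotted model produces a CTE name the engine cannot parse.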

dfsnow commented 1 week ago

> Our old friend #238 bites us again: Unit tests use CTEs to template their fixtures into the compiled test query, meaning that until models with dots in their names are supported by dbt Core, we can't unit test any model that references another model ☹️
>
> @dfsnow Do you think this is enough of a motivation to boost the priority of #238, or should we backburner unit tests for now?

Oof. The main problem here is that you can't have periods in CTE names, right? That seems more like an Athena/dbt-athena problem to me. Perhaps there's a way to fix dbt-athena such that ephemeral models with dots are renamed.
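
A rough sketch of what that renaming could look like, written as a standalone function rather than an actual dbt-athena patch. The function name, the `__dbt__cte__` prefix, the `__` replacement, and the model name are all assumptions:

```python
def safe_cte_name(model_name: str, prefix: str = "__dbt__cte__") -> str:
    """Build a CTE name for a model, replacing dots with underscores."""
    # Assumption: a double underscore is an acceptable stand-in for a dot.
    return prefix + model_name.replace(".", "__")


# Hypothetical dotted model name, for illustration only.
renamed = safe_cte_name("schema.example_model")
```

One caveat with this approach: a model named `schema.example_model` and one named `schema__example_model` would map to the same CTE name, so a real fix would need to guard against collisions.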

Either way, let's check in and see what else is on the list of upcoming stuff before we decide on priority.