followingell commented 2 years ago

Issue

Closes https://github.com/slidoapp/dbt-coverage/issues/24:

"Please add options in the CLI to include and exclude models to filter out the checks in some of the models or a path."

Summary

I added the ability to run compute commands on only a subset of tables by adding a --model-path-filter option. Models are selected based on their original_file_path value (taken from the manifest.json artifact), and the option can be passed more than once to combine several paths.

This means dbt-coverage can now be used in monolithic dbt projects that contain sub-projects owned by different teams. Without model selection, using dbt-coverage in such a structure was not advisable: an unrelated team could decrease the overall coverage and thereby block PR merging (for example, if dbt-coverage is integrated into a CI/CD pipeline).
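To illustrate the idea of path-based selection described above, here is a minimal sketch. It is not the code from this PR: the function names (matches_filter, select_models) and the simplified node dictionaries are hypothetical, and it assumes that a filter matches when it is a leading part of a model's original_file_path, which is consistent with the README examples but not confirmed here.

```python
from pathlib import PurePosixPath


def matches_filter(original_file_path: str, path_filter: str) -> bool:
    """Return True if path_filter is a leading part of original_file_path.

    Comparing path components (rather than raw strings) means that
    'models/staging' does not accidentally match 'models/staging_old/...'.
    """
    path_parts = PurePosixPath(original_file_path).parts
    filter_parts = PurePosixPath(path_filter).parts
    return path_parts[: len(filter_parts)] == filter_parts


def select_models(nodes: dict, path_filters: list) -> dict:
    """Keep only the nodes whose original_file_path matches any filter.

    With no filters, every node is kept (assumed default behaviour).
    """
    if not path_filters:
        return dict(nodes)
    return {
        unique_id: node
        for unique_id, node in nodes.items()
        if any(matches_filter(node["original_file_path"], f) for f in path_filters)
    }


# Simplified stand-in for the "nodes" section of manifest.json (assumption).
nodes = {
    "model.jaffle_shop.orders": {"original_file_path": "models/orders.sql"},
    "model.jaffle_shop.customers": {"original_file_path": "models/customers.sql"},
    "model.jaffle_shop.stg_orders": {
        "original_file_path": "models/staging/stg_orders.sql"
    },
}

selected = select_models(nodes, ["models/orders.sql", "models/staging/"])
print(sorted(selected))
```

Matching on whole path components is a design choice of this sketch; a plain string prefix check would also select sibling directories that merely share a prefix.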

See example of added functionality from updated README.md below:

$ cd jaffle_shop
$ dbt run  # Materialize models
$ dbt docs generate  # Generate catalog.json and manifest.json
$ dbt-coverage compute doc --cov-report coverage-doc.json --model-path-filter models/staging/  # Compute doc coverage for a subset of tables, print it and write it to coverage-doc.json file

Coverage report
======================================================
jaffle_shop.stg_customers              0/3       0.0%
jaffle_shop.stg_orders                 0/4       0.0%
jaffle_shop.stg_payments               0/4       0.0%
======================================================
Total                                  0/11      0.0%

$ dbt-coverage compute doc --cov-report coverage-doc.json --model-path-filter models/orders.sql --model-path-filter models/staging/  # Compute doc coverage for a subset of tables, print it and write it to coverage-doc.json file

Coverage report
======================================================
jaffle_shop.orders                     0/9       0.0%
jaffle_shop.stg_customers              0/3       0.0%
jaffle_shop.stg_orders                 0/4       0.0%
jaffle_shop.stg_payments               0/4       0.0%
======================================================
Total                                  0/20      0.0%

Note: this is a relatively 'rough' solution and there are likely many improvements that could be made to my code / far more elegant implementations that would achieve the same functionality. Please, feel free to suggest changes!

Testing

I have tested these changes on dbt's jaffle_shop 'testing project' and have not encountered issues so far.

followingell commented 2 years ago

Just checking, is anyone available to review this please @sweco, @mrshu?

Is there anything that I can add to the PR to make the review process easier for yourselves?

mrshu commented 2 years ago

Thanks for the PR @followingell -- I'll take a closer look at it later today 🙂

followingell commented 2 years ago

"@followingell Is there any specific reason for PurePath here as opposed to using Path that has already been in use?"

@mrshu I'm not sure where the above comment has gone, so I'm responding here rather than via email:

I think this StackOverflow answer sums it up quite nicely. Essentially, PurePath only performs string-like path operations, whereas Path can also perform I/O, which we don't need here. As such, I chose to use the simpler parent class.
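As a small illustration of that distinction (a sketch using only the standard library's pathlib, with an example path taken from jaffle_shop):

```python
from pathlib import Path, PurePath, PurePosixPath

# Pure paths only manipulate the path string; they never touch the filesystem.
p = PurePosixPath("models/staging/stg_orders.sql")
name = p.name            # final path component
parent = str(p.parent)   # everything before the final component
suffix = p.suffix        # file extension, including the dot

# Path subclasses PurePath and adds filesystem methods such as exists().
path_is_subclass = issubclass(Path, PurePath)
pure_has_exists = hasattr(PurePath, "exists")
path_has_exists = hasattr(Path, "exists")

print(name, parent, suffix)
print(path_is_subclass, pure_has_exists, path_has_exists)
```

Since the filter only compares path strings from manifest.json, none of the I/O methods are required.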

If you'd rather I just use Path then I'm happy to do so.

followingell commented 2 years ago

@mrshu FYI, I have finished with my changes for now, so this is ready for review 👍

mrshu commented 2 years ago

Thanks again @followingell, this has now been released in 0.3.0!

slidoapp / dbt-coverage

Allow model filtering #45
