z3z1ma / dbt-osmosis

Provides automated YAML management, a dbt server, streamlit workbench, and git-integrated dbt model output diff tools
https://z3z1ma.github.io/dbt-osmosis/
Apache License 2.0

`dbt-osmosis yaml refactor` only populates `data_type` for some columns. #90

Closed waligob closed 9 months ago

waligob commented 9 months ago

After upgrading from 0.11 to 0.12.1 and running `dbt-osmosis yaml refactor`, I found that some, but not all, columns are being populated with `data_type` (added in #82), and I can't make sense of why some columns are being excluded. Here's an example diff before and after running `dbt-osmosis yaml refactor`:

[image: screenshot of the diff]

and here is the associated log output from dbt-osmosis yaml refactor:

INFO     👉 Processing model: model.inquirer_dbt.dim_emails       osmosis.py:865
INFO     🔍 Resolving columns in database
INFO     🔬 Looking for actions for                              osmosis.py:1129
         model.inquirer_dbt.dim_emails                                          
INFO     ✨ Schema file is up to date for model                   osmosis.py:969
         model.inquirer_dbt.dim_emails  

I am running on dbt-bigquery = "1.6.3" and dbt-osmosis = "0.12.1"

syou6162 commented 9 months ago

@waligob FYI @z3z1ma I read the code, and it seems that `data_type` is only populated for columns in `undocumented_columns`. The `job_id` / `email_subject` columns in the screenshot you posted already have a description, so they are not in `undocumented_columns`, which I assume is why their `data_type` is not populated.

https://github.com/z3z1ma/dbt-osmosis/blob/main/src/dbt_osmosis/core/osmosis.py#L1081-L1093
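To illustrate what I mean, here is a rough sketch of the behaviour as I understand it. This is not the actual dbt-osmosis code: the function `populate_data_types`, the `sent_at` column, the descriptions, and the BigQuery types are all made up for the example, and I am assuming "undocumented" simply means "has no description".

```python
# Hypothetical, simplified model of the behaviour described above.
# This is NOT the real dbt-osmosis implementation.

def populate_data_types(yaml_columns, db_columns):
    """Fill data_type only for columns that have no description.

    yaml_columns: list of column dicts as they might appear in a schema file.
    db_columns: mapping of column name -> type resolved from the database.
    """
    result = []
    for col in yaml_columns:
        col = dict(col)  # copy so the input is left untouched
        # Assumption: a column counts as undocumented when its description is empty.
        is_undocumented = not col.get("description")
        if is_undocumented and col["name"] in db_columns:
            col["data_type"] = db_columns[col["name"]]
        result.append(col)
    return result


# Made-up example data: two documented columns and one undocumented column.
yaml_columns = [
    {"name": "job_id", "description": "Identifier of the send job."},
    {"name": "email_subject", "description": "Subject line of the email."},
    {"name": "sent_at"},
]
db_columns = {"job_id": "INT64", "email_subject": "STRING", "sent_at": "TIMESTAMP"}

updated = populate_data_types(yaml_columns, db_columns)
for col in updated:
    print(col["name"], col.get("data_type"))
```

In this sketch only `sent_at` receives a `data_type`, because it is the only column without a description. That would match the pattern in the diff, where the already-documented columns are skipped.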

If you want osmosis to populate `data_type` for columns that already have a description, it may be worth trying the `--force-inheritance` option.

  -F, --force-inheritance If specified, forces documentation to be inherited overriding existing column level documentation where