Closed jaredx435k2d0 closed 3 months ago
@jaredx435k2d0 we do not propagate source table descriptions.
There can be 100 tables that use some subset of columns from table_A, we would not want to propagate a description from table_A to all 100 clearly different tables. Table descriptions do not propagate. Now a good question lies around the fuzzy boundary between source -> staging models since, typically by virtue of "best practices", they are 1:1. Still we cannot rely on that without some connective tissue, Some metadata.
Agreed on all points!
Now a good question lies around the fuzzy boundary between source -> staging models since, typically by virtue of "best practices", they are 1:1 This specifically would be something I'd expect as a user. That said, I hear you when you say you can't rely on a given project following these practices.
Just to be clear, when yaml document
says it
Automatically generate documentation based on upstream documented columns
, where / when exactly does this apply? Is it infinite from model to model (excluding, as you've said, sources)?
@jaredx435k2d0 So, its specifically for columns, and what it does exactly is this:
The noteworthy details of this are that column names must be matched 1:1 and if you have models A->B->C->D and model A & C have a column called order_number
. When evaluating D, the definition from C will have precedence, as it will overwrite the key in the knowledge dict. When evaluating B, it obviously won't use info from C since its a different fork in the DAG. The effect is pretty natural inheritance. The real opportunity lies in leveraging LLMs to fill in the blanks.
Closing this issue. We support camel case <-> snake case & case insensitive understanding implicitly. Outside of that, we should have a more specific issue if there is a pain here.
Given the following, when I run
yaml refactor
, the description from the source tableaddresses
doesn't bubble up to the stg_some_schema__addresses staging model _stg_some_schema.yml (even though the description from theID
field does):_this_db__some_schema__sources.yml:
_stg_some_schema.yml:
stg_some_schema__addresses.sql:
I'm doing one .yml file per source, and one .yml file per corresponding staging schema
Python 3.10.10 dbt-core==1.4.5 dbt-extractor==0.4.1 dbt-osmosis==0.11.14 dbt-snowflake==1.4.2