dbt-labs / dbt-codegen

Macros that generate dbt code
https://hub.getdbt.com/dbt-labs/codegen/latest/
Apache License 2.0
459 stars 99 forks

Fill in model/column descriptions when available in database metadata #160

Open jenna-jordan opened 6 months ago

jenna-jordan commented 6 months ago

Describe the feature

Model and column descriptions can be stored in the database using the persist_docs feature. It is sometimes useful to "regenerate" the YAML files for models when tweaks have been made. However, the regenerated YAML has empty descriptions, meaning you need to manually copy and paste the old descriptions. If these descriptions have been stored in the database (persist_docs is set to true), then that could be used as the source to fill in already written descriptions.

Describe alternatives you've considered

Manually copy and paste descriptions from the old YAML into the new YAML.
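For illustration, the manual copy-and-paste step could be scripted roughly as follows. This is a minimal sketch, not part of codegen: it assumes the old and regenerated `columns` lists have already been parsed from YAML into plain dicts with `name` and `description` keys, and the function name `carry_over_descriptions` is hypothetical.

```python
def carry_over_descriptions(old_columns, new_columns):
    """Fill empty descriptions in new_columns from old_columns, matched by name.

    Hypothetical helper: both arguments are lists of dicts shaped like
    {"name": "...", "description": "..."}, as parsed from a dbt YAML file.
    """
    # Index the old descriptions by lower-cased column name.
    old_by_name = {
        col["name"].lower(): col.get("description", "")
        for col in old_columns
    }

    merged = []
    for col in new_columns:
        updated = dict(col)  # copy so the input list is left untouched
        if not updated.get("description"):
            # Columns with no old entry keep an empty description.
            updated["description"] = old_by_name.get(col["name"].lower(), "")
        merged.append(updated)
    return merged
```

Matching on lower-cased names is an assumption made here so that `ID` and `id` are treated as the same column; a stricter match may suit some warehouses better.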

Additional context

Different databases may store these descriptions differently, but you should be able to reference how persist_docs handles it.

Who will this benefit?

Anybody using codegen to automate the creation of source/model YAML files with descriptions already written.

Are you interested in contributing this feature?

I'm not currently able to

gwenwindflower commented 5 months ago

Yep, makes sense. Let me ask our Core DX embed how persist_docs works and see how feasible this is. If it's relatively doable, this totally makes sense, especially now that, as of yesterday, we're pulling descriptions from sources as well. Sources and models should both be able to pull from the db comments or metadata, if it's not too gnarly to pull off.

jgooly commented 5 months ago

+1

In our case, we have several teams that consume Snowflake tables built by other teams (not using dbt) that have column descriptions as comments. We'd like to extract those comments as column descriptions when creating the source yml files.
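As a rough sketch of that use case, the snippet below turns (column name, comment) pairs into a source YAML block. It assumes the pairs were already fetched from the warehouse, for example from the `COLUMN_NAME` and `COMMENT` columns of Snowflake's `INFORMATION_SCHEMA.COLUMNS` view. The function name `source_yaml_from_comments` and the choice to lower-case column names are illustrative assumptions, not how codegen does or would implement this.

```python
import json


def source_yaml_from_comments(source_name, table_name, columns):
    """Render a dbt source YAML block with descriptions taken from comments.

    Hypothetical helper: `columns` is an iterable of (column_name, comment)
    pairs, where comment may be None if the column has no comment.
    """
    lines = [
        "version: 2",
        "",
        "sources:",
        f"  - name: {source_name}",
        "    tables:",
        f"      - name: {table_name}",
        "        columns:",
    ]
    for column_name, comment in columns:
        lines.append(f"          - name: {column_name.lower()}")
        # json.dumps produces a double-quoted, escaped string, which is
        # also a valid YAML scalar, so quotes and colons in comments are safe.
        lines.append(f"            description: {json.dumps(comment or '')}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    rows = [("ORDER_ID", "Unique order id"), ("STATUS", None)]
    print(source_yaml_from_comments("raw", "orders", rows))
```

Columns without a comment get an empty description, which matches what the generated YAML contains today.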