dbt-labs / dbt-codegen

Macros that generate dbt code
https://hub.getdbt.com/dbt-labs/codegen/latest/
Apache License 2.0

`generate_model_yaml` keeps the docs blocks with `upstream_descriptions` #143

Closed: tuliolima closed this issue 6 months ago

tuliolima commented 10 months ago

Describe the feature

My team uses docs blocks to define column descriptions and reuse them throughout the project. When generating the YAML for a model with `generate_model_yaml` and `upstream_descriptions=True`, we would like the column descriptions to carry over the docs block references defined in the upstream models rather than the rendered description text. This behavior could be controlled with a parameter such as `keep_docs_blocks`.

Describe alternatives you've considered

When using docs blocks, I generate the YAML without descriptions and then fill in the descriptions manually with `doc` macro calls.

Additional context

This is not database-specific.

Who will this benefit?

This will speed up the work of anyone who wants to reduce documentation duplication and inconsistency.

Are you interested in contributing this feature?

Yes, but I have no experience with the project code.

kbj96 commented 6 months ago

Hi @tuliolima, I was looking for a solution to this as well! I'm just starting with dbt, so I couldn't solve it in the code itself. However, I did put together a temporary workaround, so I'm sharing it.

The approach is to point `docs-paths` at a copy of the .md files in which each docs block renders to its own literal `doc()` call instead of the description text.

1. Create a directory to hold the .md files that `docs-paths` will point to while you run `generate_model_yaml`.

I made `docs_` in the same directory as my dbt_project.yml file.

2. Set `docs-paths` in dbt_project.yml to the newly created directory.

3. Create .md files containing the docs blocks that `generate_model_yaml` will use, and write each docs block as below.

    {% docs doc_block_name %}
    {% raw %}
    {{ doc('doc_block_name') }}
    {% endraw %}
    {% enddocs %}

Once `docs-paths` points to these .md files, these stub docs blocks are read instead of your real docs blocks.
This works because the `raw` tag escapes Jinja, so each block renders as the literal text between the raw tags.

Also, once you have generated the YAML you want, change `docs-paths` back to its original value so that dbt goes back to reading the real docs blocks!
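
If there are many docs blocks, writing the stub files by hand gets tedious. Here is a rough Python sketch of how step 3 could be automated. It is not part of dbt or dbt-codegen; the function names, the regex used to find `{% docs ... %}` tags, and the `doc_stubs.md` output file name are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumption: docs blocks open with a tag like {% docs name %} or {%- docs name -%}.
DOCS_TAG = re.compile(r"\{%-?\s*docs\s+([A-Za-z_][A-Za-z0-9_]*)\s*-?%\}")


def find_doc_block_names(markdown_text: str) -> list[str]:
    """Return the names of all docs blocks opened in the given text."""
    return DOCS_TAG.findall(markdown_text)


def make_stub(name: str) -> str:
    """Build a stub docs block that renders to its own literal doc() call."""
    return (
        "{% docs " + name + " %}\n"
        "{% raw %}{{ doc('" + name + "') }}{% endraw %}\n"
        "{% enddocs %}\n"
    )


def write_stubs(real_docs_dir: str, stub_dir: str) -> list[str]:
    """Scan real_docs_dir for .md files and write one stub per docs block.

    All stubs go into a single file, stub_dir/doc_stubs.md (hypothetical name).
    Returns the block names in the order they were first found.
    """
    names: list[str] = []
    for md_file in sorted(Path(real_docs_dir).rglob("*.md")):
        for name in find_doc_block_names(md_file.read_text(encoding="utf-8")):
            if name not in names:  # skip repeated names
                names.append(name)

    out_dir = Path(stub_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stubs = "\n".join(make_stub(name) for name in names)
    (out_dir / "doc_stubs.md").write_text(stubs, encoding="utf-8")
    return names
```

For example, `write_stubs("models", "docs_")` would fill the `docs_` directory from step 1, assuming the real docs blocks live under `models`. The `docs-paths` switch in steps 2 and the switch back afterwards still have to be done by hand.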

Hope this helps!

gwenwindflower commented 6 months ago

I looked into this. Unfortunately, the manifest.json that is used to do all the magic here already has the evaluated/compiled string from the docs block set in it, so we can't pass the docs block code itself from upstream 😢. I don't see a great way to accomplish this without changes to dbt-core and how we handle docs blocks. Bummer, it's a great idea!
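
To illustrate the point about the manifest, here is a small Python sketch. The `SAMPLE_MANIFEST` dictionary is a hypothetical, heavily trimmed stand-in for `target/manifest.json`, and the project, model, and column names in it are invented; only the `nodes` -> `columns` -> `description` nesting is meant to mirror the real file.

```python
# Hypothetical, trimmed stand-in for target/manifest.json (names are invented).
SAMPLE_MANIFEST = {
    "nodes": {
        "model.my_project.stg_orders": {
            "columns": {
                "order_id": {
                    "name": "order_id",
                    # The YAML said "{{ doc('order_id') }}", but only the
                    # rendered text ends up stored here.
                    "description": "Primary key of the orders table.",
                }
            }
        }
    }
}


def column_descriptions(manifest: dict, unique_id: str) -> dict:
    """Return {column_name: description} for one node of the manifest."""
    columns = manifest["nodes"][unique_id]["columns"]
    return {name: col.get("description", "") for name, col in columns.items()}


descriptions = column_descriptions(SAMPLE_MANIFEST, "model.my_project.stg_orders")
for column, text in descriptions.items():
    # No doc() call survives, so there is nothing to copy downstream.
    print(column, "->", text, "| contains doc():", "doc(" in text)
```

With a real project, the same function could be applied to the result of `json.load` on the manifest file to check what the descriptions look like there.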