dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

[CT-369] [Feature] Use variables declared in dbt_project.yml file in the file #4873

Open solomonshorser opened 2 years ago

solomonshorser commented 2 years ago

Is there an existing feature request for this?

Describe the Feature

When I declare a variable in a dbt_project file, I sometimes need to use it in the same file. For example:

vars:
  special_schema: "my_special_schema"

models:
  my_staging_models:
    staging:
      +materialized: view
      common:
        +enabled: true
        +schema: "{{ var('my_special_schema') }}"

special_schema is declared in the project-level vars because I need it in several places. I also need to specify this schema as the schema for some models. Currently when I do the above, I get an error:

Compilation Error Could not render {{ var('my_special_schema') }}: Required var 'my_special_schema' not found in config: Vars supplied to = {}

Describe alternatives you've considered

Currently, I am adding a config block to every model to specify the schema, and that works, but with many models, it gets very repetitive. It would be nice to specify the schema once (using a variable), in the project file, and not have to do it again.

I suppose I could also include --vars "my_special_schema: some_schema" when running dbt, but for ease of use by others, I'd prefer to use the vars: block in the project file. We have many variables, and there are a few more I'd like to use in this way. Passing them all from the CLI makes for messier scripts.
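For illustration, the CLI route could be wrapped in a small helper so the scripts stay readable. This is only a sketch: build_dbt_command is a hypothetical name, and it assumes that --vars accepts a YAML dictionary string, for which a JSON object is valid input.

```python
import json
import shlex


def build_dbt_command(subcommand, project_vars):
    """Return an argv list that passes project_vars to dbt via --vars.

    Hypothetical helper: the values are serialized as a JSON object,
    on the assumption that --vars takes a YAML dictionary string.
    """
    argv = ["dbt", subcommand]
    if project_vars:
        argv += ["--vars", json.dumps(project_vars, sort_keys=True)]
    return argv


cmd = build_dbt_command("run", {"my_special_schema": "some_schema"})
# shlex.join quotes the JSON payload so it can be pasted into a shell.
print(shlex.join(cmd))
```

Every variable still has to be maintained outside the project file, which is the repetition this request is trying to avoid.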

Who will this benefit?

Everyone who wants to make use of variables in the project file, for variables that are declared in the project file.

Are you interested in contributing this feature?

Yes, but I'm not sure I have the necessary skills or time.

Anything else?

No response

gshank commented 2 years ago

It would make a lot of sense to be able to separate out the variables. I could see doing this in two different ways: moving the vars into their own file or pulling the vars out of the dbt_project file first and using them to process the rest of the file.
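A rough sketch of the second option (pull the vars out first, then use them to process the rest of the file) might look like the following. It is not dbt's renderer: it works on an already-parsed dictionary, the function names are made up for illustration, and a small regex stands in for Jinja, handling only the {{ var('name') }} form.

```python
import re

# Stand-in for Jinja: matches only {{ var('name') }} with single quotes.
VAR_CALL = re.compile(r"\{\{\s*var\(\s*'([^']+)'\s*\)\s*\}\}")


def render_value(value, project_vars):
    """Recursively substitute var() calls in strings, dicts, and lists."""
    if isinstance(value, str):
        def lookup(match):
            name = match.group(1)
            if name not in project_vars:
                raise KeyError(f"Required var '{name}' not found")
            return str(project_vars[name])
        return VAR_CALL.sub(lookup, value)
    if isinstance(value, dict):
        return {k: render_value(v, project_vars) for k, v in value.items()}
    if isinstance(value, list):
        return [render_value(v, project_vars) for v in value]
    return value


def render_project(project):
    """Two passes: read the vars block, then render everything else with it."""
    project_vars = project.get("vars", {})  # pass 1: vars are taken as-is
    rendered = {
        key: render_value(value, project_vars)  # pass 2: the rest of the file
        for key, value in project.items()
        if key != "vars"
    }
    rendered["vars"] = project_vars
    return rendered


project = {
    "vars": {"special_schema": "my_special_schema"},
    "models": {
        "my_staging_models": {
            "staging": {
                "+materialized": "view",
                "common": {
                    "+enabled": True,
                    "+schema": "{{ var('special_schema') }}",
                },
            }
        }
    },
}
rendered = render_project(project)
print(rendered["models"]["my_staging_models"]["staging"]["common"]["+schema"])
```

The vars block itself is deliberately left unrendered here, which sidesteps the question of vars that refer to other vars.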

solomonshorser commented 2 years ago

Having the vars in a separate file as one option could make life easier for some devops people as they could move a project around between environments and instead of modifying the vars in the project file, they could just swap different vars files in different environments and leave everything else untouched.

joaoheron commented 2 years ago

Looking forward to having something like this as well. Thanks for the initiative @solomonshorser

github-actions[bot] commented 1 year ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] commented 1 year ago

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

solomonshorser commented 1 year ago

So will this feature be available in the future?

TwBridges commented 1 year ago

Upvoting this as I am also running into this issue

msohaill commented 1 year ago

Not sure if it's been mentioned, and although it's redundant, you could use the default parameter of the var() function (see the docs).

vars:
  special_schema: "my_special_schema"

models:
  my_staging_models:
    staging:
      +materialized: view
      common:
        +enabled: true
        +schema: "{{ var('special_schema', 'my_special_schema') }}"
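
To spell out why the default helps, here is a minimal model of the lookup. It is a sketch of the var(name, default) behaviour, not dbt's implementation, and it assumes that only vars supplied on the command line are visible while dbt_project.yml is rendered, as the "Vars supplied to = {}" error above suggests. The supplied parameter is added purely for illustration.

```python
_MISSING = object()  # sentinel so that None can still be a valid default


def var(name, default=_MISSING, supplied=None):
    """Return the supplied value, else the default, else raise."""
    supplied = supplied or {}
    if name in supplied:
        return supplied[name]
    if default is not _MISSING:
        return default
    raise LookupError(f"Required var '{name}' not found in config")


# Nothing supplied: the default is used, so the file renders.
print(var("special_schema", "my_special_schema", supplied={}))

# A supplied value takes precedence over the default.
print(var("special_schema", "my_special_schema",
          supplied={"special_schema": "other_schema"}))
```

The redundancy is that the value now lives in two places, the vars: block and the default, and they have to be kept in sync by hand.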

TwBridges commented 1 year ago

I'm trying to set a config for metadata and reference it like below

alerting_config: 
    team_alerting: 
      owner: ['@test_user1', '@test_user2']
      slack_channel: '#data-team-dbt-alerting'
    ...etc

models:
  core:
   +schema: 'core'
   +meta:
      owner: "{{ var('alerting_config')['team_alerting']['owner'] }}"
      channel: "{{ var('alerting_config')['team_alerting']['slack_channel'] }}"
   ...etc 

But am also running into the issue outlined above

foundinblank commented 1 year ago

Same here. It'd be awesome to have vars defined in dbt_project.yml also be reference-able within that file. It would help keep things DRY.

jtcohen6 commented 1 year ago

Reopening for now. No update on priority or timeline; I just know this is a small pain felt by many.

geo909 commented 10 months ago

Are there any updates on this ticket? Is there any chance that this is going to be considered anytime soon?

wcmbishop commented 5 months ago

+1 on this feature request. I'm hitting the same constraint when trying to leverage vars in YAML config within dbt_project.yml. Specifically, I'm trying to define environment-specific grants:

models:
  my_project:
    staging:
      +materialized: view
      +schema: staging
      +grants:
        select:
          - "developer"
          - >
            {%- if target.name == 'prod' -%}reporter
            {%- else -%}{{ var('default_group') }}
            {%- endif -%}

vars:
  default_group: "transformer"

kverburg commented 3 months ago

Would love this feature!

GKTheOne commented 2 months ago

If variables become usable in dbt_project.yml, then I would want a way to access the values externally after interpolation: for example, dbt info --format json (maybe also needing the --quiet flag) to write a JSON blob of the processed dbt_project.yml file to stdout. (JSON is just an example; I would like it to output YAML by default.)

I can go into technical details if wanted but simply, I have some library code that wants to know the location of the dbt artifacts. For production it is easy enough to parse/compile the dbt project and put the artifacts in a known/static location during the (CI) build process for the library code to find. During development, the library code will parse/compile the project at startup to ensure the artifacts are fresh.

The complication is that the dbt target path is configurable, so now the library code needs to know where the target path is. We can enforce a given target path when running dbt, but what about other tooling? Maybe the developer has changed the target-path value for a reason. Who knows 🤷, and it's not up to the code to tell the developer they can't.

A simple solution is to read and parse the dbt_project.yml file. target-path is a simple scalar value, just a quick YAML load and value fetch away. Now we know where artifacts will be produced, done. If variables are allowed everywhere in dbt_project.yml, i.e., if target-path could be a variable, then the library code now needs to perform variable lookup and Jinja interpolation 😨.
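
To make the concern concrete, here is roughly what that lookup looks like today and where it would break. This is a simplified sketch: read_target_path is a hypothetical helper, it uses a regex on the raw text to stay within the standard library (real tooling would more likely use a YAML parser such as PyYAML's yaml.safe_load), and it assumes target-path is a top-level scalar that falls back to target when absent.

```python
import re

# Top-level "target-path:" key only; nested or multi-line values are ignored.
TOP_LEVEL_TARGET_PATH = re.compile(
    r"^target-path:[ \t]*(.*?)[ \t]*$", re.MULTILINE
)


def read_target_path(project_yaml_text, default="target"):
    """Return (path, needs_rendering) from raw dbt_project.yml text."""
    match = TOP_LEVEL_TARGET_PATH.search(project_yaml_text)
    if match is None or not match.group(1):
        return default, False
    raw = match.group(1)
    # Drop one pair of matching surrounding quotes, if present.
    if len(raw) >= 2 and raw[0] == raw[-1] and raw[0] in "'\"":
        raw = raw[1:-1]
    # A templated value cannot be used as a path without interpolation.
    return raw, "{{" in raw


plain = 'name: my_project\ntarget-path: "build/artifacts"\n'
templated = "name: my_project\ntarget-path: \"{{ var('artifact_dir') }}\"\n"

print(read_target_path(plain))
print(read_target_path(templated))
print(read_target_path("name: my_project\n"))
```

In the templated case the second element comes back True, which is exactly the point at which external tooling would have to reimplement variable lookup and rendering.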

So now I'm no longer sure that I want variables to be used in the dbt_project.yml file 🤔 (but I would still like to be able to use variables in sub-section configs like flags, models, sources, etc.).

Possible solutions exist to allow the use of variables in the dbt_project.yml file, and I'm sure that one day this will happen. When it does I will definitely want a means for external tooling to be able to know information about the dbt project without having to handle parsing.