astronomer / astronomer-cosmos

Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code
https://astronomer.github.io/astronomer-cosmos/
Apache License 2.0

[Bug] LoadMode.DBT_MANIFEST dbt deps not taking into account dbt vars #1188

Open fabiomx opened 2 months ago

fabiomx commented 2 months ago

Astronomer Cosmos Version

Other Astronomer Cosmos version (please specify below)

If "Other Astronomer Cosmos version" selected, which one?

1.6.0

dbt-core version

1.5.4

Versions of dbt adapters

dbt-snowflake 1.5.4

LoadMode

DBT_LS_MANIFEST

ExecutionMode

LOCAL

InvocationMode

DBT_RUNNER

airflow version

2.7.3

Operating System

Ubuntu 20.04.6 LTS

If you think it's a UI issue, what browsers are you seeing the problem on?

No response

Deployment

Google Cloud Composer

Deployment details

No response

What happened?

The issue looks similar to this one https://github.com/astronomer/astronomer-cosmos/issues/1112, but with DBT_MANIFEST mode.

When running dbt deps, the dbt variables are not being considered, while they are correctly retrieved during dbt run. See the log below.

Relevant log output

...
[2024-09-03, 18:14:30 CEST] {local.py:303} INFO - Trying to run dbtRunner with:
 ['deps', '--project-dir', '/tmp/tmp1b6fhdl6', '--profiles-dir', '/tmp/cosmos/profile/7345009a52a3d9602f0de064bd49df4ffa6f92ca45895deae8ab579d1dba90e3', '--profile', 'project_name_xxx', '--target', 'dev']
 in /tmp/tmp1b6fhdl6
...
[2024-09-03, 18:14:38 CEST] {local.py:303} INFO - Trying to run dbtRunner with:
 ['run', '--vars', 'VAR_1_NAME: VAR_1_VALUE\n', '--models', 'model_name', '--project-dir', '/tmp/tmp1b6fhdl6', '--profiles-dir', '/tmp/cosmos/profile/7345009a52a3d9602f0de064bd49df4ffa6f92ca45895deae8ab579d1dba90e3', '--profile', 'project_name_xxx', '--target', 'dev']
 in /tmp/tmp1b6fhdl6

How to reproduce

1) Enable deps installation ("install_deps": True in operator_args).
2) Set dbt_vars in ProjectConfig.
3) Run DbtDag/DbtTaskGroup.
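The reproduction steps above can be sketched roughly as the following DAG definition. This is a minimal sketch, not the reporter's actual code: the DAG id, start date, variable name and value, and every filesystem path are placeholders assumed for illustration, and the configuration simply mirrors the settings listed in the report (LoadMode.DBT_MANIFEST, ExecutionMode.LOCAL, InvocationMode.DBT_RUNNER).

```python
from datetime import datetime


def build_repro_dag():
    """Build a DbtDag resembling the reported setup (all paths are placeholders)."""
    # Imported inside the function so the module still loads where
    # Airflow and Cosmos are not installed.
    from cosmos import (
        DbtDag,
        ExecutionConfig,
        ProfileConfig,
        ProjectConfig,
        RenderConfig,
    )
    from cosmos.constants import ExecutionMode, InvocationMode, LoadMode

    return DbtDag(
        dag_id="repro_dbt_deps_vars",  # hypothetical name
        start_date=datetime(2024, 9, 1),
        schedule=None,
        project_config=ProjectConfig(
            dbt_project_path="/path/to/dbt/project",  # placeholder
            manifest_path="/path/to/dbt/project/target/manifest.json",  # placeholder
            # Step 2: dbt variables set on the project configuration.
            dbt_vars={"VAR_1_NAME": "VAR_1_VALUE"},
        ),
        profile_config=ProfileConfig(
            profile_name="project_name_xxx",
            target_name="dev",
            profiles_yml_filepath="/path/to/profiles.yml",  # placeholder
        ),
        render_config=RenderConfig(load_method=LoadMode.DBT_MANIFEST),
        execution_config=ExecutionConfig(
            execution_mode=ExecutionMode.LOCAL,
            invocation_mode=InvocationMode.DBT_RUNNER,
        ),
        # Step 1: ask each task to run `dbt deps` before its dbt command.
        operator_args={"install_deps": True},
    )
```

In a real DAG file the result would be assigned at module level (for example `dag = build_repro_dag()`) so that Airflow can discover it (step 3).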

Anything else :)?

No response

Are you willing to submit PR?

Contact Details

No response

dosubot[bot] commented 2 months ago

I found a similar open issue that might be relevant to your problem: [Bug] . The suggested solution involves ensuring that dbt_vars are correctly passed and utilized during the dbt deps command. Here are the key points:

  1. Using env_var() vs. var():

    • env_var() works fine because it fetches environment variables at runtime.
    • var() and dbt_vars need to be resolved during the compilation phase of dbt.
  2. Setting dbt_vars:

    • Ensure RenderConfig.dbt_deps is set to True.
    • Define dbt_vars in the project configuration.
    • Pass dbt_vars through the operator_args dictionary when instantiating DbtDag or DbtTaskGroup.

Here is an example of setting dbt_vars in operator_args:

DbtTaskGroup(
    # ...
    operator_args={
        "vars": {
            "start_time": "{{ data_interval_start.strftime('%Y%m%d%H%M%S') }}",
            "end_time": "{{ data_interval_end.strftime('%Y%m%d%H%M%S') }}",
        },
    }
)

Additionally, the dbt deps command does not consider dbt variables in LoadMode.DBT_MANIFEST because the variables are not included in the manifest file. The manifest file is a static artifact generated by dbt that does not dynamically incorporate variables. In contrast, dbt run processes the dbt project in real-time, allowing it to correctly retrieve and apply variables [1].

To ensure dbt deps correctly takes dbt_vars into account, you need to explicitly add the variables to the command arguments. This can be done by modifying the run_dbt_deps method to include the --vars flag with the dbt variables [2].
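To illustrate the idea of adding the variables to the command arguments, here is a minimal standalone sketch. It is not Cosmos code: `build_dbt_cmd` is a hypothetical helper, the paths are placeholders, and it assumes that `dbt deps` accepts `--vars` in the same way `dbt run` does. The variables are serialized as JSON here (which is also valid YAML) to keep the example dependency-free, whereas the log above shows a YAML string.

```python
import json
from typing import Any, Dict, List, Optional


def build_dbt_cmd(
    subcommand: str,
    project_dir: str,
    profiles_dir: str,
    profile: str,
    target: str,
    dbt_vars: Optional[Dict[str, Any]] = None,
) -> List[str]:
    """Hypothetical helper: assemble a dbt argument list, including --vars if given."""
    cmd = [subcommand]
    if dbt_vars:
        # Same flag for every subcommand, so `deps` and `run` see identical variables.
        cmd += ["--vars", json.dumps(dbt_vars, sort_keys=True)]
    cmd += [
        "--project-dir", project_dir,
        "--profiles-dir", profiles_dir,
        "--profile", profile,
        "--target", target,
    ]
    return cmd


# Placeholder paths; the variable name and value mirror the log in the report.
deps_cmd = build_dbt_cmd(
    "deps",
    "/tmp/project",
    "/tmp/profiles",
    "project_name_xxx",
    "dev",
    dbt_vars={"VAR_1_NAME": "VAR_1_VALUE"},
)
```

With a shared builder like this, the `deps` argument list carries the same `--vars` entry as the `run` one, which is the behavior the report expects.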

If you need further assistance, please let me know!


github-actions[bot] commented 2 weeks ago

This issue is stale because it has been open for 30 days with no activity.