dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

[Bug] state:modified not working for windows machine #10574

Open abrown-calix opened 4 weeks ago

abrown-calix commented 4 weeks ago

Is this a new bug in dbt-core?

Current Behavior

We have set the environment variables DBT_DEFER=true and DBT_STATE= path to prod manifest file. When we run dbt run -m and inspect manifest.json, the nodes show deferred: true. However, when we run dbt run -m state:modified or dbt run --select state:modified, they show deferred: false. As a result, dbt runs ALL models, because the database listed in manifest.json is not the correct one.

This seems to happen only on Windows machines. A few people on our team use Macs, and everything works fine for them.

Expected Behavior

When we run dbt run -m state:modified, it should run ONLY the modified model, not all models. The deferred flag in manifest.json should be deferred: true.
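
One way to take the environment variables out of the picture while debugging is to pass the equivalent CLI flags explicitly. The following is only a sketch: `--select`, `--state` and `--defer` are existing dbt flags, but `build_dbt_command` is a hypothetical helper and `path/to/prod/artifacts` is a placeholder for the directory holding the production manifest.json.

```python
import shlex


def build_dbt_command(subcommand, selector, state_dir, defer=True):
    """Build a dbt argv list that passes state/defer as flags, not env vars."""
    argv = ["dbt", subcommand, "--select", selector, "--state", state_dir]
    if defer:
        argv.append("--defer")
    return argv


if __name__ == "__main__":
    # Placeholder path: replace with the directory that contains the prod manifest.json.
    cmd = build_dbt_command("ls", "state:modified", "path/to/prod/artifacts", defer=False)
    # Print the command so it can be copied into a terminal.
    print(shlex.join(cmd))
```

If the explicit-flag form selects only the modified models while the environment-variable form does not, that would point at how the variables are being read rather than at state comparison itself.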

Steps To Reproduce

  1. Set up environment variables in the system config (System -> Advanced system settings -> Advanced -> Environment Variables). Add DBT_DEFER = true and DBT_STATE = /path/to/production/manifest
  2. Close terminal
  3. Open terminal
  4. Check out the master branch
  5. Run the command: dbt ls -m state:modified. It should state: "Got a state selector method, but no comparison manifest". Instead, it lists ALL our models
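
Before step 5, it may help to confirm what the state path actually resolves to on the affected machine. This is a minimal sketch, assuming (as dbt's documentation for `--state` describes) that the value should be a directory containing manifest.json rather than the manifest file itself. `check_state_path` is a hypothetical helper, not part of dbt.

```python
import os
from pathlib import Path


def check_state_path(raw_value):
    """Return a short diagnosis string for a DBT_STATE-style value."""
    if not raw_value:
        return "unset or empty"
    path = Path(raw_value)
    if path.is_file():
        return "points at a file; a directory containing manifest.json is expected"
    if not path.is_dir():
        return "directory does not exist"
    if not (path / "manifest.json").is_file():
        return "directory exists but has no manifest.json"
    return "ok"


if __name__ == "__main__":
    # Reads the variable from the current process environment.
    print("DBT_STATE:", check_state_path(os.environ.get("DBT_STATE")))
```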

Relevant log output

Incorrect manifest:

"nodes": {
        "model.analytics_edw.dim_product_oracle": {
            "database": "dbt_dev_agoncalv",
            "schema": "audit",
            "name": "dim_product_oracle",
            "resource_type": "model",
            "package_name": "analytics_edw",
            "path": "audit\\dim_product_oracle.sql",
            "original_file_path": "models\\audit\\dim_product_oracle.sql",
            "unique_id": "model.analytics_edw.dim_product_oracle",
            "fqn": [
                "analytics_edw",
                "audit",
                "dim_product_oracle"
            ],
            "alias": "dim_product_oracle",
            "checksum": {
                "name": "sha256",
                "checksum": "47a67bbdf5791a7dc7767c39dfbd0cfc151a6f7d406bee06c23665fdb48d5032"
            },
            "config": {
                "enabled": true,
                "alias": null,
                "schema": "audit",
                "database": null,
                "tags": [
                    "audit",
                    "prod_ops_multi_source"
                ],
                "meta": {},
                "group": null,
                "materialized": "table",
                "incremental_strategy": null,
                "persist_docs": {},
                "post-hook": [],
                "pre-hook": [],
                "quoting": {},
                "column_types": {},
                "full_refresh": null,
                "unique_key": "item_master_id",
                "on_schema_change": "ignore",
                "on_configuration_change": "apply",
                "grants": {},
                "packages": [],
                "docs": {
                    "show": true,
                    "node_color": null
                },
                "contract": {
                    "enforced": false,
                    "alias_types": true
                },
                "access": "protected"
            },
            "tags": [
                "audit",
                "prod_ops_multi_source"
            ],
            "description": "The data from this table is derived from PC_FIVETRAN_DB.PROD_OPS.ITEM_MASTER_110.",
            "columns": {
                "item_master_id": {
                    "name": "item_master_id",
                    "description": "",
                    "meta": {},
                    "data_type": null,
                    "constraints": [],
                    "quote": null,
                    "tags": []
                }
            },
            "meta": {},
            "group": null,
            "docs": {
                "show": true,
                "node_color": null
            },
            "patch_path": "analytics_edw://models\\edw\\product_ops\\_schema.yml",
            "build_path": null,
            "deferred": false,

Correct manifest:

"nodes": {
        "model.analytics_edw.dim_product_oracle": {
            "database": "dbt_dev_agoncalv",
            "schema": "audit",
            "name": "dim_product_oracle",
            "resource_type": "model",
            "package_name": "analytics_edw",
            "path": "audit\\dim_product_oracle.sql",
            "original_file_path": "models\\audit\\dim_product_oracle.sql",
            "unique_id": "model.analytics_edw.dim_product_oracle",
            "fqn": [
                "analytics_edw",
                "audit",
                "dim_product_oracle"
            ],
            "alias": "dim_product_oracle",
            "checksum": {
                "name": "sha256",
                "checksum": "47a67bbdf5791a7dc7767c39dfbd0cfc151a6f7d406bee06c23665fdb48d5032"
            },
            "config": {
                "enabled": true,
                "alias": null,
                "schema": "audit",
                "database": null,
                "tags": [
                    "audit",
                    "prod_ops_multi_source"
                ],
                "meta": {},
                "group": null,
                "materialized": "table",
                "incremental_strategy": null,
                "persist_docs": {},
                "post-hook": [],
                "pre-hook": [],
                "quoting": {},
                "column_types": {},
                "full_refresh": null,
                "unique_key": "item_master_id",
                "on_schema_change": "ignore",
                "on_configuration_change": "apply",
                "grants": {},
                "packages": [],
                "docs": {
                    "show": true,
                    "node_color": null
                },
                "contract": {
                    "enforced": false,
                    "alias_types": true
                },
                "access": "protected"
            },
            "tags": [
                "audit",
                "prod_ops_multi_source"
            ],
            "description": "The data from this table is derived from PC_FIVETRAN_DB.PROD_OPS.ITEM_MASTER_110.",
            "columns": {
                "item_master_id": {
                    "name": "item_master_id",
                    "description": "",
                    "meta": {},
                    "data_type": null,
                    "constraints": [],
                    "quote": null,
                    "tags": []
                }
            },
            "meta": {},
            "group": null,
            "docs": {
                "show": true,
                "node_color": null
            },
            "patch_path": "analytics_edw://models\\edw\\product_ops\\_schema.yml",
            "build_path": null,
            "deferred": true,
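
The two fragments are long, so the difference is easy to miss by eye; among the keys shown, only `deferred` differs. A recursive comparison along the following lines could confirm that on the full files. This is a sketch, assuming both manifests have been saved to disk; the file names in the usage comment and the helper `diff_values` are hypothetical.

```python
import json
import sys


def diff_values(left, right, prefix=""):
    """Yield (path, left_value, right_value) for every leaf that differs.

    A key missing on one side is treated as None on that side.
    """
    if isinstance(left, dict) and isinstance(right, dict):
        for key in sorted(set(left) | set(right)):
            yield from diff_values(left.get(key), right.get(key), f"{prefix}/{key}")
    elif left != right:
        yield prefix, left, right


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python diff_manifest.py manifest_incorrect.json manifest_correct.json
    with open(sys.argv[1], encoding="utf-8") as f1, open(sys.argv[2], encoding="utf-8") as f2:
        a, b = json.load(f1)["nodes"], json.load(f2)["nodes"]
    for path, old, new in diff_values(a, b):
        print(f"{path}: {old!r} -> {new!r}")
```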

Environment

- OS: Windows
- Python: 3.12
- dbt: 1.7.9
- Plugins:
    - snowflake: 1.7.2

Which database adapter are you using with dbt?

snowflake

Additional Context

[Two screenshots attached: image (2), image (1)]

dbeatty10 commented 3 weeks ago

Thanks for reaching out @abrown-calix !

Could you double-check that the DBT_DEFER and DBT_STATE environment variables are available in their Windows terminal?

The precise instructions to check those environment variables will vary depending on the Windows terminal they are using. But I'll try to include a few of the most common ones.

Command Prompt (CMD)

echo %DBT_DEFER%
echo %DBT_STATE%

PowerShell

$env:DBT_DEFER
$env:DBT_STATE

Windows Subsystem for Linux (WSL) or other Unix-like shells on Windows

echo $DBT_DEFER
echo $DBT_STATE
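
Since dbt runs inside Python, it may also be worth checking what the Python process itself sees. This is a minimal sketch using only the standard library; printing `repr()` makes trailing whitespace or other stray characters visible that are easy to miss in `echo` output. `describe_env` is a hypothetical helper.

```python
import os


def describe_env(names, environ=None):
    """Map each variable name to the repr of its value, or '<unset>'."""
    environ = os.environ if environ is None else environ
    return {name: repr(environ[name]) if name in environ else "<unset>" for name in names}


if __name__ == "__main__":
    # Run this from the same terminal session that runs dbt.
    for name, shown in describe_env(["DBT_DEFER", "DBT_STATE"]).items():
        print(f"{name} = {shown}")
```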

abrown-calix commented 3 weeks ago

[screenshot] In each terminal, the variables show the expected values.

dbeatty10 commented 3 weeks ago

Could you try a few other things and see if any of them work for the Windows users?

abrown-calix commented 2 weeks ago

[three screenshots]

We have already tried upgrading to dbt 1.8, and unfortunately it still runs all the models.

abrown-calix commented 2 weeks ago

We have tried with quotes [screenshot]. One of our Windows users has Python 3.11.8, and it behaves the same way.