dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

WritableManifest has invalid value #10557

Closed 7pandeys closed 1 month ago

7pandeys commented 1 month ago

Is this a new bug in dbt-core?

Current Behavior

dbt fails with "WritableManifest has invalid value" after upgrading from dbt 1.4 to 1.8.

Expected Behavior

manifest.json should get created

Steps To Reproduce

  1. Set up dbt 1.4.3 with an incremental model.
  2. Upgrade the setup to dbt 1.8. The upgrade required an on_schema_change parameter, so add it to the model config.
  3. Run: dbt build --threads 8 --exclude resource_type:seed resource_type:test --select @state:new @state:modified.body @state:modified.configs @state:modified.macros --state ./dbt-artifacts --vars {"full_data": false}

Relevant log output

No response

Environment

- OS: Linux
- Python: 3.9
- dbt: 1.4 to 1.8

Which database adapter are you using with dbt?

bigquery

Additional Context

No response

dbeatty10 commented 1 month ago

Thanks for reaching out @7pandeys !

I wasn't able to replicate this with the following simple example:

{{
    config(
        materialized='incremental'
    )
}}

select 1 as id

Using dbt 1.4:

dbt build --target-path dbt-artifacts

Then change select 1 as id to select 2 as id so that it will be picked up by state:modified.body.

Switch to dbt 1.8:

dbt build --threads 8 --exclude resource_type:seed resource_type:test --select @state:new @state:modified.body @state:modified.configs @state:modified.macros --state ./dbt-artifacts --vars '{"full_data": false}'

Could you share a simplified example of your relevant model(s) along with your log output so we can try to reproduce what you are seeing?
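
One detail in the commands above: the original report passes --vars {"full_data": false} without quotes, while the command here wraps the JSON in single quotes. As a rough illustration (using Python's shlex as a stand-in for POSIX shell word splitting, which is an approximation and not how dbt itself parses arguments), the unquoted form reaches the program as two separate words with the double quotes stripped:

```python
import json
import shlex

# Unquoted: the shell strips the double quotes and splits on the space.
unquoted = shlex.split('dbt build --vars {"full_data": false}')

# Single-quoted: the JSON arrives as one intact argument.
quoted = shlex.split("""dbt build --vars '{"full_data": false}'""")

print(unquoted[2:])  # ['--vars', '{full_data:', 'false}']
print(quoted[2:])    # ['--vars', '{"full_data": false}']

# Only the quoted form is still valid JSON once it reaches the program.
print(json.loads(quoted[-1]))  # {'full_data': False}
```

This only illustrates the quoting difference between the two commands; nothing in this thread indicates that it causes the reported error.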

7pandeys commented 1 month ago

Example: 1.4

models:
  - name: my_model
    config:
      contract:
        enforced: true
        alias_types: false  # true by default

1.8

models:
  - name: my_model
    config:
      contract:
        enforced: false
        alias_types: false  # true by default

As you can see, I intend to disable the contract in v1.8.

https://github.com/dbt-labs/dbt-core/issues/10298 describes the same case that I am now facing in CI (Jenkins).

dbt-bigquery version: 1.8.2, dbt-core version: 1.8.5

dbeatty10 commented 1 month ago

Thanks for providing this @7pandeys, but I still wasn't able to reproduce any errors.

Here's what I tried:

v1.4

{{
    config(
        materialized='incremental',
    )
}}

select 1 as id

version: 2

models:
  - name: my_model
    config:
      contract:
        enforced: true
        alias_types: false  # true by default

Run these commands:

dbt build --target-path dbt-artifacts

v1.8

{{
    config(
        materialized='incremental',
        on_schema_change='fail',
    )
}}

select 2 as id

version: 2

models:
  - name: my_model
    config:
      contract:
        enforced: false
        alias_types: false  # true by default

Run these commands:

dbt build --threads 8 --exclude resource_type:seed resource_type:test --select @state:new @state:modified.body @state:modified.configs @state:modified.macros --state ./dbt-artifacts --vars '{"full_data": false}'

Here's the output that I got:

(dbt_1.4) $ dbt build --threads 8 --exclude resource_type:seed resource_type:test --select @state:new @state:modified.body @state:modified.configs @state:modified.macros --state ./dbt-artifacts --vars '{"full_data": false}'

04:27:52  Running with dbt=1.4.9
04:27:52  Unable to do partial parsing because of a version mismatch
04:27:53  Found 1 model, 0 tests, 0 snapshots, 0 analyses, 291 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
04:27:53  The selection criterion '@state:new' does not match any nodes
04:27:53  The selection criterion '@state:modified.macros' does not match any nodes
04:27:53  The selection criterion 'resource_type:seed' does not match any nodes
04:27:53  The selection criterion 'resource_type:test' does not match any nodes
04:27:53  
04:27:54  Concurrency: 8 threads (target='postgres')
04:27:54  
04:27:54  1 of 1 START sql incremental model dbt_dbeatty.my_model ........................ [RUN]
04:27:54  1 of 1 OK created sql incremental model dbt_dbeatty.my_model ................... [INSERT 0 1 in 0.21s]
04:27:54  
04:27:54  Finished running 1 incremental model in 0 hours 0 minutes and 1.22 seconds (1.22s).
04:27:54  
04:27:54  Completed successfully
04:27:54  
04:27:54  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1

Do you get an error if you run these? If not, are you able to tweak them to reproduce the error you are reporting? If you get an error, could you also share your log output?

7pandeys commented 1 month ago

is it possible to test for dbt-bigquery ?

Jenkins logs:

dbt build --threads 8 --exclude resource_type:seed resource_type:test --select @state:new @state:modified.body @state:modified.configs @state:modified.macros --state ./dbt-artifacts --vars {"full_data": true}
11:24:21 Running with dbt=1.8.5
11:24:22 [WARNING]: Deprecated functionality The tests config has been renamed to data_tests. Please see https://docs.getdbt.com/docs/build/data-tests#new-data_tests-syntax for more information.
11:24:22 Registered adapter: bigquery=1.8.2
11:24:23 Unable to do partial parsing because config vars, config profile, or config target have changed
11:24:36 Encountered an error: Field "nodes" of type Mapping[str, Union[Seed, Analysis, SingularTest, HookNode, Model, SqlOperation, GenericTest, Snapshot]] in WritableManifest has invalid value

dbeatty10 commented 1 month ago

is it possible to test for dbt-bigquery ?

@7pandeys I changed this example to work with the dbt-bigquery adapter, and it did reproduce the error you saw. See below for details.

v1.4

models/dual.sql

select 1 as dummy

models/my_model.sql

{{
    config(
        materialized='incremental',
        unique_key='id',
    )
}}

select 1 as id, 1 as my_number
from {{ ref("dual") }}

models/_models.yml

version: 2

models:
  - name: my_model
    config:
      contract:
        enforced: true
        alias_types: false  # true by default

Run these commands:

dbt build --target-path dbt-artifacts

v1.8

models/my_model.sql

{{
    config(
        materialized='incremental',
        unique_key='id',
        on_schema_change='fail',
    )
}}

select 2 as id, 2 as my_number
from {{ ref("dual") }}

models/_models.yml

version: 2

models:
  - name: my_model
    config:
      contract:
        enforced: false
        alias_types: false  # true by default

Run these commands:

dbt build --threads 8 --exclude resource_type:seed resource_type:test --select @state:new @state:modified.body @state:modified.configs @state:modified.macros --state ./dbt-artifacts --vars '{"full_data": false}'

Here's the output that I got:

(dbt_1.8) $ dbt build --threads 8 --exclude resource_type:seed resource_type:test --select @state:new @state:modified.body @state:modified.configs @state:modified.macros --state ./dbt-artifacts --vars '{"full_data": false}'
18:57:46  Running with dbt=1.8.3
18:58:10  Registered adapter: bigquery=1.8.2
18:58:11  Encountered an error:
Field "nodes" of type Mapping[str, Union[Seed, Analysis, SingularTest, HookNode, Model, SqlOperation, GenericTest, Snapshot]] in WritableManifest has invalid value {'model.my_project.my_model': {'database': 'dbt-test-env', 'schema': 'dbt_dbeatty', 'name': 'my_model', 'resource_type': 'model', 'package_name': 'my_project', 'path': 'my_model.sql', 'original_file_path': 'models/my_model.sql', 'unique_id': 'model.my_project.my_model', 'fqn': ['my_project', 'my_model'], 'alias': 'my_model', 'checksum': {'name': 'sha256', 'checksum': '50e02ae0751177fd4c789314d2a34e34118cdd04021decaffd5d2f7126096f8f'}, 'config': {'enabled': True, 'alias': None, 'schema': None, 'database': None, 'tags': [], 'meta': {}, 'materialized': 'incremental', 'incremental_strategy': None, 'persist_docs': {}, 'quoting': {}, 'column_types': {}, 'full_refresh': None, 'unique_key': 'id', 'on_schema_change': 'ignore', 'grants': {}, 'packages': [], 'docs': {'show': True, 'node_color': None}, 'contract': {'enforced': True, 'alias_types': False}, 'post-hook': [], 'pre-hook': []}, 'tags': [], 'description': '', 'columns': {}, 'meta': {}, 'docs': {'show': True, 'node_color': None}, 'patch_path': 'my_project://models/_models.yml', 'build_path': 'dbt-artifacts/run/my_project/models/my_model.sql', 'deferred': False, 'unrendered_config': {'contract': {'enforced': True, 'alias_types': False}, 'materialized': 'incremental', 'unique_key': 'id'}, 'created_at': 1723661777.4237618, 'relation_name': '`dbt-test-env`.`dbt_dbeatty`.`my_model`', 'raw_code': '{{\n    config(\n        materialized=\'incremental\',\n        unique_key=\'id\',\n    )\n}}\n\nselect 1 as id, 1 as my_number\nfrom {{ ref("dual") }}', 'language': 'sql', 'refs': [{'package': None, 'name': 'dual', 'version': None}], 'sources': [], 'metrics': [], 'depends_on': {'macros': [], 'nodes': ['model.my_project.dual']}, 'compiled_path': 'dbt-artifacts/compiled/my_project/models/my_model.sql', 'compiled': True, 'compiled_code': '\n\nselect 1 as id, 1 as 
my_number\nfrom `dbt-test-env`.`dbt_dbeatty`.`dual`', 'extra_ctes_injected': True, 'extra_ctes': []}, 'model.my_project.dual': {'database': 'dbt-test-env', 'schema': 'dbt_dbeatty', 'name': 'dual', 'resource_type': 'model', 'package_name': 'my_project', 'path': 'dual.sql', 'original_file_path': 'models/dual.sql', 'unique_id': 'model.my_project.dual', 'fqn': ['my_project', 'dual'], 'alias': 'dual', 'checksum': {'name': 'sha256', 'checksum': 'b2f9b44c290052deae372a348c684e745f3a6d9101b63cd8bd494801fc3935b0'}, 'config': {'enabled': True, 'alias': None, 'schema': None, 'database': None, 'tags': [], 'meta': {}, 'materialized': 'view', 'incremental_strategy': None, 'persist_docs': {}, 'quoting': {}, 'column_types': {}, 'full_refresh': None, 'unique_key': None, 'on_schema_change': 'ignore', 'grants': {}, 'packages': [], 'docs': {'show': True, 'node_color': None}, 'post-hook': [], 'pre-hook': []}, 'tags': [], 'description': '', 'columns': {}, 'meta': {}, 'docs': {'show': True, 'node_color': None}, 'patch_path': None, 'build_path': 'dbt-artifacts/run/my_project/models/dual.sql', 'deferred': False, 'unrendered_config': {}, 'created_at': 1723661777.38413, 'relation_name': '`dbt-test-env`.`dbt_dbeatty`.`dual`', 'raw_code': 'select 1 as dummy', 'language': 'sql', 'refs': [], 'sources': [], 'metrics': [], 'depends_on': {'macros': [], 'nodes': []}, 'compiled_path': 'dbt-artifacts/compiled/my_project/models/dual.sql', 'compiled': True, 'compiled_code': 'select 1 as dummy', 'extra_ctes_injected': True, 'extra_ctes': []}}
18:58:11  Traceback (most recent call last):
  File "<string>", line 16, in __mashumaro_from_dict__
  File "<string>", line 16, in <dictcomp>
  File "<string>", line 27, in __unpack_union_WritableManifest_nodes__d899146e00c44efcaa746ad0f9e68d10
mashumaro.exceptions.InvalidFieldValue: Field "nodes" of type Union[Seed, Analysis, SingularTest, HookNode, Model, SqlOperation, GenericTest, Snapshot] in WritableManifest has invalid value {'database': 'dbt-test-env', 'schema': 'dbt_dbeatty', 'name': 'my_model', 'resource_type': 'model', 'package_name': 'my_project', 'path': 'my_model.sql', 'original_file_path': 'models/my_model.sql', 'unique_id': 'model.my_project.my_model', 'fqn': ['my_project', 'my_model'], 'alias': 'my_model', 'checksum': {'name': 'sha256', 'checksum': '50e02ae0751177fd4c789314d2a34e34118cdd04021decaffd5d2f7126096f8f'}, 'config': {'enabled': True, 'alias': None, 'schema': None, 'database': None, 'tags': [], 'meta': {}, 'materialized': 'incremental', 'incremental_strategy': None, 'persist_docs': {}, 'quoting': {}, 'column_types': {}, 'full_refresh': None, 'unique_key': 'id', 'on_schema_change': 'ignore', 'grants': {}, 'packages': [], 'docs': {'show': True, 'node_color': None}, 'contract': {'enforced': True, 'alias_types': False}, 'post-hook': [], 'pre-hook': []}, 'tags': [], 'description': '', 'columns': {}, 'meta': {}, 'docs': {'show': True, 'node_color': None}, 'patch_path': 'my_project://models/_models.yml', 'build_path': 'dbt-artifacts/run/my_project/models/my_model.sql', 'deferred': False, 'unrendered_config': {'contract': {'enforced': True, 'alias_types': False}, 'materialized': 'incremental', 'unique_key': 'id'}, 'created_at': 1723661777.4237618, 'relation_name': '`dbt-test-env`.`dbt_dbeatty`.`my_model`', 'raw_code': '{{\n    config(\n        materialized=\'incremental\',\n        unique_key=\'id\',\n    )\n}}\n\nselect 1 as id, 1 as my_number\nfrom {{ ref("dual") }}', 'language': 'sql', 'refs': [{'package': None, 'name': 'dual', 'version': None}], 'sources': [], 'metrics': [], 'depends_on': {'macros': [], 'nodes': ['model.my_project.dual']}, 'compiled_path': 'dbt-artifacts/compiled/my_project/models/my_model.sql', 'compiled': True, 'compiled_code': '\n\nselect 1 as id, 1 as 
my_number\nfrom `dbt-test-env`.`dbt_dbeatty`.`dual`', 'extra_ctes_injected': True, 'extra_ctes': []}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/dbeatty/projects/environments/dbt_1.8/lib/python3.10/site-packages/dbt/cli/requires.py", line 138, in wrapper
    result, success = func(*args, **kwargs)
  File "/Users/dbeatty/projects/environments/dbt_1.8/lib/python3.10/site-packages/dbt/cli/requires.py", line 101, in wrapper
    return func(*args, **kwargs)
  File "/Users/dbeatty/projects/environments/dbt_1.8/lib/python3.10/site-packages/dbt/cli/requires.py", line 218, in wrapper
    return func(*args, **kwargs)
  File "/Users/dbeatty/projects/environments/dbt_1.8/lib/python3.10/site-packages/dbt/cli/requires.py", line 247, in wrapper
    return func(*args, **kwargs)
  File "/Users/dbeatty/projects/environments/dbt_1.8/lib/python3.10/site-packages/dbt/cli/requires.py", line 294, in wrapper
    return func(*args, **kwargs)
  File "/Users/dbeatty/projects/environments/dbt_1.8/lib/python3.10/site-packages/dbt/cli/requires.py", line 332, in wrapper
    return func(*args, **kwargs)
  File "/Users/dbeatty/projects/environments/dbt_1.8/lib/python3.10/site-packages/dbt/cli/main.py", line 199, in build
    task = BuildTask(
  File "/Users/dbeatty/projects/environments/dbt_1.8/lib/python3.10/site-packages/dbt/task/build.py", line 81, in __init__
    super().__init__(args, config, manifest)
  File "/Users/dbeatty/projects/environments/dbt_1.8/lib/python3.10/site-packages/dbt/task/run.py", line 312, in __init__
    super().__init__(args, config, manifest)
  File "/Users/dbeatty/projects/environments/dbt_1.8/lib/python3.10/site-packages/dbt/task/runnable.py", line 85, in __init__
    self.previous_state = PreviousState(
  File "/Users/dbeatty/projects/environments/dbt_1.8/lib/python3.10/site-packages/dbt/contracts/state.py", line 40, in __init__
    writable_manifest = WritableManifest.read_and_check_versions(str(manifest_path))
  File "/Users/dbeatty/projects/environments/dbt_1.8/lib/python3.10/site-packages/dbt/artifacts/schemas/base.py", line 135, in read_and_check_versions
    return cls.upgrade_schema_version(data)
  File "/Users/dbeatty/projects/environments/dbt_1.8/lib/python3.10/site-packages/dbt/artifacts/schemas/manifest/v12/manifest.py", line 183, in upgrade_schema_version
    return cls.from_dict(data)
  File "<string>", line 4, in __mashumaro_from_dict__
  File "<string>", line 18, in __mashumaro_from_dict__
mashumaro.exceptions.InvalidFieldValue: Field "nodes" of type Mapping[str, Union[Seed, Analysis, SingularTest, HookNode, Model, SqlOperation, GenericTest, Snapshot]] in WritableManifest has invalid value {'model.my_project.my_model': {'database': 'dbt-test-env', 'schema': 'dbt_dbeatty', 'name': 'my_model', 'resource_type': 'model', 'package_name': 'my_project', 'path': 'my_model.sql', 'original_file_path': 'models/my_model.sql', 'unique_id': 'model.my_project.my_model', 'fqn': ['my_project', 'my_model'], 'alias': 'my_model', 'checksum': {'name': 'sha256', 'checksum': '50e02ae0751177fd4c789314d2a34e34118cdd04021decaffd5d2f7126096f8f'}, 'config': {'enabled': True, 'alias': None, 'schema': None, 'database': None, 'tags': [], 'meta': {}, 'materialized': 'incremental', 'incremental_strategy': None, 'persist_docs': {}, 'quoting': {}, 'column_types': {}, 'full_refresh': None, 'unique_key': 'id', 'on_schema_change': 'ignore', 'grants': {}, 'packages': [], 'docs': {'show': True, 'node_color': None}, 'contract': {'enforced': True, 'alias_types': False}, 'post-hook': [], 'pre-hook': []}, 'tags': [], 'description': '', 'columns': {}, 'meta': {}, 'docs': {'show': True, 'node_color': None}, 'patch_path': 'my_project://models/_models.yml', 'build_path': 'dbt-artifacts/run/my_project/models/my_model.sql', 'deferred': False, 'unrendered_config': {'contract': {'enforced': True, 'alias_types': False}, 'materialized': 'incremental', 'unique_key': 'id'}, 'created_at': 1723661777.4237618, 'relation_name': '`dbt-test-env`.`dbt_dbeatty`.`my_model`', 'raw_code': '{{\n    config(\n        materialized=\'incremental\',\n        unique_key=\'id\',\n    )\n}}\n\nselect 1 as id, 1 as my_number\nfrom {{ ref("dual") }}', 'language': 'sql', 'refs': [{'package': None, 'name': 'dual', 'version': None}], 'sources': [], 'metrics': [], 'depends_on': {'macros': [], 'nodes': ['model.my_project.dual']}, 'compiled_path': 'dbt-artifacts/compiled/my_project/models/my_model.sql', 'compiled': True, 
'compiled_code': '\n\nselect 1 as id, 1 as my_number\nfrom `dbt-test-env`.`dbt_dbeatty`.`dual`', 'extra_ctes_injected': True, 'extra_ctes': []}, 'model.my_project.dual': {'database': 'dbt-test-env', 'schema': 'dbt_dbeatty', 'name': 'dual', 'resource_type': 'model', 'package_name': 'my_project', 'path': 'dual.sql', 'original_file_path': 'models/dual.sql', 'unique_id': 'model.my_project.dual', 'fqn': ['my_project', 'dual'], 'alias': 'dual', 'checksum': {'name': 'sha256', 'checksum': 'b2f9b44c290052deae372a348c684e745f3a6d9101b63cd8bd494801fc3935b0'}, 'config': {'enabled': True, 'alias': None, 'schema': None, 'database': None, 'tags': [], 'meta': {}, 'materialized': 'view', 'incremental_strategy': None, 'persist_docs': {}, 'quoting': {}, 'column_types': {}, 'full_refresh': None, 'unique_key': None, 'on_schema_change': 'ignore', 'grants': {}, 'packages': [], 'docs': {'show': True, 'node_color': None}, 'post-hook': [], 'pre-hook': []}, 'tags': [], 'description': '', 'columns': {}, 'meta': {}, 'docs': {'show': True, 'node_color': None}, 'patch_path': None, 'build_path': 'dbt-artifacts/run/my_project/models/dual.sql', 'deferred': False, 'unrendered_config': {}, 'created_at': 1723661777.38413, 'relation_name': '`dbt-test-env`.`dbt_dbeatty`.`dual`', 'raw_code': 'select 1 as dummy', 'language': 'sql', 'refs': [], 'sources': [], 'metrics': [], 'depends_on': {'macros': [], 'nodes': []}, 'compiled_path': 'dbt-artifacts/compiled/my_project/models/dual.sql', 'compiled': True, 'compiled_code': 'select 1 as dummy', 'extra_ctes_injected': True, 'extra_ctes': []}}

dbeatty10 commented 1 month ago

@7pandeys I don't know all the details leading to this error message, but it looks to me that a big reason is that dbt model contracts weren't introduced until dbt v1.5.

So I'd suggest removing all the contract-related config while you are running dbt v1.4 and then regenerating your dbt-artifacts folder with that version. This should allow you to upgrade without seeing this particular error.
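
To see which models still carry contract config in the saved state, something like the following could be run against the manifest before upgrading. It is a minimal sketch, not part of dbt: the helper names are hypothetical, and it assumes only the manifest.json layout visible in the traceback above (a "nodes" mapping whose entries have "config" and "unrendered_config" dicts) plus the "metadata.dbt_version" field.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def nodes_with_contract_config(manifest: Dict[str, Any]) -> List[str]:
    """Return unique_ids of nodes whose config or unrendered_config has a 'contract' key."""
    flagged = []
    for unique_id, node in manifest.get("nodes", {}).items():
        config = node.get("config") or {}
        unrendered = node.get("unrendered_config") or {}
        if "contract" in config or "contract" in unrendered:
            flagged.append(unique_id)
    return sorted(flagged)


def check_state_manifest(path: Path) -> List[str]:
    """Load a manifest.json file and report which nodes carry contract config."""
    manifest = json.loads(path.read_text())
    version = manifest.get("metadata", {}).get("dbt_version", "unknown")
    flagged = nodes_with_contract_config(manifest)
    print(f"manifest written by dbt {version}: {len(flagged)} node(s) with contract config")
    return flagged


# Trimmed-down sample shaped like the node dicts in the traceback above.
sample = {
    "metadata": {"dbt_version": "1.4.9"},
    "nodes": {
        "model.my_project.my_model": {
            "config": {
                "materialized": "incremental",
                "contract": {"enforced": True, "alias_types": False},
            },
            "unrendered_config": {
                "contract": {"enforced": True, "alias_types": False},
            },
        },
        "model.my_project.dual": {
            "config": {"materialized": "view"},
            "unrendered_config": {},
        },
    },
}

print(nodes_with_contract_config(sample))  # ['model.my_project.my_model']
```

For the project in this thread, the call would be check_state_manifest(Path("dbt-artifacts/manifest.json")). An empty result after regenerating the artifacts would suggest the contract config is gone from the saved state.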

Since model contracts weren't introduced until v1.5 and there's a path forward for you to resolve this, I'm going to close this as "not planned".