dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

[CT-2963] [Bug] CI breaking when a group is changed #8371

Closed graciegoheen closed 11 months ago

graciegoheen commented 1 year ago

Is this a new bug in dbt-core?

Current Behavior

Originally from a slack message from @matt-winkler

I'm picking up on what may be an issue with CI/CD when adjusting model access groups. In my case, I changed a group formerly known as hub to customer360, and the CI test job gave me this Traceback:

  File "/venv/dbt-1.6.0-pre/lib/python3.8/site-packages/dbt/cli/main.py", line 207, in build
    results = task.run()
  File "/venv/dbt-1.6.0-pre/lib/python3.8/site-packages/dbt/task/runnable.py", line 468, in run
    result = self.execute_with_hooks(selected_uids)
  File "/venv/dbt-1.6.0-pre/lib/python3.8/site-packages/dbt/task/runnable.py", line 428, in execute_with_hooks
    self.before_run(adapter, selected_uids)
  File "/venv/dbt-1.6.0-pre/lib/python3.8/site-packages/dbt/task/run.py", line 448, in before_run
    self.defer_to_manifest(adapter, selected_uids)
  File "/venv/dbt-1.6.0-pre/lib/python3.8/site-packages/dbt/task/compile.py", line 122, in defer_to_manifest
    write_manifest(self.manifest, self.config.project_target_path)
  File "/venv/dbt-1.6.0-pre/lib/python3.8/site-packages/dbt/parser/manifest.py", line 1682, in write_manifest
    manifest.write(path)
  File "/venv/dbt-1.6.0-pre/lib/python3.8/site-packages/dbt/contracts/graph/manifest.py", line 960, in write
    self.writable_manifest().write(path)
  File "/venv/dbt-1.6.0-pre/lib/python3.8/site-packages/dbt/contracts/graph/manifest.py", line 941, in writable_manifest
    self.build_group_map()
  File "/venv/dbt-1.6.0-pre/lib/python3.8/site-packages/dbt/contracts/graph/manifest.py", line 936, in build_group_map
    group_map[node.group].append(node.unique_id)
KeyError: 'hub'

There are two parts of this issue:

Expected Behavior

CI should be able to handle a change to the name of a group.

Steps To Reproduce

  1. run dbt compile on a project that has a group defined
  2. save the manifest somewhere
  3. change the group name everywhere it's referenced
  4. ALSO make a change to one of the models in that group (ex: add an order by 1 clause)
  5. run dbt build --select state:modified+ --defer --state path/to/earlier/manifest
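For anyone scripting the reproduction, the steps above could be driven from Python roughly as follows. This is only a sketch: the helper names `save_manifest` and `repro_commands` are made up for illustration, the manifest is assumed to be in dbt's default `target/manifest.json` location, and steps 3 and 4 (renaming the group and editing a model) still have to be done by hand between the two commands.

```python
import shutil
from pathlib import Path
from typing import List


def save_manifest(target_dir: str, state_dir: str) -> Path:
    """Step 2: copy the compiled manifest into a directory later passed to --state."""
    dest = Path(state_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Assumes the manifest is named manifest.json inside the target directory.
    return Path(shutil.copy(Path(target_dir) / "manifest.json", dest / "manifest.json"))


def repro_commands(state_dir: str) -> List[List[str]]:
    """Return the dbt invocations from steps 1 and 5 as argv lists."""
    return [
        ["dbt", "compile"],
        ["dbt", "build", "--select", "state:modified+", "--defer", "--state", state_dir],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the project directory, with `save_manifest("target", "path/to/earlier/manifest")` called after the first command.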

Relevant log output

No response

Environment

- OS:
- Python:
- dbt: 1.6.0

Which database adapter are you using with dbt?

Snowflake

Additional Context

Above was running in dbt Cloud. State comparison to a previous manifest should be the same though, I think.

Depending on the scope of the fix, we can decide if it makes sense to backport for inclusion in v1.6.x.

jtcohen6 commented 1 year ago

Thanks Grace & Matt - added to the milestone!

Depending on the scope of the fix required here, we can decide if it makes sense to backport for inclusion in v1.6.x

graciegoheen commented 1 year ago

Added the backport label for v1.5.x and v1.6.x since groups were first introduced in v1.5.x

grindheim commented 11 months ago

I'm not sure whether this is related (and thus whether I should create a new issue), but for the last few weeks we've been experiencing a similar issue when an exposure has been renamed or deleted: a KeyError is raised when comparing to the previous manifest. The workaround has been to delete the previous manifest (and the partial_parse.msgpack file, just in case).

The following error is from a run on September 14th:

12:08:41  Running with dbt=1.6.2
12:08:43  Registered adapter: databricks=1.6.3
12:10:20  Encountered an error:
'tine_bi://exposures/datasets/Mottak.yml'
12:10:20  Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/dbt/cli/requires.py", line 87, in wrapper
    result, success = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dbt/cli/requires.py", line 72, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dbt/cli/requires.py", line 143, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dbt/cli/requires.py", line 172, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dbt/cli/requires.py", line 219, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dbt/cli/requires.py", line 246, in wrapper
    manifest = ManifestLoader.get_full_manifest(
  File "/usr/local/lib/python3.10/dist-packages/dbt/parser/manifest.py", line 315, in get_full_manifest
    manifest = loader.load()
  File "/usr/local/lib/python3.10/dist-packages/dbt/parser/manifest.py", line 491, in load
    self.parse_project(
  File "/usr/local/lib/python3.10/dist-packages/dbt/parser/manifest.py", line 658, in parse_project
    block = FileBlock(self.manifest.files[file_id])
KeyError: 'tine_bi://exposures/datasets/Mottak.yml'

We got a similar error a week later, on September 22nd, this time related to a model:

13:54:21  Running with dbt=1.6.3
13:54:23  Registered adapter: databricks=1.6.4
13:56:19  Encountered an error:
'tine_bi://models/Domener/Produksjon/4_SRV/ProsesstyringMeieri/SRV_DimProsesstyringObjektlKode.yml'
13:56:19  Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/dbt/cli/requires.py", line 87, in wrapper
    result, success = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dbt/cli/requires.py", line 72, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dbt/cli/requires.py", line 143, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dbt/cli/requires.py", line 172, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dbt/cli/requires.py", line 219, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dbt/cli/requires.py", line 246, in wrapper
    manifest = ManifestLoader.get_full_manifest(
  File "/usr/local/lib/python3.10/dist-packages/dbt/parser/manifest.py", line 315, in get_full_manifest
    manifest = loader.load()
  File "/usr/local/lib/python3.10/dist-packages/dbt/parser/manifest.py", line 491, in load
    self.parse_project(
  File "/usr/local/lib/python3.10/dist-packages/dbt/parser/manifest.py", line 658, in parse_project
    block = FileBlock(self.manifest.files[file_id])
KeyError: 'tine_bi://models/Domener/Produksjon/4_SRV/ProsesstyringMeieri/SRV_DimProsesstyringObjektlKode.yml'

jtcohen6 commented 11 months ago

@grindheim That sounds like an error with partial parsing. Could I ask you to open a new issue for it, and include as many details / reproduction steps as possible?

graciegoheen commented 11 months ago

We're not able to reproduce this bug in dbt-core, so I'm closing it out - I've alerted our cloud team who is going to take a look :)

graciegoheen commented 11 months ago

Was able to finally reproduce!

The reason we weren’t able to reproduce this bug originally is that a group name change on its own doesn’t count as “modified” (@jtcohen6 is this expected behavior?), so nothing was selected or built when we ran dbt build --select state:modified after changing only the group name.

But! If we changed another model in that group (we just added an order by clause to one of the models in the group) so that it would be picked up in the selection --select state:modified then we get the error.
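To illustrate the failure mode, here is a minimal sketch of how the failing line from the original traceback, `group_map[node.group].append(node.unique_id)`, can raise `KeyError: 'hub'`. It is not the dbt-core implementation: the `Node` class, the function signature, and the node IDs are hypothetical, and it assumes that the map is seeded only from the groups defined in the current project while a node coming from the earlier manifest still carries the old group name.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    # Hypothetical stand-in for a manifest node; only the two attributes
    # that appear in the traceback are modeled.
    unique_id: str
    group: Optional[str]


def build_group_map(group_names: List[str], nodes: List[Node]) -> Dict[str, List[str]]:
    # Assumption: the map starts with one empty list per currently defined group.
    group_map: Dict[str, List[str]] = {name: [] for name in group_names}
    for node in nodes:
        if node.group is not None:
            # Same expression as the failing line in the traceback.
            group_map[node.group].append(node.unique_id)
    return group_map


current_groups = ["customer360"]  # the group after the rename
nodes = [
    Node("model.my_project.orders", "customer360"),  # parsed from the current project
    Node("model.my_project.customers", "hub"),       # assumed to come from the old manifest
]

try:
    build_group_map(current_groups, nodes)
except KeyError as exc:
    print(f"KeyError: {exc}")  # prints: KeyError: 'hub'
```

Under these assumptions, any lookup keyed on a group name that no longer exists fails in the same way, which matches the error only appearing once a model in the renamed group is actually selected.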