dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

[Bug] Seed properties .yml files compile twice and clash when seed directory is under model directory #10064

Open mbarnathan-os opened 6 months ago

mbarnathan-os commented 6 months ago

Is this a new bug in dbt-core?

Current Behavior

When seed properties are specified in a .yml file and the seed-paths directory sits under the model-paths directory, the .yml file is compiled twice and dbt errors out with "dbt found two schema.yml entries for the same resource". Both entries point at the same file, so there is no actual duplication. This doesn't happen if the seed is defined solely by its CSV file, i.e. when the .yml file is absent.
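
To illustrate why the same file could be picked up twice, here is a minimal sketch. It is not dbt's actual parser code: `find_yaml_files` is a hypothetical helper, and the assumption is simply that each configured path is scanned recursively for `.yml` files. The file names are illustrative. Under that assumption, a properties file in `models/seeds` is reachable from both `model-paths` and `seed-paths`.

```python
import tempfile
from pathlib import Path


def find_yaml_files(project_root: Path, search_paths: list[str]) -> list[Path]:
    """Hypothetical helper: recursively collect .yml files under each search path."""
    found = []
    for rel in search_paths:
        found.extend(sorted((project_root / rel).rglob("*.yml")))
    return found


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Lay out a project where the seed directory is nested under the model directory.
    seeds_dir = root / "models" / "seeds"
    seeds_dir.mkdir(parents=True)
    (seeds_dir / "my_seed.csv").write_text("id\n1\n")
    (seeds_dir / "_seeds.yml").write_text("seeds:\n  - name: my_seed\n")

    # Scan once per configured root, mirroring model-paths and seed-paths.
    from_models = find_yaml_files(root, ["models"])
    from_seeds = find_yaml_files(root, ["models/seeds"])

    # Files that both scans return: candidates for being processed twice.
    overlap = sorted(
        p.relative_to(root).as_posix() for p in set(from_models) & set(from_seeds)
    )
    print(overlap)  # ['models/seeds/_seeds.yml']
```

If both scans feed the same properties parser without de-duplicating by path, one file would yield two entries for `my_seed`, which matches the error message above.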

Expected Behavior

Nesting seeds under the model directory is a supported use case per the docs, so I would expect the model parsing phase to detect the duplication and not process the seed a second time.

Steps To Reproduce

  1. model-paths: ["models"]
  2. seed-paths: ["models/seeds"]
  3. Add a .yml file in the seed path with at least one seed
  4. Add a csv file for the seed
  5. dbt parse yields "dbt found two schema.yml entries for the same resource" with the seed yml from step 3

Relevant log output

No response

Environment

- OS:
- Python:
- dbt:

Which database adapter are you using with dbt?

snowflake

Additional Context

No response

dbeatty10 commented 6 months ago

Thanks for reporting this @mbarnathan-os 👍

I was able to reproduce what you described. See details below.

### Reprex

`dbt_project.yml`

```yaml
name: "my_project"
version: "1.0.0"
config-version: 2
profile: "some_profile"
model-paths: ["models"]
seed-paths: ["models/seeds"]
```

```shell
mkdir -p models
mkdir -p models/seeds
```

Create a seed file:

```shell
cat << EOF > models/seeds/my_seed.csv
id
1
EOF
```

Add an associated YAML file:

```shell
cat << EOF > models/seeds/_seeds.yml
seeds:
  - name: my_seed
EOF
```

See that it never works when partial parsing is **disabled** (but may work in certain situations when it is enabled):

```shell
dbt parse --no-partial-parse
dbt list --no-partial-parse
```

#### Side note

For me, I did an initial `dbt parse` prior to adding the YAML file, and dbt commands worked as long as partial parsing was enabled and I didn't `dbt clean` my partial parsing artifacts.

github-actions[bot] commented 6 days ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.