dbt-labs / dbt-redshift

dbt-redshift contains all of the code enabling dbt to work with Amazon Redshift
https://getdbt.com
Apache License 2.0
95 stars 58 forks source link

[Bug] dbt build fails when flag restrict_direct_pg_catalog_access set True #914

Open dave-tomkinson opened 1 week ago

dave-tomkinson commented 1 week ago

Is this a new bug in dbt-redshift?

Current Behavior

When I switch dbt cloud to versionless and I build with no flags set then the build works, but with Warnings to set the flag restrict_direct_pg_catalog_access

When I set the flag restrict_direct_pg_catalog_access to True the build fails, with a number of tables (probably around 10%) Failing

flags:
  restrict_direct_pg_catalog_access: True

Mostly with one of these Errors:

Redshift adapter: Redshift error: column "id" of relation "<RELATION_NAME>" already exists
Redshift adapter: Redshift error: syntax error at or near ")" in context 

Setting it to False or removing the flag means the build works again.

Expected Behavior

The build to succeed with no Warnings.

Steps To Reproduce

In dbt cloud with the project version set to versionless Set the following flag in dbt_project.yml

flags:
  restrict_direct_pg_catalog_access: True

Run dbt build

Relevant log output

No response

Environment

- OS: dbt cloud
- Python: dbt cloud
- dbt-core: versionless - 'Running latest version'
- dbt-redshift: redshift=1.9.0-post5+04bd2c07f526e226c80f6527e8b0d49033bbd639

Additional Context

image In the above

  1. Added the flag set to True in the dbt_project.yml
  2. Removed the flag completely
  3. Added the flag back in set to False (to check it wasn't due to mis-typed config)
  4. Change the flag back to True... you know just to check again.
mikealfare commented 2 days ago

@dave-tomkinson Thanks for the bug report. Would you be able to provide a model with column names and configs for one of the models that fails? You can redact the logic that goes into each column if you prefer. I need to know things like whether the columns are quoted or not, whether they are lowercase or uppercase, etc. I suspect that the SDK returns columns that do not match the case, or which are/aren't quoted as expected. This could cause dbt to attempt to add a column that's already there because of a scenario like "id" <> "Id". I'll setup a skeleton of a reproduction in the meantime, but nothing beats real data. Thanks in advance!

One additional note, I'm making the assumption that this is an incremental model with on_schema_change set to either append_new_columns or sync_all_columns, hence the "column already exists" error. Please let me know if that's incorrect.