maksym-dibrov-solidgate closed this issue 4 months ago.
hi @maksym-dibrov-solidgate thank you for writing this up. I agree that there's something seriously wonky about how dbt-redshift handles transactions after version 1.5.
The predominant reason we cannot support user-defined transactions is a limitation of the Redshift driver; see https://github.com/aws/amazon-redshift-python-driver/issues/162. Ultimately, I feel that until that limitation is addressed, we cannot confidently address transactions in dbt without significant additional investment.
In the meantime, I'd be very interested to know more about your project: how is it configured such that it fails after one second?
More importantly, can you help me create a minimal reproducible example of the behavior? That is, starting with vanilla jaffle_shop, what's the least I'd have to modify to reproduce the behavior you are seeing?
Last, I'm curious whether the behavior is different if you specify `autocommit: False` in your profiles.yml (more info on autocommit).
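For anyone following along, a minimal sketch of where that setting lives. The profile name and connection details below are placeholders, not taken from this issue; only the `autocommit` line is the setting being asked about:

```yaml
# profiles.yml -- placeholder profile; only `autocommit` is the suggested change
my_redshift_profile:
  target: dev
  outputs:
    dev:
      type: redshift
      host: my-cluster.example.us-east-1.redshift.amazonaws.com
      port: 5439
      user: my_user
      password: "{{ env_var('REDSHIFT_PASSWORD') }}"
      dbname: analytics
      schema: dbt
      threads: 4
      autocommit: False  # the dbt-redshift connection setting @dataders suggests testing
```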
@dataders thank you for replying to this issue! I will try to reproduce it on vanilla jaffle_shop and come back to this thread when I succeed.
We see in the Slack thread linked here that this is fixed in 1.7.7; closing. :D
@martynydbt the thread says it is fixed in 1.7.5. I updated to 1.7.5 but it still throws the same error, so these two issues should be unrelated.
Tried with `dbt-redshift==1.7.7` - still the same error.
@maksym-dibrov-solidgate (on Slack): tried with 1.4.9 - still the same issue. The materialization style is `table`, and I get `OID does not exist` errors when trying to read the table during a dbt run.
I figured out what the issue was and fixed it. Here is what happened: we had two separate processes that both threw the same `OID does not exist` error. These processes were (1) two dbt runs executing concurrently in separate sessions, and (2) a run that rebuilds models by dropping and recreating them.
In the second case, it looks like dbt renames and drops tables in two separate transactions when called with the `--full-refresh` flag, or when the model's materialization implies that it is dropped on each run (e.g. `materialized` in (`view`, `table`)).
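For concreteness, a small illustrative config of the kind this describes; the project and folder names are placeholders:

```yaml
# dbt_project.yml -- `view` and `table` materializations are rebuilt on
# every `dbt run`, so the previous relation is eventually dropped
models:
  our_dbt_project:
    marts:
      +materialized: table
    staging:
      +materialized: view
```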
In the first case, we started two dbt runs in two separate bash sessions: `run_A` was materializing `table_A` in `schema_for_tables`, and `run_B` was also materializing `table_A` in `schema_for_tables`. Both runs additionally executed an upstream dependency that was dropping `view_A` and `view_B` in `schema_for_views`, respectively. At some moment, `run_A` dropped `view_A` to rebuild it from scratch, but the `list_relations_without_caching` macro executed by `run_B` expected `view_A` to still exist in `schema_for_views`. Hence, it threw an error.
To me, this looks like an issue with transactionality, and since dbt determines the transactionality of its operations, it seems fixable without intervention from the AWS Redshift developers.
Thanks for clarifying @maksym-dibrov-solidgate! Discussed this with the team: this behavior is due to dbt not supporting parallel runs. Specifically, we're limited by each invocation only knowing about the relations it builds itself (`table_A` and `view_A` here belong to different runs). Currently dbt-redshift doesn't "know" about `view_A` when it's executing `table_A`, so it's not possible for us to handle this at this time. Unfortunately, this just isn't functionality that we can support, and we have no plans to do so for now.
Thanks @colin-rogers-dbt! One additional detail: we strongly encourage folks to use late-binding views on Redshift, which should avoid this issue of concurrent cascade-dropping (`table_A` -> `view_A`): https://docs.getdbt.com/reference/resource-configs/redshift-configs#late-binding-views
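A minimal sketch of that config, per the linked docs; the project name is a placeholder:

```yaml
# dbt_project.yml -- build all views in this project as late-binding,
# i.e. CREATE VIEW ... WITH NO SCHEMA BINDING on Redshift
models:
  my_project:
    +bind: false
```

It can also be set per model with `{{ config(materialized='view', bind=False) }}`.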
This is a problem in 1.8.4. I'm not sure if this was closed because a spike was opened elsewhere or due to misinformation, but this is still a problem. Same query as the one listed at the start of this issue, and the same error: `could not open relation with OID 42762712`.
Here is one of my errors from today:
/ {"app": "dbt", "dbt_version": "1.8.4", "profile_name": "dev", "target_name": "env", "connection_name": "list_db_schema"} /\nselect\n table_catalog as database,\n table_name as name,\n table_schema as schema,\n 'table' as type\n from information_schema.tables\n where table_schema ilike '
It is ALWAYS the same type of query: `SELECT ... FROM information_schema.tables ...`. We are seeing this a lot; we run in Airflow with a bash operator starting `dbt run ... --select model`.
I actually think dbt-redshift knew about this error in 1.6 but for some reason did not fix it: https://github.com/dbt-labs/dbt-redshift/pull/569/files#diff-4ab8ac7de10abb4128d1534093662382aaeb4e8a99ed2f3f880a4d9720717d00. That test is a reproduction of this issue. Maybe there is a way I, as a dbt user, can just add a bunch of retry logic to the information_schema queries until redshift-connector is fixed?
I think this is related to locking/transaction concurrency. This never occurs in our pre-prod environment, which is scheduled ad hoc with less concurrency. In Airflow, if I want to reduce this error on `dbt-redshift==1.8.4`, I need to reduce the concurrent tasks and reduce the threads in the dbt profile. Once I make those changes the error occurs less frequently intra-DAG (but I still have multiple DAGs running at any one time).
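For reference, a sketch of the thread reduction described above; connection details are elided, and the profile/target names simply mirror the log snippet earlier in the thread:

```yaml
# profiles.yml -- fewer threads means fewer concurrent model builds per
# invocation, and therefore fewer overlapping information_schema queries
dev:
  target: env
  outputs:
    env:
      type: redshift
      # ...connection settings elided...
      threads: 1  # reduced to limit intra-run concurrency, trading runtime for stability
```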
I can see why late-binding views are an option, since that would reduce the information_schema queries performed by dbt, thus avoiding this concurrency/locking issue.
But I am using this in production today and receiving 100 alerts a day due to this behavior. Is another solution to pin `dbt-redshift==1.4.0`, the last version prior to adding redshift-connector? This conversation seems to suggest that fixing dbt-redshift is blocked on redshift-connector fixing some things first.
Update: I was just notified that there is a forthcoming Redshift patch (P184), to be applied in Q3 2024, to fix `could not open OID` errors.
Curious: for those who are still seeing this issue, have you opened an issue with AWS? That is the best way for us (dbt Labs) to advocate to the Redshift team that we need fixes for the Redshift driver.
@amychen1776 I haven't opened an issue with AWS. If you open one, I will be happy to give it a thumbs up.
@maksym-dibrov-solidgate I recommend going through AWS Support and opening your own ticket (support tickets are private, so you wouldn't be able to thumbs-up mine). In working with the Redshift team, that is how they have recommended flagging issues like this.
Is this a regression in a recent version of dbt-core?
Current Behavior
After updating dbt 1.1.0 -> 1.7.3, we started receiving errors either when running `dbt run` or when reading tables at the moment they are being fully refreshed by dbt. This behaviour never appeared on dbt==1.1.0, so the issue seems to be with the newer version of dbt and not the DB itself. The error is `could not open relation with OID ...`.
Expected/Previous Behavior
The issue never appeared on dbt==1.1.0, and we were able to read tables that are being fully refreshed without problems.
Steps To Reproduce
1) ENV: Pipfile (dbt-packages)

2) CONFIG (dbt_project.yml):

```yaml
models:
  our_dbt_project:
    +enabled: true
    +full_refresh: false
    +meta:
      alerts_config:
        alert_fields: ["table", "description", "result_message", "test_parameters", "test_query", "test_results_sample"]
```

3) RUN: `pipenv run dbt run --debug --profiles-dir our_profile --project-dir our_project --models +tag:our_tag`
Relevant log output
Environment
Which database adapter are you using with dbt?
redshift
Additional Context
Started a thread on this issue in the dbt Community Slack: https://getdbt.slack.com/archives/CJARVS0RY/p1712935236552749