dbt-labs / dbt-spark

dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
https://getdbt.com
Apache License 2.0

Runtime Error from "show table extended in database like '*'" #1106

Open tinolyuu opened 5 days ago

tinolyuu commented 5 days ago

Is this a new bug in dbt-spark?

Current Behavior

I get the following error when running models. It happens when tables in the same database are deleted while dbt is running the show table extended in database like '*' query.

17:37:37.821971 [debug] [ThreadPool]: Spark adapter: Runtime Error
  [TABLE_OR_VIEW_NOT_FOUND] The table or view `database`.`table` cannot be found. Verify the spelling and correctness of the schema and catalog.
  If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
  To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS.

Expected Behavior

Retry the query a few more times before failing, but only when the error is TABLE_OR_VIEW_NOT_FOUND.
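A minimal sketch of that behavior, not tied to dbt-spark internals: `retry_on_table_not_found`, `max_attempts`, and `delay_seconds` are hypothetical names, and the sketch assumes the Spark error class appears verbatim in the exception message, as it does in the log above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: the error class name is present in the exception text.
NOT_FOUND_MARKER = "TABLE_OR_VIEW_NOT_FOUND"


def retry_on_table_not_found(
    fn: Callable[[], T],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> T:
    """Call fn, retrying only when the error message contains NOT_FOUND_MARKER."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            retryable = NOT_FOUND_MARKER in str(exc)
            # Re-raise immediately for unrelated errors or once attempts run out.
            if not retryable or attempt == max_attempts:
                raise
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")


# Usage: a stand-in for the metadata query that fails twice, then succeeds.
calls = {"count": 0}


def flaky_listing():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError(
            "[TABLE_OR_VIEW_NOT_FOUND] The table or view `database`.`table` cannot be found."
        )
    return ["table_a", "table_b"]


result = retry_on_table_not_found(flaky_listing, max_attempts=3)
```

Any error whose message lacks the marker is re-raised on the first failure, so unrelated problems still fail fast.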

Steps To Reproduce

Run a dbt model and, as soon as it logs show table extended in database like '*', delete a table from that database.

Relevant log output

No response

Environment

- OS: Ubuntu
- Python: 3.10.12
- dbt-core: 1.8.6
- dbt-spark: 1.8.0

Additional Context

In a previous version, we patched list_relations_without_caching to retry several times before failing the query.

attempts = n  # number of tries allowed
while True:
    attempts -= 1
    try:
        ...
    except dbt.exceptions.DbtRuntimeError as e:
        ...
    except Exception as e:
        logger.error(
            f"Error while retrieving information about {schema_relation}: {e}"
        )
        logger.error(f"Macro list_relations_without_caching failed. Remaining attempts: {attempts}")
        if attempts > 0:
            # Retry while attempts remain.
            continue
        elif any(msg in str(e) for msg in TABLE_OR_VIEW_NOT_FOUND_MESSAGES):
            # Out of attempts and the table is still missing: surface the error.
            raise
        else:
            # Any other error: treat the schema as having no relations.
            return []

This workaround no longer works, because later versions catch more exceptions as DbtRuntimeError. As a result, we can no longer catch the TABLE_OR_VIEW_NOT_FOUND error specifically.
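One possible way to still tell the cases apart is to inspect the message of the wrapped exception and of whatever it was raised from. The sketch below is illustrative only: `WrappedRuntimeError` is a hypothetical stand-in for the adapter's exception, `is_table_not_found` is a made-up helper, and it assumes the original Spark message survives either in the wrapper's text or in its `__cause__`/`__context__` chain, which has not been verified against dbt-spark.

```python
from typing import Optional

NOT_FOUND_MARKER = "TABLE_OR_VIEW_NOT_FOUND"


class WrappedRuntimeError(Exception):
    """Hypothetical stand-in for an adapter error that wraps a driver error."""


def is_table_not_found(exc: Optional[BaseException]) -> bool:
    """Return True if exc, or any exception in its chain, mentions the marker."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))  # guard against cyclic exception chains
        if NOT_FOUND_MARKER in str(exc):
            return True
        # Prefer the explicit cause ("raise ... from ..."), else the implicit context.
        exc = exc.__cause__ or exc.__context__
    return False


def wrap_driver_error():
    # Simulates a driver error being re-raised under a generic wrapper.
    try:
        raise ValueError(
            "[TABLE_OR_VIEW_NOT_FOUND] The table or view `database`.`table` cannot be found."
        )
    except ValueError as driver_error:
        raise WrappedRuntimeError("Runtime Error") from driver_error


try:
    wrap_driver_error()
except WrappedRuntimeError as err:
    found = is_table_not_found(err)
```

Matching on message text is brittle, so a check like this would need revisiting whenever the Spark error format changes.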

amychen1776 commented 4 days ago

@tinolyuu Would you be able to explain your use case in which you are dropping tables outside of dbt as dbt is executing its metadata queries? This seems like an anti-pattern for what we expect

tinolyuu commented 4 days ago

> @tinolyuu Would you be able to explain your use case in which you are dropping tables outside of dbt as dbt is executing its metadata queries? This seems like an anti-pattern for what we expect

We may have multiple jobs running at the same time, so one dbt process may run a drop table query while another is running show table extended in database like '*'.