dbt-labs / dbt-spark

dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
https://getdbt.com
Apache License 2.0
400 stars 227 forks

[ADAP-558] [Bug] PySpark session connection throws AnalysisException #781

Closed Fokko closed 1 year ago

Fokko commented 1 year ago

Is this a new bug in dbt-spark?

Current Behavior

With a session type connection (where Spark runs in the same Python process as dbt), a failing statement raises a raw PySpark AnalysisException. This isn't the case with a Thrift connection, and it breaks the exception-handling logic in dbt-spark, which does not catch it.

Expected Behavior

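When a SQL statement fails, the error should surface as a dbt exception that dbt-spark's error handling can catch, as it does with a Thrift connection, rather than as a raw AnalysisException.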
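
A minimal sketch of the kind of wrapping that would let dbt's error handling see a failed statement on a session connection. It does not import pyspark or dbt: AnalysisException and DbtRuntimeError are local stand-ins, and SessionCursor, run_sql and failing_sql are hypothetical names for illustration, not the adapter's actual code.

```python
class AnalysisException(Exception):
    """Local stand-in for pyspark.sql.utils.AnalysisException."""


class DbtRuntimeError(Exception):
    """Local stand-in for a dbt-side runtime error type."""


class SessionCursor:
    """Hypothetical cursor that runs SQL through an in-process callable."""

    def __init__(self, run_sql):
        # run_sql plays the role of spark_session.sql in this sketch.
        self._run_sql = run_sql

    def execute(self, sql):
        try:
            return self._run_sql(sql)
        except AnalysisException as exc:
            # Re-raise as the error type the caller knows how to handle,
            # keeping the original exception as the cause.
            raise DbtRuntimeError(str(exc)) from exc


def failing_sql(sql):
    # Simulates Spark rejecting the statement.
    raise AnalysisException("SHOW TABLE EXTENDED is not supported for v2 tables.")


def demo():
    """Run a failing statement and report what the caller ends up seeing."""
    cursor = SessionCursor(failing_sql)
    try:
        cursor.execute("show table extended in dbt_tabular like '*'")
    except DbtRuntimeError as exc:
        return type(exc.__cause__).__name__, str(exc)
```

Calling demo() shows that the caller receives the wrapped error while the original exception stays reachable through __cause__.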

Steps To Reproduce

  1. Set up a local session connection
  2. Write a SQL statement that fails (for example, one with incorrect syntax)
  3. Run dbt run
  4. See the process crash

Relevant log output

➜  dbt-tabular git:(fd-fix) ✗ dbt run
09:56:58  Running with dbt=1.6.0-b1
09:56:59  Found 4 models, 3 tests, 0 snapshots, 0 analyses, 356 macros, 1 operation, 0 seed files, 0 sources, 0 exposures, 0 metrics, 0 groups
09:56:59  
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/05/17 11:57:00 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
09:57:04  
09:57:04  Finished running  in 0 hours 0 minutes and 4.92 seconds (4.92s).
09:57:04  Encountered an error:
SHOW TABLE EXTENDED is not supported for v2 tables.;
ShowTableExtended *, [namespace#6, tableName#7, isTemporary#8, information#9]
+- ResolvedNamespace org.apache.iceberg.spark.SparkCatalog@7d95f416, [dbt_tabular]

09:57:04  Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/cli/requires.py", line 86, in wrapper
    result, success = func(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/cli/requires.py", line 71, in wrapper
    return func(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/cli/requires.py", line 142, in wrapper
    return func(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/cli/requires.py", line 168, in wrapper
    return func(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/cli/requires.py", line 215, in wrapper
    return func(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/cli/requires.py", line 250, in wrapper
    return func(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/cli/main.py", line 566, in run
    results = task.run()
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/task/runnable.py", line 443, in run
    result = self.execute_with_hooks(selected_uids)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/task/runnable.py", line 408, in execute_with_hooks
    self.before_run(adapter, selected_uids)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/task/run.py", line 447, in before_run
    self.populate_adapter_cache(adapter, required_schemas)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/task/runnable.py", line 386, in populate_adapter_cache
    adapter.set_relations_cache(self.manifest)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/adapters/base/impl.py", line 462, in set_relations_cache
    self._relations_cache_for_schemas(manifest, required_schemas)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/adapters/base/impl.py", line 439, in _relations_cache_for_schemas
    for relation in future.result():
  File "/opt/homebrew/Cellar/python@3.9/3.9.16/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/opt/homebrew/Cellar/python@3.9/3.9.16/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/opt/homebrew/Cellar/python@3.9/3.9.16/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/utils.py", line 464, in connected
    return func(*args, **kwargs)
  File "/Users/fokkodriesprong/Desktop/dbt-spark/dbt/adapters/spark/impl.py", line 199, in list_relations_without_caching
    show_table_extended_rows = self.execute_macro(LIST_RELATIONS_MACRO_NAME, kwargs=kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/adapters/base/impl.py", line 1044, in execute_macro
    result = macro_function(**kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/clients/jinja.py", line 330, in __call__
    return self.call_macro(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/clients/jinja.py", line 257, in call_macro
    return macro(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/jinja2/runtime.py", line 763, in __call__
    return self._invoke(arguments, autoescape)
  File "/opt/homebrew/lib/python3.9/site-packages/jinja2/runtime.py", line 777, in _invoke
    rv = self._func(*arguments)
  File "<template>", line 21, in macro
  File "/opt/homebrew/lib/python3.9/site-packages/jinja2/sandbox.py", line 393, in call
    return __context.call(__obj, *args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/jinja2/runtime.py", line 298, in call
    return __obj(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/clients/jinja.py", line 330, in __call__
    return self.call_macro(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/clients/jinja.py", line 257, in call_macro
    return macro(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/jinja2/runtime.py", line 763, in __call__
    return self._invoke(arguments, autoescape)
  File "/opt/homebrew/lib/python3.9/site-packages/jinja2/runtime.py", line 777, in _invoke
    rv = self._func(*arguments)
  File "<template>", line 33, in macro
  File "/opt/homebrew/lib/python3.9/site-packages/jinja2/sandbox.py", line 393, in call
    return __context.call(__obj, *args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/jinja2/runtime.py", line 298, in call
    return __obj(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/clients/jinja.py", line 330, in __call__
    return self.call_macro(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/clients/jinja.py", line 257, in call_macro
    return macro(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/jinja2/runtime.py", line 763, in __call__
    return self._invoke(arguments, autoescape)
  File "/opt/homebrew/lib/python3.9/site-packages/jinja2/runtime.py", line 777, in _invoke
    rv = self._func(*arguments)
  File "<template>", line 52, in macro
  File "/opt/homebrew/lib/python3.9/site-packages/jinja2/sandbox.py", line 393, in call
    return __context.call(__obj, *args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/jinja2/runtime.py", line 298, in call
    return __obj(*args, **kwargs)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/adapters/base/impl.py", line 290, in execute
    return self.connections.execute(sql=sql, auto_begin=auto_begin, fetch=fetch, limit=limit)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/adapters/sql/connections.py", line 147, in execute
    _, cursor = self.add_query(sql, auto_begin)
  File "/opt/homebrew/lib/python3.9/site-packages/dbt/adapters/sql/connections.py", line 81, in add_query
    cursor.execute(sql, bindings)
  File "/Users/fokkodriesprong/Desktop/dbt-spark/dbt/adapters/spark/session.py", line 212, in execute
    self._cursor.execute(sql)
  File "/Users/fokkodriesprong/Desktop/dbt-spark/dbt/adapters/spark/session.py", line 116, in execute
    self._df = spark_session.sql(sql)
  File "/opt/homebrew/lib/python3.9/site-packages/pyspark/sql/session.py", line 1034, in sql
    return DataFrame(self._jsparkSession.sql(sqlQuery), self)
  File "/opt/homebrew/lib/python3.9/site-packages/py4j/java_gateway.py", line 1321, in __call__
    return_value = get_return_value(
  File "/opt/homebrew/lib/python3.9/site-packages/pyspark/sql/utils.py", line 196, in deco
    raise converted from None
pyspark.sql.utils.AnalysisException: SHOW TABLE EXTENDED is not supported for v2 tables.;
ShowTableExtended *, [namespace#6, tableName#7, isTemporary#8, information#9]
+- ResolvedNamespace org.apache.iceberg.spark.SparkCatalog@7d95f416, [dbt_tabular]

Environment

- OS: OSX
- Python: 3.9
- dbt-core: 1.6.0-b1
- dbt-spark: 1.6.0-b1

Additional Context

No response

dbeatty10 commented 1 year ago

Thank you for opening this issue and the associated PR @Fokko! 🏆

clintf1982 commented 1 year ago

Thank you for opening this issue; fixing it will also help the company I work at. As you said, this happens with a session type connection and not with Thrift, because the exception_handler in connections.py wraps the exception only in the Thrift case. I think the solution in the PR related to this issue is good.
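
To illustrate the point about the handler, here is a rough sketch of how a handler that only recognizes one error type lets another pass through unwrapped. This is not the real connections.py code: ThriftError, AnalysisException and DbtDatabaseError are local stand-ins, and this exception_handler and the run helper are hypothetical simplifications.

```python
from contextlib import contextmanager


class ThriftError(Exception):
    """Stand-in for an error raised by a Thrift-based connection."""


class AnalysisException(Exception):
    """Stand-in for pyspark.sql.utils.AnalysisException."""


class DbtDatabaseError(Exception):
    """Stand-in for a dbt-side database error type."""


@contextmanager
def exception_handler(sql):
    # Hypothetical handler that only knows about the Thrift error type.
    try:
        yield
    except ThriftError as exc:
        raise DbtDatabaseError(f"{exc} (while running: {sql})") from exc


def run(sql, error):
    """Raise `error` inside the handler and report which type escapes."""
    try:
        with exception_handler(sql):
            raise error
    except Exception as exc:
        return type(exc).__name__
```

In this sketch, run("select 1", ThriftError("boom")) comes back as the wrapped dbt-side type, while run("select 1", AnalysisException("boom")) escapes with its original type, which matches the behavior described in the issue.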

tanweipeng commented 9 months ago

Just curious: "SHOW TABLE EXTENDED is not supported for v2 tables." is something everyone is facing, right?