dbt-labs / dbt-spark

dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
https://getdbt.com
Apache License 2.0

[Feature] Support for Rapids (GPU) with Spark Thrift in DBT #1110

Open sjabber opened 1 month ago

sjabber commented 1 month ago

Is this a new bug in dbt-core?

Current Behavior

dbt throws an error when the Spark Thrift server uses GPU acceleration (Rapids).

Expected Behavior

When executing dbt run against a Spark Thrift server that uses the Rapids library, the following error occurs:

org.apache.hive.service.cli.HiveSQLException: Error running query: java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: com/nvidia/spark/rapids/RuleNotFoundExprMeta
    at org.apache.spark.sql.hive.thriftserver.HiveThriftServerErrors$.runningQueryError(HiveThriftServerErrors.scala:46)

Steps To Reproduce

My Environment Details:

- dbt version: 1.8.3
- Spark version: 3.5.0
- Rapids: rapids-4-spark_2.12-23.12.0.jar

When I create a dbt model and run it against the Spark Thrift server with the Rapids library loaded, the error shown above occurs.

Relevant log output

java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: com/nvidia/spark/rapids/RuleNotFoundExprMeta
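
A NoClassDefFoundError means the JVM could not load the named class at runtime, so one possible first check is whether the jar on the Thrift server's classpath contains that class at all. The following is a minimal sketch of such a check, not a confirmed diagnosis of this issue. The helper name find_class_entries and the jar path in the usage comment are hypothetical. The search accepts the class under any directory prefix, on the assumption that the jar may nest classes in version-specific subdirectories.

```python
import zipfile
from typing import List


def find_class_entries(jar_path: str, class_name: str) -> List[str]:
    """Return every entry in the jar that is the class file for class_name.

    class_name may use dots or slashes as separators, e.g.
    "com/nvidia/spark/rapids/RuleNotFoundExprMeta".
    """
    # Normalise "a.b.C" or "a/b/C" to the jar entry form "a/b/C.class".
    suffix = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:  # a jar is a zip archive
        return [
            name
            for name in jar.namelist()
            # Match at the jar root or under any directory prefix.
            if name == suffix or name.endswith("/" + suffix)
        ]


# Hypothetical usage; adjust the path to wherever the jar actually lives:
# matches = find_class_entries(
#     "/opt/spark/jars/rapids-4-spark_2.12-23.12.0.jar",
#     "com/nvidia/spark/rapids/RuleNotFoundExprMeta",
# )
# print(matches or "class not found in jar")
```

An empty result would suggest the jar itself lacks the class. A non-empty result would point instead toward a classpath or class-loader problem on the Thrift server, which could be confirmed by running the same query through another Thrift client without dbt.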

Environment

- OS: Rocky Linux 8.5 (Green Obsidian)
- Python: 3.8.13
- dbt-core: 1.8.6
- dbt-spark: 1.8.0

Which database adapter are you using with dbt?

spark

Additional Context

I previously posted this issue in the wrong repository by mistake, so I am filing it again here. Original report: [dbt not support rapids spark](https://github.com/dbt-labs/docs.getdbt.com/issues/6102#issue-2537608927)