NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0

[BUG] 24.12 Precommit fails with wrong number of arguments in `GpuDataSource` #11684

Closed kuhushukla closed 2 weeks ago

kuhushukla commented 3 weeks ago

Describe the bug

The precommit build fails with the following error for PRs targeting 24.12:

/home/runner/work/spark-rapids/spark-rapids/sql-plugin/src/main/spark332db/scala/org/apache/spark/sql/rapids/shims/GpuDataSource.scala:87: wrong number of arguments for pattern org.apache.spark.sql.execution.datasources.LogicalRelation(relation: org.apache.spark.sql.sources.BaseRelation, output: Seq[org.apache.spark.sql.catalyst.expressions.AttributeReference], catalogTable: Option[org.apache.spark.sql.catalyst.catalog.CatalogTable], isStreaming: Boolean, stream: Option[org.apache.spark.sql.connector.read.streaming.SparkDataStream])

This looks related to Spark 4.0 changes.

I am unable to reproduce this locally yet; I will update when I can.

gerashegalov commented 3 weeks ago

To repro locally, run:

JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64 mvn verify -f scala2.13 -Dbuildver=400 -DskipTests

Spark 4.0.0 needs a dedicated `GpuDataSource` shim since https://github.com/apache/spark/pull/48676/files#diff-4b3394a9d90b1036245e84e844590e5a6e8dd980c21e535fa95eeb0e239cfb52R44
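
For anyone unfamiliar with this class of failure, here is a minimal, self-contained sketch of the arity mismatch. The case classes are hypothetical stand-ins, not the real Spark `LogicalRelation`. The five-field shape is copied from the signature printed in the error. The four-field "before" shape is an assumption inferred from that error, not something checked against the shim source.

```scala
// Hypothetical stand-ins for illustration only; these are NOT Spark classes.
object LogicalRelationAritySketch {
  final case class BaseRelation(name: String)

  // Assumed shape that the shared shim source was written against: four fields.
  final case class LogicalRelationV3(
      relation: BaseRelation,
      output: Seq[String],
      catalogTable: Option[String],
      isStreaming: Boolean)

  // Shape reported by the compiler error: a fifth `stream` field.
  final case class LogicalRelationV4(
      relation: BaseRelation,
      output: Seq[String],
      catalogTable: Option[String],
      isStreaming: Boolean,
      stream: Option[String])

  // Constructor pattern with four sub-patterns: fine for the four-field class.
  def relationNameV3(plan: Any): Option[String] = plan match {
    case LogicalRelationV3(rel, _, _, _) => Some(rel.name)
    case _ => None
  }

  // The same four-slot pattern against the five-field class does not compile:
  //   case LogicalRelationV4(rel, _, _, _) => ...
  //   // error: wrong number of arguments for pattern LogicalRelationV4(...)
  // A version-specific shim supplies the extra slot instead.
  def relationNameV4(plan: Any): Option[String] = plan match {
    case LogicalRelationV4(rel, _, _, _, _) => Some(rel.name)
    case _ => None
  }

  // Alternative: a typed pattern plus field access does not depend on arity.
  def relationNameAnyArity(plan: Any): Option[String] = plan match {
    case r: LogicalRelationV3 => Some(r.relation.name)
    case r: LogicalRelationV4 => Some(r.relation.name)
    case _ => None
  }
}
```

Whether the actual fix should be a per-version shim file or an arity-independent match is a design choice for the plugin; the sketch only shows why a constructor pattern shared across Spark versions breaks when the case class gains a field.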