Compiling and running datafusion-comet on AWS EMR release emr-6.15.0 (Spark 3.4.1) fails at query time with a scala.MatchError.
How to reproduce the issue:
scala> (0 until 10).toDF("a").write.mode("overwrite").parquet("/tmp/test")
scala> spark.read.parquet("/tmp/test").createOrReplaceTempView("t1")
scala> spark.sql("select * from t1 where a > 5").show
scala.MatchError: 8 (of class java.lang.Integer)
at org.apache.comet.shims.ShimCometScanExec.$anonfun$newFileScanRDD$1(ShimCometScanExec.scala:73)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
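Cause of the problem:
The file spark-sql_2.12-3.4.1-amzn-2.jar is a custom Amazon build of Spark. Its class org.apache.spark.sql.execution.datasources.FileScanRDD has two constructors, one with 6 parameters and one with 8 parameters. The "MatchError: 8" raised in ShimCometScanExec.newFileScanRDD suggests that the shim matches on the constructor parameter count and has no case for the 8-parameter constructor.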
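To illustrate the suspected failure mode, here is a minimal, self-contained sketch. It is not the Comet source. FakeScanRDD, strictArities and tolerantArities are hypothetical names made up for this example, and the sketch assumes the shim iterates over all declared constructors and pattern-matches on their parameter count.

```scala
import scala.util.Try

// Hypothetical stand-in for a class that, like the EMR build of FileScanRDD,
// exposes two constructors: one with 6 parameters and one with 8.
class FakeScanRDD(a: Int, b: Int, c: Int, d: Int, e: Int, f: Int) {
  def this(a: Int, b: Int, c: Int, d: Int, e: Int, f: Int, g: Int, h: Int) =
    this(a, b, c, d, e, f)
}

object ConstructorMatchSketch {
  // Every constructor declared on the class, in whatever order the JVM returns them.
  private def ctors = classOf[FakeScanRDD].getDeclaredConstructors.toSeq

  // Assumed shape of the failing code: a non-exhaustive match on the
  // parameter count. Any arity without a case throws scala.MatchError.
  def strictArities(): Seq[String] =
    ctors.map { c =>
      c.getParameterCount match {
        case 6 => "six-parameter constructor"
      }
    }

  // Tolerant variant: collect silently skips arities that have no case,
  // so an unexpected extra constructor no longer aborts the lookup.
  def tolerantArities(): Seq[String] =
    ctors.map(_.getParameterCount).collect {
      case 6 => "six-parameter constructor"
    }

  def main(args: Array[String]): Unit = {
    println(Try(strictArities()))   // a Failure wrapping the MatchError
    println(tolerantArities())      // only the supported constructor
  }
}
```

Skipping unknown arities only helps if the 6-parameter constructor is still usable on EMR. If the 8-parameter constructor has to be called instead, the shim would need an explicit case for it, and the arguments to pass depend on the Amazon build's signature, which is not shown in this report.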
Expected behavior
No response
Additional context
No response