NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0

[BUG] AQE does not work with Spark 3.2 due to unrecognized GPU partitioning #3384

Closed: andygrove closed this issue 3 years ago

andygrove commented 3 years ago

Describe the bug

This sample failure from CostBasedOptimizerSuite shows that Spark does not recognize GPU partitioning. It is probably unrelated to the cost-based optimizer (CBO) and likely shows up only because this test does not disable AQE, whereas most of our tests do.

- Avoid transition to GPU for trivial projection after CPU SMJ *** FAILED ***
  java.lang.IllegalStateException: Unexpected partitioning for coalesced shuffle read: gpuhashpartitioning(strings#7, 200)
  at org.apache.spark.sql.execution.adaptive.AQEShuffleReadExec.outputPartitioning$lzycompute(AQEShuffleReadExec.scala:86)
  at org.apache.spark.sql.execution.adaptive.AQEShuffleReadExec.outputPartitioning(AQEShuffleReadExec.scala:55)
  at org.apache.spark.sql.execution.exchange.ValidateRequirements$.$anonfun$validateInternal$4(ValidateRequirements.scala:61)
  at org.apache.spark.sql.execution.exchange.ValidateRequirements$.$anonfun$validateInternal$4$adapted(ValidateRequirements.scala:59)
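
The trace suggests that the coalesced shuffle read derives its output partitioning by matching on the child's partitioning and throws for any type it does not know. The following is a minimal, self-contained sketch of that failure mode, not Spark's actual code: the `Partitioning` hierarchy, `GpuHashPartitioning` stand-in, and `CoalescedReadModel` are hypothetical names introduced for illustration only.

```scala
// Simplified, hypothetical model. These are NOT Spark's or the plugin's real classes.
sealed trait Partitioning

// Stand-ins for partitionings the CPU planner knows about.
final case class HashPartitioning(exprs: Seq[String], numPartitions: Int) extends Partitioning
case object SinglePartition extends Partitioning

// Stand-in for a plugin-defined partitioning that the match below has no case for.
final case class GpuHashPartitioning(exprs: Seq[String], numPartitions: Int) extends Partitioning {
  override def toString: String =
    s"gpuhashpartitioning(${exprs.mkString(", ")}, $numPartitions)"
}

object CoalescedReadModel {
  // Coalescing changes only the partition count, so known partitionings are
  // copied with the new count; anything unrecognized is rejected.
  def outputPartitioning(child: Partitioning, coalescedPartitions: Int): Partitioning =
    child match {
      case h: HashPartitioning => h.copy(numPartitions = coalescedPartitions)
      case SinglePartition     => SinglePartition
      case other =>
        throw new IllegalStateException(
          s"Unexpected partitioning for coalesced shuffle read: $other")
    }
}
```

In this model, passing a `GpuHashPartitioning` falls through to the final case and raises the same kind of `IllegalStateException` seen above, while a CPU `HashPartitioning` is simply resized.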

Steps/Code to reproduce bug

Run CostBasedOptimizerSuite
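
Because the failure appears to depend on AQE being active, it may help to set the AQE flag explicitly when reproducing outside the test suite. A minimal sketch, assuming Spark 3.2 is on the classpath; the object name, application name, and local master are illustrative, and the RAPIDS plugin settings needed to get GPU plans are deliberately left out:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical driver object, for illustration only.
object AqeReproSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")             // illustrative: run locally
      .appName("aqe-repro-sketch")    // illustrative name
      // AQE must be enabled for the coalesced shuffle read to be planned;
      // set this to "false" to check whether the failure goes away.
      .config("spark.sql.adaptive.enabled", "true")
      .getOrCreate()

    println(spark.conf.get("spark.sql.adaptive.enabled"))
    spark.stop()
  }
}
```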

Expected behavior

Tests should pass

Environment details (please complete the following information)

N/A

Additional context

N/A

andygrove commented 3 years ago

Root cause: https://issues.apache.org/jira/browse/SPARK-36666

andygrove commented 3 years ago

This is now resolved.