Open · tgravescs opened 4 years ago
CartesianProductExec is disabled by default in the plugin, so this probably isn't super high priority, but it is still an issue. Here is the exception:
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 30.0 failed 1 times, most recent failure: Lost task 0.0 in stage 30.0 (TID 23, ip-10-59-250-3.us-west-2.compute.internal, executor driver): java.lang.ArrayIndexOutOfBoundsException: 0
E at ai.rapids.cudf.Table.<init>(Table.java:52)
E at com.nvidia.spark.rapids.GpuColumnVector.from(GpuColumnVector.java:253)
E at org.apache.spark.sql.rapids.execution.GpuBroadcastNestedLoopJoinExecBase$.$anonfun$innerLikeJoin$2(GpuBroadcastNestedLoopJoinExec.scala:107)
E at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
E at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
E at org.apache.spark.sql.rapids.execution.GpuBroadcastNestedLoopJoinExecBase$.withResource(GpuBroadcastNestedLoopJoinExec.scala:92)
E at org.apache.spark.sql.rapids.execution.GpuBroadcastNestedLoopJoinExecBase$.$anonfun$innerLikeJoin$1(GpuBroadcastNestedLoopJoinExec.scala:106)
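For reference, a rough way to exercise this code path is to enable the exec and force a cross join. This is a minimal sketch, not a verified repro: it assumes the plugin jars are on the classpath and that CartesianProductExec is opted in via the per-exec config key shown below.

```python
# Hypothetical repro sketch (assumes the RAPIDS plugin is on the classpath and
# that the per-exec enable key spark.rapids.sql.exec.CartesianProductExec applies).
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("cartesian-repro")
         .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
         .config("spark.rapids.sql.enabled", "true")
         .config("spark.rapids.sql.exec.CartesianProductExec", "true")
         .getOrCreate())

left = spark.range(100).toDF("a")
right = spark.range(100).toDF("b")

# A crossJoin with no join condition should plan a CartesianProductExec;
# collecting the result drives the GPU cartesian/nested-loop join path.
left.crossJoin(right).collect()
```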
Running on Databricks the tests fail:

FAILED src/main/python/join_test.py::test_cartesean_join_special_case[String][IGNORE_ORDER({'local': True})]
FAILED src/main/python/join_test.py::test_cartesean_join_special_case[Byte][IGNORE_ORDER({'local': True})]
FAILED src/main/python/join_test.py::test_cartesean_join_special_case[Short][IGNORE_ORDER({'local': True})]
FAILED src/main/python/join_test.py::test_cartesean_join_special_case[Integer][IGNORE_ORDER({'local': True})]
FAILED src/main/python/join_test.py::test_cartesean_join_special_case[Long][IGNORE_ORDER({'local': True})]
FAILED src/main/python/join_test.py::test_cartesean_join_special_case[Boolean][IGNORE_ORDER({'local': True})]
FAILED src/main/python/join_test.py::test_cartesean_join_special_case[Date][IGNORE_ORDER({'local': True})]
FAILED src/main/python/join_test.py::test_cartesean_join_special_case[Timestamp][IGNORE_ORDER({'local': True})]
FAILED src/main/python/join_test.py::test_cartesean_join_special_case[Float][IGNORE_ORDER({'local': True}), INCOMPAT]
FAILED src/main/python/join_test.py::test_cartesean_join_special_case[Double][IGNORE_ORDER({'local': True}), INCOMPAT]
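These tests all check CPU/GPU parity on a cross join over different primitive types. A simplified sketch of what they assert is below; it assumes a session configured as in the earlier repro sketch, and it is not the real test body (the actual test uses the integration framework's assert_gpu_and_cpu_are_equal_collect helper with generated data).

```python
# Hypothetical parity check mirroring what the failing tests verify:
# the cross-join result should match with the plugin enabled and disabled.
def cross_join_rows(spark, gpu_enabled):
    # spark.rapids.sql.enabled toggles the plugin's SQL acceleration at runtime.
    spark.conf.set("spark.rapids.sql.enabled", str(gpu_enabled).lower())
    left = spark.range(50).toDF("a")
    right = spark.range(25).toDF("b")
    # Sort on the driver since row order is not guaranteed (IGNORE_ORDER).
    return sorted(left.crossJoin(right).collect())

assert cross_join_rows(spark, False) == cross_join_rows(spark, True)
```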
One of the stack traces had an ArrayIndexOutOfBoundsException, so I'm wondering if there is a difference in the Spark version.
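One quick way to compare is to print the exact Spark build in each environment (Databricks runtimes ship patched Spark builds, which is why version differences matter here). A small check, assuming Spark 3.0+ for the SQL version() function:

```python
# Print the Spark build string in each environment to compare.
print(spark.version)
# On Spark 3.0+ this also shows the git revision of the build.
spark.sql("SELECT version()").show(truncate=False)
```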