Describe the bug
Currently, in AutoTuner, while recommending shuffle partitions, we read the existing value of spark.sql.shuffle.partitions and convert it to an Integer. However, for Databricks event logs this value may be "auto". In that case, a NumberFormatException is thrown.
Detailed Output
| java.lang.NumberFormatException: For input string: "auto"
| at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_422]
| at java.lang.Integer.parseInt(Integer.java:580) ~[?:1.8.0_422]
| at java.lang.Integer.parseInt(Integer.java:615) ~[?:1.8.0_422]
| at scala.collection.immutable.StringLike.toInt(StringLike.scala:310) ~[scala-library-2.12.18.jar:?]
| at scala.collection.immutable.StringLike.toInt$(StringLike.scala:310) ~[scala-library-2.12.18.jar:?]
| at scala.collection.immutable.StringOps.toInt(StringOps.scala:33) ~[scala-library-2.12.18.jar:?]
| at com.nvidia.spark.rapids.tool.profiling.AutoTuner.recommendShufflePartitions(AutoTuner.scala:1008) ~[rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar:?]
| at com.nvidia.spark.rapids.tool.profiling.AutoTuner.calculateJobLevelRecommendations(AutoTuner.scala:723) ~[rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar:?]
| at com.nvidia.spark.rapids.tool.profiling.AutoTuner.getRecommendedProperties(AutoTuner.scala:1163) ~[rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar:?]
| at com.nvidia.spark.rapids.tool.tuning.QualificationAutoTuner.runAutoTuner(QualificationAutoTuner.scala:70) ~[rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar:?]
| at com.nvidia.spark.rapids.tool.tuning.TunerContext$$anonfun$tuneApplication$1.$anonfun$applyOrElse$1(TunerContext.scala:60) ~[rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar:?]
| at scala.util.Try$.apply(Try.scala:213) ~[scala-library-2.12.18.jar:?]
| at com.nvidia.spark.rapids.tool.tuning.TunerContext$$anonfun$tuneApplication$1.applyOrElse(TunerContext.scala:60) ~[rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar:?]
| at com.nvidia.spark.rapids.tool.tuning.TunerContext$$anonfun$tuneApplication$1.applyOrElse(TunerContext.scala:57) ~[rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar:?]
| at scala.PartialFunction$Lifted.apply(PartialFunction.scala:228) ~[scala-library-2.12.18.jar:?]
| at scala.PartialFunction$Lifted.apply(PartialFunction.scala:224) ~[scala-library-2.12.18.jar:?]
| at scala.Option.collect(Option.scala:432) ~[scala-library-2.12.18.jar:?]
| at com.nvidia.spark.rapids.tool.tuning.TunerContext.tuneApplication(TunerContext.scala:57) ~[rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar:?]
| at com.nvidia.spark.rapids.tool.qualification.Qualification.$anonfun$qualifyApp$6(Qualification.scala:184) ~[rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar:?]
| at scala.Option.foreach(Option.scala:407) ~[scala-library-2.12.18.jar:?]
| at com.nvidia.spark.rapids.tool.qualification.Qualification.$anonfun$qualifyApp$5(Qualification.scala:179) ~[rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar:?]
| at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) [scala-library-2.12.18.jar:?]
| at com.nvidia.spark.rapids.tool.qualification.AppSubscriber$.withSafeValidAttempt(AppSubscriber.scala:57) [rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar:?]
| at com.nvidia.spark.rapids.tool.qualification.Qualification.com$nvidia$spark$rapids$tool$qualification$Qualification$$qualifyApp(Qualification.scala:178) [rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar:?]
| at com.nvidia.spark.rapids.tool.qualification.Qualification$QualifyThread.run(Qualification.scala:50) [rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar:?]
| at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_422]
| at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_422]
| at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_422]
| at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_422]
| at java.lang.Thread.run(Thread.java:750) [?:1.8.0_422]
Expected Behavior
The exception should not be thrown. AutoTuner should handle non-numeric values of spark.sql.shuffle.partitions (such as "auto") gracefully instead of crashing.
Additional Context
Auto Optimized Shuffle - https://docs.databricks.com/en/optimizations/aqe.html#enable-auto-optimized-shuffle
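One way to avoid the crash is a defensive parse that falls back to a default when the configured value is non-numeric. This is only a minimal sketch, not the actual AutoTuner code; the object name, helper, and the fallback of 200 (Spark's default for spark.sql.shuffle.partitions) are assumptions for illustration:

```scala
import scala.util.Try

// Hypothetical helper: parse spark.sql.shuffle.partitions defensively.
// Databricks auto-optimized shuffle can set this config to "auto",
// which Integer parsing rejects with a NumberFormatException.
object ShufflePartitionsParser {
  // Assumed fallback: Spark's default shuffle partition count.
  val DefaultShufflePartitions: Int = 200

  def parseShufflePartitions(value: String): Int =
    Try(value.trim.toInt).getOrElse(DefaultShufflePartitions)
}
```

With this approach, a numeric string is parsed as before, while "auto" (or any other non-numeric value) silently falls back to the default rather than throwing.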