NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[FEA] Improve or remove AutoTuner comments #1027

Open tgravescs opened 1 month ago

tgravescs commented 1 month ago

Is your feature request related to a problem? Please describe.

The AutoTuner recommends configs and then adds comments about them. Many of the comments are just "was not set".

That really isn't useful to the user. We should either explain why to set it or perhaps just remove the comment.

- 'spark.rapids.memory.pinnedPool.size' was not set.
- 'spark.rapids.shuffle.multiThreaded.reader.threads' was not set.
- 'spark.rapids.shuffle.multiThreaded.writer.threads' was not set.
- 'spark.rapids.sql.batchSizeBytes' was not set.
- 'spark.rapids.sql.concurrentGpuTasks' was not set.

amahussein commented 1 month ago

At some point, we received a request to list any configuration that was not initially set. I guess that in the context of the Qualification AutoTuner, doing that for all GPU-related configs could be cumbersome. However, we will still need to show the diff: configs introduced, configs removed (and why), and configs that were updated.

Anyway, that can be an easy fix. AutoTuner has a config to list only the items that were modified:

  // When enabled, the profiler recommendations should only include updated settings.
  private var filterByUpdatedPropertiesEnabled: Boolean = true
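
To make the "diff" idea concrete, here is a minimal sketch of how entries could be classified as introduced, removed, updated, or unchanged, and then filtered down to the changed ones. `Entry`, `Change`, and `RecommendationDiff` are hypothetical names made up for this illustration; they are not the AutoTuner's actual types, and the real filter may work differently.

```scala
// Hypothetical, simplified stand-in for a recommendation record (not the real AutoTuner class).
final case class Entry(key: String, original: Option[String], recommended: Option[String])

sealed trait Change
case object Introduced extends Change // not set originally, now recommended
case object Removed extends Change    // set originally, no longer recommended
case object Updated extends Change    // set originally, recommended value differs
case object Unchanged extends Change  // nothing to report

object RecommendationDiff {
  // Compare the original value against the recommended one.
  def classify(e: Entry): Change = (e.original, e.recommended) match {
    case (None, Some(_))              => Introduced
    case (Some(_), None)              => Removed
    case (Some(o), Some(r)) if o != r => Updated
    case _                            => Unchanged
  }

  // Keep only the entries whose recommendation differs from the original setting.
  def onlyChanged(entries: Seq[Entry]): Seq[(Entry, Change)] =
    entries.map(e => e -> classify(e)).filter(_._2 != Unchanged)
}
```

With a classification like this, the report could label each entry with its kind of change instead of emitting a generic comment.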

AutoTuner appends some of those comments automatically. We can add a config property to enable or disable this behavior.

  def appendRecommendation(key: String, value: String): Unit = {
    if (!skippedRecommendations.contains(key)) {
      val recomRecord = recommendations.getOrElseUpdate(key,
        new RecommendationEntry(key, getPropertyValue(key), None))
      if (value != null) {
        recomRecord.setRecommendedValue(value)
        if (recomRecord.original.isEmpty) {
          // add a comment that the value was missing in the cluster properties
          appendComment(s"'$key' was not set.")
        }
      }
    }
  }

And

  private def addDefaultComments(): Unit = {
    commentsForMissingProps.foreach {
      case (key, value) =>
        if (!skippedRecommendations.contains(key)) {
          appendComment(value)
        }
    }
  }
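
As a rough sketch of the two options discussed above (explain why, or drop the comment), the logic could look something like the following. This is a self-contained model, not a patch against the real AutoTuner: the class `CommentSketch`, the flag `missingPropCommentsEnabled`, and the `explanations` map are all assumptions introduced for illustration.

```scala
import scala.collection.mutable

// Hypothetical, self-contained model of the comment logic; not the actual AutoTuner class.
class CommentSketch(
    originalProps: Map[String, String],
    // Assumed new switch: when false, the bare "was not set" comments are skipped.
    missingPropCommentsEnabled: Boolean = true,
    // Assumed optional per-key explanations of why a property should be set.
    explanations: Map[String, String] = Map.empty) {

  val recommendations = mutable.LinkedHashMap.empty[String, String]
  val comments = mutable.ListBuffer.empty[String]

  def appendRecommendation(key: String, value: String): Unit = {
    if (value != null) {
      recommendations(key) = value
      if (!originalProps.contains(key)) {
        explanations.get(key) match {
          // Prefer a comment that tells the user why the property matters.
          case Some(why) => comments += s"'$key' was not set. $why"
          // Otherwise emit the bare comment only when the switch is on.
          case None if missingPropCommentsEnabled => comments += s"'$key' was not set."
          // Switch is off and there is no explanation: stay silent.
          case None => ()
        }
      }
    }
  }
}
```

Constructing it with `missingPropCommentsEnabled = false` still records every recommendation but produces no "was not set" lines, while keys that have an entry in `explanations` keep an explanatory comment either way.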