NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

Split job and stage level aggregated metrics into different files #1050

Closed (parthosa closed 1 month ago)

parthosa commented 1 month ago

Fixes #1017. See the issue description for details on the subtasks.

Changes:

  1. Split the job_+_stage_level_aggregated_task_metrics.csv file generated by the Profiling tool into two separate files, one for job level metrics and one for stage level metrics.
  2. Updated the spill heuristics method to use the new stage level aggregated metrics.
  3. Updated the prediction code to handle the separate metrics. This code will be refactored later, when it is brought up to date with the latest changes.
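
For readers who have not looked at the combined file, the split in change 1 can be illustrated roughly as follows. This is only a sketch, not the code in this PR: it assumes the combined CSV has an ID column whose values are prefixed with job_ or stage_, and the function name, column names, and sample rows are hypothetical.

```python
import csv
import io


def split_aggregated_metrics(combined_csv: str, id_column: str = "ID"):
    """Split combined metrics text into (job_csv, stage_csv) text.

    Assumption: each row's ID value starts with "job_" or "stage_".
    Rows with any other prefix are skipped.
    """
    reader = csv.DictReader(io.StringIO(combined_csv))
    fieldnames = reader.fieldnames or []

    # One in-memory output per level, each with the same header as the input.
    outputs = {"job": io.StringIO(), "stage": io.StringIO()}
    writers = {}
    for level, buf in outputs.items():
        writers[level] = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
        writers[level].writeheader()

    for row in reader:
        level = row[id_column].split("_", 1)[0]  # "job_0" -> "job"
        if level in writers:
            writers[level].writerow(row)

    return outputs["job"].getvalue(), outputs["stage"].getvalue()


# Hypothetical sample data, not taken from a real profiling run.
sample = (
    "appIndex,ID,numTasks,diskBytesSpilled_sum\n"
    "1,job_0,8,0\n"
    "1,stage_0,4,0\n"
    "1,stage_1,4,1024\n"
)
job_csv, stage_csv = split_aggregated_metrics(sample)
print(job_csv)
print(stage_csv)
```

With separate outputs, a consumer such as the spill heuristic can read only the stage level rows without first filtering on the ID prefix.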