NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

Add qualification support for Photon jobs in the Python Tool #1409

Closed parthosa closed 1 week ago

parthosa commented 3 weeks ago

Issue #251.

This PR introduces support for recommending Photon applications, using a separate strategy for categorizing them:

Additionally, the Small category for Photon applications is different from that of Spark-based applications:

Note

Output

  {
    "appId": "app-20240818062343-0000",
    "appName": "Databricks Shell",
    "eventLog": "file:/path/to/log/photon_eventlog",
    "sparkRuntime": "PHOTON",
    "estimatedGpuSpeedupCategory": "Not Recommended"
  }
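As an illustration only, a downstream consumer could branch on the `sparkRuntime` field of a record shaped like the output above. The helper name `is_photon_app` and the fallback to `"SPARK"` when the field is missing are assumptions made for this sketch, not part of the PR.

```python
import json

# Record copied from the sample output above.
SAMPLE_RECORD = """
{
  "appId": "app-20240818062343-0000",
  "appName": "Databricks Shell",
  "eventLog": "file:/path/to/log/photon_eventlog",
  "sparkRuntime": "PHOTON",
  "estimatedGpuSpeedupCategory": "Not Recommended"
}
"""


def is_photon_app(record: dict) -> bool:
    """Hypothetical helper: True when the record reports a Photon runtime.

    Assumption: a record without "sparkRuntime" is treated as plain Spark.
    """
    runtime = str(record.get("sparkRuntime", "SPARK")).upper()
    return runtime == "PHOTON"


record = json.loads(SAMPLE_RECORD)
print(is_photon_app(record))
```

Keeping the check in one small function means the Photon-specific categorization can be selected in a single place, whichever component ends up writing the field.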

Changes

Enhancements and New Features:

Refactoring and Utility Improvements:

Tests:

Follow Up

parthosa commented 2 weeks ago

After offline discussions with @amahussein and @leewyang, we are moving the detection of the runtime (Spark/Photon/Velox) to Scala.

This PR will be refactored afterwards.

parthosa commented 1 week ago

@amahussein

Is there another follow-up PR to change the QualX module to read app_meta.json to decide whether this app is Photon or not?

I am concerned about how we can troubleshoot and validate app_meta.json. The wrapper reads the autotuner's output and copies some of the fields to that file in the upper level. With this PR, we are adding a new field derived from Python logic. Later, we will hit the question "Where does each field come from?" (this becomes even more challenging if fields can be overridden by the Python wrapper).