Issue 1:
In QualX, we currently fall back to legacy speedups when metrics are unavailable (file not found, or empty after preprocessing). This PR updates the prediction code to return a speedup of 1 for such apps and to log the reason the metrics are missing.
We also introduce a wasPredicted column in per_app.csv as a marker for apps that could not be predicted.
Affects:
predict()
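The fallback described above can be sketched roughly as follows. This is a minimal illustration and not the actual QualX code: the function name `predict_with_fallback`, the `model_fn` callable, the logger name, and the `appId` and `speedup` keys are all assumptions made for the example. Only the `wasPredicted` marker and the speedup of 1 come from this PR.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("qualx_sketch")


def predict_with_fallback(app_ids, features_by_app, model_fn):
    """Return one result row per app.

    Apps with no usable features get a speedup of 1.0 and
    wasPredicted=False instead of a legacy speedup.
    """
    rows = []
    for app_id in app_ids:
        features = features_by_app.get(app_id)
        if not features:
            # Missing or empty metrics: log the reason and fall back to 1.0.
            logger.warning(
                "Predicted speedup will be 1.0 for %s. Reason: %s",
                app_id,
                "Missing features after preprocessing.",
            )
            rows.append({"appId": app_id, "speedup": 1.0, "wasPredicted": False})
        else:
            # model_fn stands in for the real model call (assumption).
            speedup = float(model_fn(features))
            rows.append({"appId": app_id, "speedup": speedup, "wasPredicted": True})
    return rows
```

Keeping the marker next to the speedup lets a reader of per_app.csv tell a genuine prediction of 1.0 apart from a fallback value.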
Issue 2:
In QualX, the CSV metrics from the profiling tool do not include an app duration for incomplete applications. The qualification tool provides an estimated app duration for these.
This PR updates QualX to replace the incorrect app duration in the CSV metrics with the estimated duration from the qualification tool output.
Affects:
train(), compare() and predict()
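A rough sketch of the duration replacement, under stated assumptions: the helper name `apply_estimated_durations` and the keys `appId`, `appDuration`, and `estimatedDuration` are hypothetical and chosen for illustration, and rows are plain dicts rather than whatever table type QualX actually uses.

```python
def apply_estimated_durations(profile_rows, qual_rows):
    """Return copies of profile_rows with appDuration overridden.

    The override applies only where the qualification output provides
    an estimated duration for the same app; other rows are unchanged.
    """
    # Map app id -> estimated duration from the qualification output.
    estimated = {
        row["appId"]: row["estimatedDuration"]
        for row in qual_rows
        if row.get("estimatedDuration") is not None
    }
    fixed = []
    for row in profile_rows:
        new_row = dict(row)  # copy so the input rows are not mutated
        if row["appId"] in estimated:
            new_row["appDuration"] = estimated[row["appId"]]
        fixed.append(new_row)
    return fixed
```

Applying the same helper before training, comparison, and prediction would keep the duration consistent across all three code paths listed above.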
Output:
CASE 1: No supported stages for all apps in the dataset (in this case, a single eventlog):
WARNING spark_rapids_tools.tools.qualx.preprocess: Predicted speedup will be 1.0 for application_171615xxxx. Reason: No fully supported stages found.
WARNING spark_rapids_tools.tools.qualx.qualx_main: Predicted speedup will be 1.0 for dataset: qual_20240607xxxx. Check logs for details.
CASE 2: Metrics unavailable for all apps in the dataset (in this case, a single eventlog):
WARNING spark_rapids_tools.tools.qualx.preprocess: Predicted speedup will be 1.0 for application_1715312822xxx. Reason: Empty feature tables found after preprocessing: application_information, sql_plan_metrics_for_application, job_+_stage_level_aggregated_task_metrics.
WARNING spark_rapids_tools.tools.qualx.qualx_main: Predicted speedup will be 1.0 for dataset: qual_202406071648xxx. Check logs for details.
CASE 3: Metrics unavailable for some apps in the dataset (the exact reason cannot be determined, so a broad reason is shown):
WARNING spark_rapids_tools.tools.qualx.preprocess: Predicted speedup will be 1.0 for application_1715312822xxx, application_1715312822xxx. Reason: Missing features after preprocessing.
Fixes #1058.
Predicted CSV File:
per_app.csv