NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

Hook up the auto tuner in the qualification tool #1039

Closed by tgravescs 4 months ago

tgravescs commented 4 months ago

This hooks up the metrics to the auto tuner so that it can run in the Qualification tool on CPU event logs and output valid recommendations. Most of the metrics are the same as in the profiling tool, except for the ones we look at for spill.

I didn't want to conflict with other changes going on with the combined qual/profiling tool, so some code here was copied and should eventually be made common or moved into base classes shared between profiling and qualification. Examples are getDataSourceInfo and some of the information in QualAppSummaryInfoProvider.
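As a rough illustration of the refactoring direction described above, here is a minimal Python sketch of a shared base class whose subclasses differ only in how they read spill. It is not the project's code: every class name, method name, and metric key below is hypothetical and chosen only for the example.

```python
from abc import ABC, abstractmethod


class AppSummaryInfoProviderSketch(ABC):
    """Hypothetical shared base: logic common to profiling and qualification."""

    def __init__(self, metrics):
        # metrics: plain dict of metric name -> value (names are made up here)
        self.metrics = metrics

    @abstractmethod
    def spill_bytes(self):
        """Each tool decides which metrics count as spill."""

    def has_spill(self):
        # Shared logic lives in the base class and relies on the override.
        return self.spill_bytes() > 0


class ProfilingProviderSketch(AppSummaryInfoProviderSketch):
    def spill_bytes(self):
        # Assumption: one spill metric is read on the profiling side.
        return self.metrics.get("profiling_spill_bytes", 0)


class QualProviderSketch(AppSummaryInfoProviderSketch):
    def spill_bytes(self):
        # Assumption: CPU event logs expose memory and disk spill separately.
        return (self.metrics.get("memory_spill_bytes", 0)
                + self.metrics.get("disk_spill_bytes", 0))
```

With this shape, only the spill lookup is tool specific, and helpers such as a data source lookup could sit in the base class instead of being copied.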

There are some TODO comments in the autotuner which will be addressed in other heuristic issues under https://github.com/NVIDIA/spark-rapids-tools/issues/907

The default GPU type for onprem was A100. I changed that to L4, but because we don't have speedup factors for L4, I added a new function to get the speedup factors from another GPU type. This is meant to be temporary, since we are deprecating the use of these speedup factors anyway.
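The fallback idea can be sketched roughly as follows. This is an illustrative Python sketch only, not the function added in this PR: the table contents, the numeric values, the function name, and the choice of which GPU type L4 borrows from are all assumptions made up for the example.

```python
# Hypothetical speedup-factor table keyed by GPU type.
# The operator names and numbers are placeholders, not measured values.
SPEEDUP_FACTORS = {
    "a100": {"FilterExec": 2.0, "SortExec": 3.0},
    "t4": {"FilterExec": 1.5, "SortExec": 2.0},
}

# Hypothetical mapping from a GPU type without its own factors to the
# type it borrows from. Which type L4 should borrow from is an assumption.
FALLBACK_GPU_TYPE = {"l4": "a100"}


def get_speedup_factors(gpu_type):
    """Return factors for gpu_type, borrowing from a fallback type if needed."""
    key = gpu_type.lower()
    if key in SPEEDUP_FACTORS:
        return SPEEDUP_FACTORS[key]
    fallback = FALLBACK_GPU_TYPE.get(key)
    if fallback is None or fallback not in SPEEDUP_FACTORS:
        raise KeyError(f"no speedup factors available for GPU type {gpu_type!r}")
    return SPEEDUP_FACTORS[fallback]
```

Keeping the borrowing rule in a separate mapping makes the temporary behavior easy to delete once the speedup factors themselves are retired.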

tgravescs commented 4 months ago

Looking at the test failures.