NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

Include entry points for internal usage #1075

Closed parthosa closed 4 weeks ago

parthosa commented 4 weeks ago

This PR adds the remaining add-ons as part of the QualX migration.

Changes

  1. Add an internal CLI for evaluate and compare.
  2. Add a wrapper around predict for use by the internal CLI.
  3. Remove the platform argument from the external train command.
  4. Make minor changes to the docs.
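
For reviewers who want a picture of how such internal entry points could be wired, here is a minimal sketch. It is not the code in this PR: it assumes a plain argparse dispatcher, and the function names build_parser and dispatch are hypothetical. The train and predict flags mirror the internal commands listed under Test; the flags for evaluate and compare are not shown in this description, so the single --output_dir placeholder on them is an assumption, as is the default for --n_trials.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one sub-command per internal entry point (illustrative only)."""
    parser = argparse.ArgumentParser(prog="qualx_main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    # Flags mirror the internal train command shown under "Test".
    train = sub.add_parser("train")
    train.add_argument("--dataset", required=True)
    train.add_argument("--model", required=True)
    train.add_argument("--output_dir", required=True)
    train.add_argument("--n_trials", type=int, default=20)  # default value is an assumption

    # Flags mirror the internal predict command shown under "Test".
    predict = sub.add_parser("predict")
    predict.add_argument("--platform", required=True)
    predict.add_argument("--profile", required=True)
    predict.add_argument("--output_dir", required=True)

    # evaluate/compare: their real flags are not listed in this PR description,
    # so only a hypothetical placeholder flag is registered here.
    for name in ("evaluate", "compare"):
        sub.add_parser(name).add_argument("--output_dir")

    return parser


def dispatch(argv):
    """Parse argv and return (sub-command name, parsed options as a dict)."""
    args = build_parser().parse_args(argv)
    return args.command, vars(args)
```

A real entry point would map each sub-command name to the corresponding function instead of returning the parsed options.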

Test

The following commands have been tested:

Public Interface:

spark_rapids qualification --platform <platform> --eventlogs </path/to/logs> --estimation_model xgboost --tools_jar $SPARK_RAPIDS_TOOLS_JAR --verbose
spark_rapids prediction --qual_output </path/to/qual_2024xxx> --prof_output </path/to/qual_2024xxx> --output_folder qualx_runs
spark_rapids train --dataset </path/dataset.json> --model my_model.json --output_folder qualx_runs --n_trials 20

Internal Usage:

python qualx_main.py train --dataset </path/dataset.json> --model my_model.json --output_dir qualx_runs --n_trials 20
python qualx_main.py predict --platform onprem --profile </path/to/qual_2024xxx> --output_dir qualx_runs