This PR adds the initial CLI changes needed to support distributed execution in the Qualification Tool. It adds arguments to enable distributed mode and sets the stage for future implementation PRs.
Note:
An environment setup document will be shared internally.
Changes Overview
Extended RapidsJob: Introduced two subclasses, RapidsDistributedJob and RapidsLocalJob, and a concrete distributed job class for the OnPrem platform.
Created a JarCmdArgs class to encapsulate all arguments needed to construct the JAR command.
Implemented the DistributedToolsConfig class, allowing configurations for distributed tools (like Spark properties) to be specified via the existing --tools_config_file option.
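For reviewers who want a quick mental model of how these pieces relate, here is a minimal sketch. It only borrows the class names from this PR. The fields, the `to_cmd` and `run_spec` methods, and the `com.example.ToolMain` main class are assumptions made for illustration and are not the actual implementation.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class JarCmdArgs:
    """Groups the pieces of the JAR command (field names are hypothetical)."""
    jvm_args: List[str]
    classpath: str
    main_class: str
    tool_args: List[str]

    def to_cmd(self) -> List[str]:
        # Standard `java -cp <classpath> <main class> <args>` layout.
        return ["java", *self.jvm_args, "-cp", self.classpath,
                self.main_class, *self.tool_args]


@dataclass
class DistributedToolsConfig:
    """Holds distributed settings such as Spark properties (shape is assumed)."""
    spark_properties: Dict[str, str] = field(default_factory=dict)


class RapidsJob(ABC):
    def __init__(self, cmd_args: JarCmdArgs):
        self.cmd_args = cmd_args

    @abstractmethod
    def run_spec(self) -> Dict[str, object]:
        """Describe what would be executed (hypothetical method)."""


class RapidsLocalJob(RapidsJob):
    def run_spec(self) -> Dict[str, object]:
        return {"mode": "local", "cmd": self.cmd_args.to_cmd()}


class RapidsDistributedJob(RapidsJob):
    def __init__(self, cmd_args: JarCmdArgs, config: DistributedToolsConfig):
        super().__init__(cmd_args)
        self.config = config

    def run_spec(self) -> Dict[str, object]:
        # Assumption: the distributed job carries the Spark properties
        # alongside the same JAR command arguments.
        return {
            "mode": "distributed",
            "cmd": self.cmd_args.to_cmd(),
            "spark_properties": dict(self.config.spark_properties),
        }


class OnPremDistributedRapidsJob(RapidsDistributedJob):
    def run_spec(self) -> Dict[str, object]:
        spec = super().run_spec()
        spec["platform"] = "onprem"
        return spec


args = JarCmdArgs(
    jvm_args=["-Xmx4g"],
    classpath="tools.jar",
    main_class="com.example.ToolMain",  # placeholder, not the real main class
    tool_args=["--output-directory", "/tmp/out"],
)
local_spec = RapidsLocalJob(args).run_spec()
dist_spec = OnPremDistributedRapidsJob(
    args, DistributedToolsConfig({"spark.executor.memory": "4g"})
).run_spec()
```

Keeping the command arguments in one object means the local and distributed jobs can share the same inputs and differ only in how they are launched.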
Fixes #1430.
CMD:
Sample Config File:
Details:
user_tools/src/spark_rapids_pytools/cloud_api/onprem.py: Added a new class OnPremDistributedRapidsJob and a method create_distributed_submission_job to support distributed RAPIDS jobs.
user_tools/src/spark_rapids_pytools/rapids/rapids_job.py: Introduced the RapidsDistributedJob class and updated methods to handle distributed tool configurations.
user_tools/src/spark_rapids_pytools/rapids/rapids_tool.py: Added methods to get distributed tools configurations and submit distributed jobs.
Enhancements to argument processing:
user_tools/src/spark_rapids_pytools/rapids/qualification.py: Added methods to process distributed tools arguments.
user_tools/src/spark_rapids_tools/cmdli/argprocessor.py: Updated QualifyUserArgModel and build_tools_args to include distributed_tools_enabled.
Platform class updates:
user_tools/src/spark_rapids_pytools/cloud_api/databricks_aws.py, databricks_azure.py, dataproc.py, dataproc_gke.py, emr.py: Disabled pylint warnings for abstract methods.
Other improvements:
user_tools/src/spark_rapids_pytools/rapids/qualification.py: Added a check to ensure the DataFrame is not empty before accessing it.
user_tools/src/spark_rapids_tools/cmdli/tools_cli.py: Added a new parameter distributed to the qualification function.
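
To illustrate how the new distributed parameter could travel from the CLI entry point into the tool arguments, here is a rough sketch. `QualifyArgsSketch` and `select_job_kind` are hypothetical stand-ins, and the argument names other than `distributed` and `distributed_tools_enabled` are assumptions. The real QualifyUserArgModel and qualification function take more inputs and may be structured differently.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class QualifyArgsSketch:
    """Hypothetical stand-in for the user-argument model."""
    eventlogs: str
    tools_config_file: Optional[str] = None
    distributed_tools_enabled: bool = False

    def build_tools_args(self) -> Dict[str, Any]:
        # Expose the flag next to the other tool arguments so that the
        # downstream tool class can read it.
        return {
            "eventlogs": self.eventlogs,
            "tools_config_file": self.tools_config_file,
            "distributed_tools_enabled": self.distributed_tools_enabled,
        }


def qualification(eventlogs: str,
                  tools_config_file: Optional[str] = None,
                  distributed: bool = False) -> Dict[str, Any]:
    """Sketch of the CLI entry point: maps `distributed` onto the model."""
    model = QualifyArgsSketch(
        eventlogs=eventlogs,
        tools_config_file=tools_config_file,
        distributed_tools_enabled=bool(distributed),
    )
    return model.build_tools_args()


def select_job_kind(tools_args: Dict[str, Any]) -> str:
    """Hypothetical helper: choose which job type to submit."""
    return "distributed" if tools_args.get("distributed_tools_enabled") else "local"


default_args = qualification("/path/to/eventlogs")
distributed_args = qualification(
    "/path/to/eventlogs",
    tools_config_file="config.yaml",
    distributed=True,
)
```

With this shape, the default behavior stays local and distributed mode is strictly opt-in.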