Add shap command to internal CLI for debugging

This PR adds a shap command to the internal CLI to help explain a specific (per-sql) XGBoost prediction.

Usage:

python qualx_main.py shap --help

Example:

python qualx_main.py shap --platform $PLATFORM \
--prediction_output /path/to/prediction/output \
--index 0
# --model $MODEL   # optional

# --index should be a numeric zero-based index pointing to a specific line (i.e. sqlID) in the `shap_values.csv` file.
# Each line in this file corresponds to the same line (sqlID) in the `per_sql.csv` file.

The output of the command looks like:

+-----+-------------------------------------------------+--------------+--------------+--------------------+--------------+-------------+-------------+-------------+-------------+-------------+-------------+-----------------+----------------+
|     | feature                                         |   shap_value |   model_rank |   model_shap_value |   train_mean |   train_std |   train_min |   train_25% |   train_50% |   train_75% |   train_max |   feature_value | out_of_range   |
|-----+-------------------------------------------------+--------------+--------------+--------------------+--------------+-------------+-------------+-------------+-------------+-------------+-------------+-----------------+----------------|
|   0 | executorCPUTime_mean                            |      -0.1192 |            0 |             0.1927 |      1.8e+03 |     5.1e+03 |     7.0e+01 |     2.6e+02 |     6.0e+02 |     1.2e+03 |     6.1e+04 |         4.9e+02 | False          |
|   1 | sw_bytesWrittenRatio                            |       0.0478 |            7 |             0.0220 |      9.6e-01 |     1.3e+00 |     1.2e-06 |     6.4e-03 |     3.2e-01 |     1.8e+00 |     1.2e+01 |         2.4e+00 | False          |
|   2 | executorDeserializeCPUTime_mean                 |      -0.0475 |            5 |             0.0237 |      6.7e+00 |     3.1e+00 |     2.1e+00 |     5.6e+00 |     6.2e+00 |     7.3e+00 |     3.7e+01 |         3.9e+01 | True           |
|   3 | sw_recordsWritten_sum                           |      -0.0339 |            1 |             0.0711 |      1.5e+09 |     3.8e+09 |     2.6e+02 |     1.7e+06 |     7.0e+07 |     8.7e+08 |     2.4e+10 |         1.2e+08 | False          |
...
| 106 | sqlOp_CommandResult                             |       0.0000 |          106 |             0.0000 |      0.0e+00 |     0.0e+00 |     0.0e+00 |     0.0e+00 |     0.0e+00 |     0.0e+00 |     0.0e+00 |         0.0e+00 | False          |
| 107 | sqlOp_WindowSort                                |       0.0000 |          107 |             0.0000 |      0.0e+00 |     0.0e+00 |     0.0e+00 |     0.0e+00 |     0.0e+00 |     0.0e+00 |     0.0e+00 |         0.0e+00 | False          |
+-----+-------------------------------------------------+--------------+--------------+--------------------+--------------+-------------+-------------+-------------+-------------+-------------+-------------+-----------------+----------------+
Shap base value: 0.4152
Shap values sum: -0.0120
Shap prediction: 0.4032
exp(prediction): 1.4965

Where:

The features are listed in order of importance (absolute value of shap_value, largest first), similar to a SHAP waterfall plot.
model_rank shows the feature importance rank on the training set.
model_shap_value shows the feature shap_value on the training set.
train_[mean|std|min|max] show the mean, standard deviation, min and max values of the feature in the training set.
train_[25%|50%|75%] show the feature value at the respective percentile in the training set.
feature_value shows the value of the feature used in prediction (for the indexed row/sqlID).
out_of_range indicates if the feature_value used in prediction was outside of the range of values seen in the training set.
Shap base value is the model's average prediction across the entire training set.
Shap values sum is the sum of the shap_value column for this indexed instance.
Shap prediction is the sum of Shap base value and Shap values sum, representing the model's predicted value.
exp(prediction) is the exponential of Shap prediction, which represents the predicted speedup (since the XGBoost model currently predicts log(speedup)).
The predicted speedup (which should match y_pred in per_sql.csv) is applied to the "supported" durations and combined with the "unsupported" durations to produce a final per-sql speedup (speedup_pred in per_sql.csv).
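
As a rough illustration of the definitions above (not the code in this PR), the sketch below re-derives the ordering, the out_of_range flag, and the summary lines from a few values copied out of the sample output. The function names are hypothetical. The real command sums all 108 shap values, whereas this sketch reuses the reported sum directly. Because the printed inputs are rounded to four decimals, the recomputed exp(prediction) can differ from the tool's output in the last digit.

```python
import math

# Three rows copied from the sample output (only the columns needed here).
rows = [
    {"feature": "sw_bytesWrittenRatio", "shap_value": 0.0478,
     "train_min": 1.2e-06, "train_max": 1.2e+01, "feature_value": 2.4e+00},
    {"feature": "executorCPUTime_mean", "shap_value": -0.1192,
     "train_min": 7.0e+01, "train_max": 6.1e+04, "feature_value": 4.9e+02},
    {"feature": "executorDeserializeCPUTime_mean", "shap_value": -0.0475,
     "train_min": 2.1e+00, "train_max": 3.7e+01, "feature_value": 3.9e+01},
]


def rank_by_importance(rows):
    """Order features by absolute shap_value, largest first."""
    return sorted(rows, key=lambda r: abs(r["shap_value"]), reverse=True)


def is_out_of_range(row):
    """True when feature_value lies outside [train_min, train_max]."""
    return not (row["train_min"] <= row["feature_value"] <= row["train_max"])


def shap_prediction(base_value, shap_values_sum):
    """Model output in log space: base value plus the per-feature contributions."""
    return base_value + shap_values_sum


ranked = rank_by_importance(rows)
flags = {r["feature"]: is_out_of_range(r) for r in ranked}

# Summary values copied from the sample output.
base_value = 0.4152
shap_values_sum = -0.0120

log_speedup = shap_prediction(base_value, shap_values_sum)
speedup = math.exp(log_speedup)  # the model predicts log(speedup)

for r in ranked:
    print(f'{r["feature"]:<35} {r["shap_value"]:>8.4f} {flags[r["feature"]]}')
print(f"Shap prediction: {log_speedup:.4f}")
print(f"exp(prediction): {speedup:.4f}")
```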

Changes

Added features.csv to save the feature values used for prediction.
Moved the current shap_values.csv to feature_importance.csv (which is more descriptive of its purpose).
Used shap_values.csv to save all of the shap values per feature per instance/sqlID during prediction.
Saved a model.metrics file (for each model) during training to store the feature shap values and distribution metrics of the training set.
Renamed the model.json.cfg files to model.cfg to avoid the double-suffix.
Refactored/combined the compute_feature_importance and compute_shapley_values functions.
Updated internal predict CLI to support --qual_output argument.
Added shap command to internal CLI, which joins the prediction shap_values with the training shap_values and distribution metrics.
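
To make the join in the last item concrete, here is a minimal sketch, assuming the three inputs have already been loaded into plain dictionaries keyed by feature name. The on-disk layouts of shap_values.csv, features.csv and model.metrics are not described in this PR text, so the dictionaries and the `join_shap_rows` helper are hypothetical stand-ins, not the actual implementation.

```python
def join_shap_rows(pred_shap, feature_values, train_metrics):
    """Combine one instance's shap values with its feature values and the
    training-set metrics, one output row per feature.

    pred_shap:      {feature: shap_value} for the indexed row/sqlID
    feature_values: {feature: value used for prediction}
    train_metrics:  {feature: {"model_rank": ..., "model_shap_value": ..., ...}}
    """
    rows = []
    for feature, shap_value in pred_shap.items():
        row = {"feature": feature, "shap_value": shap_value}
        # Training-side columns; missing features simply contribute nothing.
        row.update(train_metrics.get(feature, {}))
        row["feature_value"] = feature_values.get(feature)
        rows.append(row)
    return rows


# Values copied from two rows of the sample output.
pred_shap = {"executorCPUTime_mean": -0.1192, "sw_recordsWritten_sum": -0.0339}
feature_values = {"executorCPUTime_mean": 4.9e+02, "sw_recordsWritten_sum": 1.2e+08}
train_metrics = {
    "executorCPUTime_mean": {"model_rank": 0, "model_shap_value": 0.1927},
    "sw_recordsWritten_sum": {"model_rank": 1, "model_shap_value": 0.0711},
}

joined = join_shap_rows(pred_shap, feature_values, train_metrics)
for row in joined:
    print(row)
```

Keying everything by feature name keeps the join independent of column order in the underlying files.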

Test

The following commands have been tested:

External Usage:

spark-rapids train
spark-rapids predict

Internal Usage:

python qualx_main.py preprocess
python qualx_main.py train
python qualx_main.py predict
python qualx_main.py shap

NVIDIA / spark-rapids-tools

Add shap command to internal CLI for debugging #1197
