Open SamyakHM opened 3 years ago
I think this would be a great addition, and relatively easy to implement.
In line 558 of feature_elimination.py:
self._report_current_results(
    round_number=round_number,
    current_features_set=current_features_set,
    features_to_remove=features_to_remove,
    train_metric_mean=np.round(np.mean(scores_train), 3),
    train_metric_std=np.round(np.std(scores_train), 3),
    val_metric_mean=np.round(np.mean(scores_val), 3),
    val_metric_std=np.round(np.std(scores_val), 3),
)
one can pass shap_importance_df and store the SHAP values there as well, possibly as a dict or in some other form.
Another improvement would be to write a small function that retrieves the SHAP values from the results for a given number of features, e.g. get_reduced_feature_set_shap_values.
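A minimal sketch of that idea, assuming the per-round SHAP importance is available as a plain feature-to-value mapping (the "possibly as dict" option above). The `EliminationReport` class, its `rounds` attribute, and the `shap_importance` argument are hypothetical names used for illustration only; they are not the library's actual API, and a real change would go into the existing `_report_current_results` method instead.

```python
class EliminationReport:
    """Hypothetical container for per-round results; not the library's real class."""

    def __init__(self):
        # One entry per elimination round, in the order the rounds were run.
        self.rounds = []

    def report_current_results(self, round_number, current_features_set,
                               features_to_remove, shap_importance):
        # Store a copy of the SHAP importance so later rounds cannot mutate it.
        self.rounds.append({
            "round_number": round_number,
            "num_features": len(current_features_set),
            "features_set": list(current_features_set),
            "eliminated_features": list(features_to_remove),
            "shap_importance": dict(shap_importance),
        })

    def get_reduced_feature_set_shap_values(self, num_features):
        # Return the SHAP importance stored for the round that used
        # exactly `num_features` features.
        for entry in self.rounds:
            if entry["num_features"] == num_features:
                return entry["shap_importance"]
        raise ValueError(f"No round was run with {num_features} features.")


# Usage with made-up importance values for two rounds.
report = EliminationReport()
report.report_current_results(
    1, ["f1", "f2", "f3"], ["f3"], {"f1": 0.5, "f2": 0.3, "f3": 0.1}
)
report.report_current_results(
    2, ["f1", "f2"], ["f2"], {"f1": 0.6, "f2": 0.4}
)
two_feature_shap = report.get_reduced_feature_set_shap_values(2)
```

Keying the lookup on the number of features mirrors the suggested function name; if two rounds could ever share a feature count, looking up by round number would be the safer choice.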
Would anyone like to implement this?
Problem Description Currently, it is not possible to store or retrieve the SHAP values of individual features before they are eliminated to produce a reduced feature set. This limits the analysis of SHAP values across multiple runs.
Desired Outcome The SHAP values computed after every run should be available as a DataFrame for us to analyze and manipulate. This would give an overview of how SHAP values stack up for different feature groups without automatic elimination.
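To illustrate the desired outcome, here is a rough sketch that flattens per-round SHAP importance into long-format records, one row per (round, feature). The function name `shap_rounds_to_records`, the column names, and the input layout (round number mapped to a feature-to-value dict) are assumptions made for this example, not something the library provides. The resulting list of dicts can be passed to `pandas.DataFrame(records)` to get the DataFrame asked for here.

```python
def shap_rounds_to_records(shap_by_round):
    """Flatten {round_number: {feature: importance}} into a list of row dicts."""
    records = []
    for round_number in sorted(shap_by_round):
        importance = shap_by_round[round_number]
        # Rank features within the round by descending importance.
        ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
        for rank, (feature, value) in enumerate(ranked, start=1):
            records.append({
                "round_number": round_number,
                "num_features": len(importance),
                "feature": feature,
                "mean_abs_shap": value,
                "rank": rank,
            })
    return records


# Made-up importance values for two rounds.
shap_by_round = {
    1: {"f1": 0.5, "f2": 0.3, "f3": 0.1},
    2: {"f1": 0.6, "f2": 0.4},
}
records = shap_rounds_to_records(shap_by_round)
```

A long format like this makes it straightforward to filter by feature and follow its importance and rank from round to round.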
Solution Outline No particular requirement.