Hello,
After completing hyperparameter optimisation, I extract the best_estimator_, which looks like this:
Note that the final predictor is an XGBClassifier wrapped in CalibratedClassifierCV, so that I get calibrated rather than raw probabilities as output. This entire pipeline is my model.
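A pipeline of this shape could be sketched roughly as follows. This is only an illustration under stated assumptions: the step names, the StandardScaler step, the synthetic data, and the use of scikit-learn's GradientBoostingClassifier as a stand-in for XGBClassifier are all hypothetical and are not the actual configuration.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used only so the sketch runs end to end.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Stand-in tree model (assumption); the post uses XGBClassifier in this position.
tree_model = GradientBoostingClassifier(n_estimators=20, random_state=0)

pipeline = Pipeline(
    steps=[
        ("scaler", StandardScaler()),  # hypothetical preprocessing step
        ("clf", CalibratedClassifierCV(tree_model, cv=3)),  # calibration wrapper
    ]
)
pipeline.fit(X, y)

# Calibrated class probabilities for the first five rows.
calibrated_proba = pipeline.predict_proba(X[:5])

# The final step of the pipeline is the wrapper, not the tree model itself.
final_step = pipeline.steps[-1][1]
print(type(final_step).__name__)
```

The point of the sketch is that the last pipeline step is the CalibratedClassifierCV wrapper, so anything that extracts "the final model" from the pipeline receives the wrapper rather than the tree model inside it.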
However, passing this pipeline to ClassifierExplainer as follows:
exp = ClassifierExplainer(best_est, X_test, y_test, shap='tree')
results in the error in the title, with the log below:
splitting pipeline...
Failed to retrieve new column names from transformer_pipeline.get_feature_names_out()!
Pipeline does not have a functioning .get_feature_names_out() method, nor do all pipeline steps return the same number of columns as input, so assigning columns names 'col1', 'col2', etc instead!
Detected sklearn/imblearn Pipeline and succesfully extracted final output dataframe with column names and final model...
Note: model_output=='probability', so assuming that raw shap output of CalibratedClassifierCV is in probability space...
Generating self.shap_explainer = shap.TreeExplainer(model)
File ~\Miniconda3\envs\skl_310\lib\site-packages\explainerdashboard\explainers.py:2506, in ClassifierExplainer.__init__(self, model, X, y, permutation_metric, shap, X_background, model_output, cats, cats_notencoded, idxs, index_name, target, descriptions, n_jobs, permutation_cv, cv, na_fill, precision, shap_kwargs, labels, pos_label)
2500 print(
2501 f"model_output=='probability' does not work with multiclass "
2502 "XGBClassifier models, so settings model_output='logodds'..."
2503 )
2504 self.model_output = "logodds"
-> 2506 _ = self.shap_explainer
File ~\Miniconda3\envs\skl_310\lib\site-packages\explainerdashboard\explainers.py:2686, in ClassifierExplainer.shap_explainer(self)
2680 print(
2681 f"Note: model_output=='probability', so assuming that raw shap output of {model_str} is in probability space..."
2682 )
2683 print(
2684 f"Generating self.shap_explainer = shap.TreeExplainer(model{', X_background' if self.X_background is not None else ''})"
2685 )
-> 2686 self._shap_explainer = shap.TreeExplainer(
2687 self.model, self.X_background
2688 )
2690 elif self.shap == "linear":
2691 if self.model_output == "probability":
File ~\Miniconda3\envs\skl_310\lib\site-packages\shap\explainers\_tree.py:149, in Tree.__init__(self, model, data, model_output, feature_perturbation, feature_names, approximate, **deprecated_options)
147 self.feature_perturbation = feature_perturbation
148 self.expected_value = None
--> 149 self.model = TreeEnsemble(model, self.data, self.data_missing, model_output)
150 self.model_output = model_output
151 #self.model_output = self.model.model_output # this allows the TreeEnsemble to translate model outputs types by how it loads the model
File ~\Miniconda3\envs\skl_310\lib\site-packages\shap\explainers\_tree.py:993, in TreeEnsemble.__init__(self, model, data, data_missing, model_output)
991 self.base_offset = model.init_params[param_idx]
992 else:
--> 993 raise InvalidModelError("Model type not yet supported by TreeExplainer: " + str(type(model)))
995 # build a dense numpy version of all the tree objects
996 if self.trees is not None and self.trees:
InvalidModelError: Model type not yet supported by TreeExplainer: <class 'sklearn.calibration.CalibratedClassifierCV'>
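The last line of the traceback shows that shap.TreeExplainer is handed the CalibratedClassifierCV wrapper itself rather than a tree model. The following is a minimal sketch of where the fitted tree models sit inside such a wrapper. It assumes scikit-learn only, uses GradientBoostingClassifier as a stand-in for XGBClassifier, and relies on the `calibrated_classifiers_` attribute, whose entries expose the fitted model as `estimator` in recent scikit-learn releases and as `base_estimator` in older ones. Whether explaining one of these inner models is an acceptable workaround is unclear, since its SHAP values would describe the uncalibrated output.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic data, used only so the sketch runs end to end.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Stand-in tree model (assumption); the post wraps XGBClassifier instead.
calibrated = CalibratedClassifierCV(
    GradientBoostingClassifier(n_estimators=20, random_state=0), cv=3
)
calibrated.fit(X, y)

# With cv=3 the wrapper fits one (tree model, calibrator) pair per fold.
inner_models = []
for fold_clf in calibrated.calibrated_classifiers_:
    # Attribute name depends on the scikit-learn version.
    if hasattr(fold_clf, "estimator"):
        inner_models.append(fold_clf.estimator)
    else:
        inner_models.append(fold_clf.base_estimator)

print(type(calibrated).__name__)                 # what TreeExplainer receives
print([type(m).__name__ for m in inner_models])  # the tree models inside it
```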
Is there a way around this issue?
Any help will be welcome.
Regards, Narayan