oegedijk / explainerdashboard

Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models.
http://explainerdashboard.readthedocs.io
MIT License

InvalidModelError: Model type not yet supported by TreeExplainer: <class 'sklearn.calibration.CalibratedClassifierCV'> #257

Open RNarayan73 opened 1 year ago

RNarayan73 commented 1 year ago

Hello, after completing hyperparameter (HP) optimisation, I extract the best_estimator_, which looks like this: [image]

Note that the final predictor is an XGBClassifier wrapped in CalibratedClassifierCV, so that I get calibrated rather than raw probabilities as output. This entire pipeline is my model.
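A rough sketch of the kind of pipeline described above, for readers who want to reproduce the setup. The data is synthetic, the preprocessing step is hypothetical, and GradientBoostingClassifier stands in for XGBClassifier so that only scikit-learn is needed; none of these details come from the original post.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier  # stand-in for XGBClassifier (assumption)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary data; the real X and y from the post are not available.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),  # hypothetical preprocessing step
    ("clf", CalibratedClassifierCV(
        GradientBoostingClassifier(n_estimators=20, random_state=0),
        method="sigmoid",
        cv=3,
    )),
])
pipe.fit(X, y)

# The last pipeline step is the calibration wrapper, not the tree model itself.
final_model = pipe.steps[-1][1]
final_model_name = type(final_model).__name__

# With cv=3 the wrapper holds one fitted (estimator, calibrator) pair per fold.
n_calibrated = len(final_model.calibrated_classifiers_)

proba = pipe.predict_proba(X[:3])
print(final_model_name, n_calibrated, proba.shape)
```

The point of the sketch is that the object at the end of the pipeline is a CalibratedClassifierCV, and the tree models only appear inside its `calibrated_classifiers_` list.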

However, passing this pipeline to ClassifierExplainer as follows:

exp = ClassifierExplainer(best_est, X_test, y_test, shap='tree')

results in the error in the title, with the log below:

splitting pipeline...
Failed to retrieve new column names from transformer_pipeline.get_feature_names_out()!
Pipeline does not have a functioning .get_feature_names_out() method, nor do all pipeline steps return the same number of columns as input, so assigning columns names 'col1', 'col2', etc instead!
Detected sklearn/imblearn Pipeline and succesfully extracted final output dataframe with column names and final model...
Note: model_output=='probability', so assuming that raw shap output of CalibratedClassifierCV is in probability space...
Generating self.shap_explainer = shap.TreeExplainer(model)


InvalidModelError                         Traceback (most recent call last)
Cell In[125], line 1
----> 1 exp = ClassifierExplainer(best_est, X_test, y_test, #cv=CV_HPT,
      2                           #X_background=X_train,
      3
      4                           #target=LABEL, labels=['loss', 'win'],
      5                           #cats=COLS['cat'] if GB_CAT==False else None,
      6                           #cats={x[0]: x[1] for x in zip(COLS['cat'], OneHotEncoder(sparse_output=False).fit(X[COLS['cat']]).categories_)} if GB_CAT==False else None,
      7
      8                           shap='tree', #['guess', 'tree', 'linear', 'kernel', 'deep']
      9                           #model_output='logodds', #['probability', 'logodds']
     10                           #precision='float32', #'float64', #
     11                          )
     12 exp

File ~\Miniconda3\envs\skl_310\lib\site-packages\explainerdashboard\explainers.py:2506, in ClassifierExplainer.__init__(self, model, X, y, permutation_metric, shap, X_background, model_output, cats, cats_notencoded, idxs, index_name, target, descriptions, n_jobs, permutation_cv, cv, na_fill, precision, shap_kwargs, labels, pos_label)
   2500 print(
   2501     f"model_output=='probability' does not work with multiclass "
   2502     "XGBClassifier models, so settings model_output='logodds'..."
   2503 )
   2504 self.model_output = "logodds"
-> 2506 _ = self.shap_explainer

File ~\Miniconda3\envs\skl_310\lib\site-packages\explainerdashboard\explainers.py:2686, in ClassifierExplainer.shap_explainer(self)
   2680 print(
   2681     f"Note: model_output=='probability', so assuming that raw shap output of {model_str} is in probability space..."
   2682 )
   2683 print(
   2684     f"Generating self.shap_explainer = shap.TreeExplainer(model{', X_background' if self.X_background is not None else ''})"
   2685 )
-> 2686 self._shap_explainer = shap.TreeExplainer(
   2687     self.model, self.X_background
   2688 )
   2690 elif self.shap == "linear":
   2691     if self.model_output == "probability":

File ~\Miniconda3\envs\skl_310\lib\site-packages\shap\explainers\_tree.py:149, in Tree.__init__(self, model, data, model_output, feature_perturbation, feature_names, approximate, **deprecated_options)
    147 self.feature_perturbation = feature_perturbation
    148 self.expected_value = None
--> 149 self.model = TreeEnsemble(model, self.data, self.data_missing, model_output)
    150 self.model_output = model_output
    151 #self.model_output = self.model.model_output # this allows the TreeEnsemble to translate model outputs types by how it loads the model

File ~\Miniconda3\envs\skl_310\lib\site-packages\shap\explainers\_tree.py:993, in TreeEnsemble.__init__(self, model, data, data_missing, model_output)
    991     self.base_offset = model.init_params[param_idx]
    992 else:
--> 993     raise InvalidModelError("Model type not yet supported by TreeExplainer: " + str(type(model)))
    995 # build a dense numpy version of all the tree objects
    996 if self.trees is not None and self.trees:

InvalidModelError: Model type not yet supported by TreeExplainer: <class 'sklearn.calibration.CalibratedClassifierCV'>

Is there a way around this issue?

Any help will be welcome.

Regards, Narayan

gsahasrabudhe-rsi commented 3 months ago

Hello! I was wondering if you remember how you solved this issue? Been stuck on it for a few days now.