marcotcr / lime

Lime: Explaining the predictions of any machine learning classifier
BSD 2-Clause "Simplified" License

Inverse scaling/normalization to get actual unscaled values in explanation: old issue, but I found a workaround #750

Open ShahrinNakkhatra-optimizely opened 4 months ago

ShahrinNakkhatra-optimizely commented 4 months ago

I'm not sure this is the correct way; I searched for this a lot and didn't find an answer. This is how I'm doing it (code below).

The explainer is built on the unscaled data, and the function passed as predict_fn does the scaling itself before calling the model, so the explanations are expressed in your unscaled values. I have removed some lines from my code for privacy, but you get the idea.

Can someone from the dev team please respond to this so that I can be sure that this is the correct approach? @marcotcr

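    import logging
    import os

    import numpy as np
    import pandas as pd
    from lime import lime_tabular

    # Both functions below are methods of a class that is not shown here.
    def explain_pipe(self, selected_df=None, cols=None):
        # Used as LIME's predict_fn: receives unscaled rows (a NumPy array),
        # scales them, and returns the class probabilities.
        if selected_df is None:
            selected_df = self.selected_df

        # LIME passes only the data, so the column names come from self.cols.
        if cols is None:
            cols = self.cols

        temp_df = pd.DataFrame(selected_df, columns=cols)
        selected_df_ = temp_df.copy()
        # DataProcessingPrediction is my own preprocessing helper (not shown).
        dp = DataProcessingPrediction(selected_df_, self.local_directory, self.product)

        # Apply the scalers that were fitted at training time.
        selected_df_ = dp.scale_df(
            scaler_path=os.path.join(self.local_directory, "scaler_objects.pkl"),
            col_names_path=os.path.join(self.local_directory, "scaled_col_names.pkl"),
        )

        selected_df_ = dp.clean_column_names()

        # Put the columns in the order the model was trained on.
        selected_df_ = dp.load_and_reorder(os.path.join(self.local_directory, "column_order.pkl"))

        # Return every class column; LIME needs the full probability array, not just [:, 1].
        output = self.model.predict_proba(selected_df_)
        return output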

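    def explain_row(self, X_train, X_pred, row_number: int):
        # X_train and X_pred are unscaled DataFrames, so the feature ranges
        # in the explanation are reported in the original units.
        lime_explainer = lime_tabular.LimeTabularExplainer(
            training_data=np.array(X_train),
            training_labels=self.training_labels,
            feature_names=list(X_train.columns),
            class_names=[.....],
            mode="classification",
        )

        # explain_instance expects a 1-D NumPy array rather than a pandas Series.
        instance = X_pred.iloc[row_number].to_numpy()
        lime_exp = lime_explainer.explain_instance(data_row=instance, predict_fn=self.explain_pipe)
        logging.info(lime_exp.as_list())
        return lime_exp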