catboost / catboost

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
https://catboost.ai
Apache License 2.0

Can't interpret SHAP values from custom classifier class object #2032

Closed: Alanapiereid closed this issue 2 years ago

Alanapiereid commented 2 years ago

Problem: SHAP values not interpretable
catboost version: 1.0.4
Operating System: Linux
CPU: 4 vCPUs
GPU: None

I have built a sentiment analysis classifier using a custom class that wraps CatBoostClassifier. I want to get feature (word) importance weights from my model, in a format such as

-0.098432 violent

-0.063453 terrible

I want these values for each word in a string that is unseen by the model (not training data, so no label). I want to do this on a per-text basis.

I have built a multiclass classifier (3 possible labels). Because the model is a custom class object (not an out-of-the-box CatBoostClassifier) that is saved and later reloaded, it does not expose the native attributes/methods of the CatBoostClassifier class. To that end I wrote the get_feature_importance method below for my custom CatBoostPipe class:

def get_feature_importance(self, sent):
    from catboost import Pool, EFstrType
    import pandas as pd
    # Wrap the raw text in a one-column DataFrame and a text-feature Pool
    df = pd.DataFrame(sent, columns=['Content'])
    return self.model.get_feature_importance(
        data=Pool(data=df[['Content']], text_features=['Content']),
        type=EFstrType.ShapValues,
        thread_count=-1,
        prettified=False)

When constructing my model class, I structured the train/val/test sets using Pool() as below:

test = Pool(
    data=df_test[['Content']], 
    label=df_test['Labels'],
    text_features=['Content']
)
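
(For context, a minimal sketch of the layout df_test is assumed to have; the column names come from the Pool construction above, and the rows are purely illustrative:)

import pandas as pd

# Illustrative rows only; the column names match the Pool construction above
df_test = pd.DataFrame({
    'Content': ['I loved this film', 'Utterly terrible acting', 'It was fine'],
    'Labels': [2, 0, 1],  # three possible labels, as described above
})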

Here is what happens when I create an instance of my custom class and call the get_feature_importance method defined above:

cat = CatBoostPipe().load()

sent = [f'''When, in 2018, another woman said her husband was physically violent and emotionally abusive, 
        Person2344 accused her of “lying” and asked: “Does he put up with you when you’ve been a crazy ****?”''']

shap_values = cat.get_feature_importance(sent)
print(shap_values)
print(shap_values.shape)

Output:

[[[-2.01463078 -0.71423034]
  [-2.42286171 -1.44455357]
  [ 4.04736766 -1.42623342]]]
(1, 3, 2)

I get that 1 is the number of data samples and 3 the number of classes. But what are the two features for each class? Could this be a tokenization issue? I was expecting an array of weights per word, as that is what I have previously got for SHAP values from other models (i.e. an out-of-the-box CatBoostClassifier + TF-IDF vectorizer fed into an sklearn Pipeline object).

Here is the full class:

from catboost import CatBoostClassifier


class CatBoostPipe:
    def __init__(self):
        self.model = None

    def train(self, train, valid):
        # Delegate to fit_model with the text-processing configuration
        return self.fit_model(
            train, valid,
            learning_rate=0.1,
            tokenizers=[
                {
                    'tokenizer_id': 'Sense',
                    'separator_type': 'BySense',
                    'lowercasing': 'True',
                    'token_types': ['Word', 'SentenceBreak']
                }
            ],
            feature_calcers=[
                'NaiveBayes:top_tokens_count=100000'])

    def fit_model(self, train_pool, test_pool, **kwargs):
        self.model = CatBoostClassifier(
            loss_function='MultiClassOneVsAll',
            iterations=1000,
            eval_metric='Accuracy',
            od_type='Iter',
            **kwargs
        )
        self.model.fit(
            train_pool,
            eval_set=test_pool,
            plot=True,
            use_best_model=True)
        # Persist the fitted model so it can be reloaded later
        self.model.save_model('CatModel.cbm',
                              format="cbm",
                              pool=train_pool)
        return self
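
(The load() call used earlier is not defined in this class; a plausible minimal sketch of such a method on CatBoostPipe, assuming it simply restores the model written by save_model above:)

    def load(self):
        # Hypothetical counterpart to save_model above; not shown in the issue
        self.model = CatBoostClassifier()
        self.model.load_model('CatModel.cbm', format='cbm')
        return self
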
LyzhinIvan commented 2 years ago

Hi, @Alanapiereid! The docstring of the get_feature_importance method says that for the multiclass case it returns an ndarray with shape (n_objects, classes_count, n_features + 1). The last value for each class is the bias (i.e. a feature-independent value).
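
So the shape (1, 3, 2) above decomposes as one object, three classes, and one text feature plus the bias. A minimal sketch of splitting the returned array (reusing the shap_values variable from the snippet earlier in this issue):

# shap_values: ndarray of shape (n_objects, classes_count, n_features + 1)
per_feature = shap_values[:, :, :-1]  # contribution of the single 'Content' feature
bias = shap_values[:, :, -1]          # feature-independent baseline per class

# By SHAP additivity, feature contributions plus bias give the raw per-class score
raw_score = per_feature.sum(axis=2) + bias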

Alanapiereid commented 2 years ago

Thank you, that's helpful. Do you know a way of getting importance values for each word? Again, I would usually just run an sklearn-compatible classifier (like the out-of-the-box CatBoostClassifier()) plus a vectorizer through SHAP or LIME using a Pipeline object, but this custom class seems incompatible there because of the way the data has been wrapped in the Pool() class.

LyzhinIvan commented 2 years ago

CatBoost doesn't support getting importance for each word, only for the whole text feature.
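
(A possible model-agnostic workaround, not part of the CatBoost API: perturb the text and measure the change in class probability. A rough leave-one-word-out sketch, assuming the trained model and the one-column 'Content' Pool layout used earlier in this thread:)

from catboost import Pool
import pandas as pd

def word_importances(model, text, target_class):
    def score(t):
        # Same one-column text Pool layout as used elsewhere in this issue
        pool = Pool(data=pd.DataFrame([t], columns=['Content']),
                    text_features=['Content'])
        return model.predict_proba(pool)[0][target_class]

    base = score(text)
    words = text.split()
    # Importance of a word = drop in class probability when it is removed
    return [(base - score(' '.join(words[:i] + words[i + 1:])), w)
            for i, w in enumerate(words)]

This only approximates per-word influence (words interact through the tokenizer and the NaiveBayes feature calcer), but it produces (weight, word) pairs in the format requested at the top of the issue.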