marcotcr / lime

Lime: Explaining the predictions of any machine learning classifier
BSD 2-Clause "Simplified" License

Tensorflow bert and lime #718

Open GiannisHaralabopoulos opened 1 year ago

GiannisHaralabopoulos commented 1 year ago

Hello @marcotcr.

I have a pretrained Keras functional model that uses BERT embeddings. What I would like to do is visualise the effect each term in a new string has on the output of that pretrained model's model.predict.

Here is the code:

import tensorflow as tf
import pandas as pd
from lime.lime_text import LimeTextExplainer
from transformers import BertTokenizer, TFBertMainLayer
import numpy as np

model = tf.keras.models.load_model(
    "Models//ECG//55scipts_512maxlen_7epochs.h5",
    custom_objects={'TFBertMainLayer': TFBertMainLayer}
)

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

df = pd.read_csv('twtud.csv')  # a pandas df with a column of strings

def f(x):
    # Encode a single string and return [P(down), P(up)] for it.
    token = tokenizer.encode_plus(
        x,
        max_length=512,
        truncation=True,
        padding='max_length',
        return_tensors='tf'
    )
    preds = model.predict(
        [tf.cast(token.input_ids, tf.float64),
         tf.cast(token.attention_mask, tf.float64)],
        verbose=0
    )
    return np.array([[float(1 - p), float(p)] for p in preds])

X = df['text'][:10]
explainer = LimeTextExplainer(class_names=['down','up'])
exp = explainer.explain_instance('Hello General Kenobi', f, num_features=512, num_samples=1)  # only works with num_samples == 1
exp.as_list()  # no matter the input, the output is always zero: [('Hello', 0.0), ('General', 0.0), ('Kenobi', 0.0)]

I am just getting to grips with your package, so I am not sure that I am doing everything properly.

Thank you kindly.

ajyl commented 1 year ago

I think you want the num_samples parameter in explain_instance() to be something much larger; if I recall correctly, the default value is 5,000 or so.
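
For reference, a minimal sketch of a batched classifier function (untested; predict_proba is a hypothetical name, and it reuses the model and tokenizer objects loaded in the snippet above, assuming the model outputs a single probability per example). LimeTextExplainer passes a list of perturbed strings to the classifier function and expects back a (num_samples, num_classes) probability array, which may be why the original f only behaves with num_samples == 1.

import numpy as np
import tensorflow as tf

def predict_proba(texts):
    # LIME passes a list of perturbed strings; tokenize them as one batch.
    enc = tokenizer(
        list(texts),
        max_length=512,
        truncation=True,
        padding='max_length',
        return_tensors='tf'
    )
    # Mirror the float64 casts used above; `p` is assumed to hold one
    # probability per input string.
    p = model.predict(
        [tf.cast(enc['input_ids'], tf.float64),
         tf.cast(enc['attention_mask'], tf.float64)],
        verbose=0
    ).reshape(-1)
    # Two columns: P(down) and P(up), one row per perturbed string.
    return np.stack([1.0 - p, p], axis=1)

explainer = LimeTextExplainer(class_names=['down', 'up'])
exp = explainer.explain_instance(
    'Hello General Kenobi',
    predict_proba,
    num_features=10,
    num_samples=1000  # the default is 5000; a single sample cannot fit the local model
)
print(exp.as_list())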