interpretml / interpret-text

A library that incorporates state-of-the-art explainers for text-based machine learning models and visualizes the result with a built-in dashboard.
MIT License

How to go about explaining or interpreting text for chat completion models? #235

Open karrtikiyer-tw opened 4 months ago

karrtikiyer-tw commented 4 months ago

Hi community, any thoughts on how we can get some of the LIME explainers to work on the OpenAI Chat Completion models? Any advice or help is appreciated. Thanks, Karrtik

Siddharth-Latthe-07 commented 3 weeks ago

@karrtikiyer-tw Interesting topic. Here are some introductory steps that should give you an idea of how to approach this:

  1. Install the required libraries (`lime` and `openai`).
  2. Define a wrapper function: LIME requires a prediction function that returns a probability distribution over classes, but Chat Completion models return text, so you need a function that translates the text output into a format suitable for LIME.

```python
import openai

def openai_completion(prompt, model="gpt-4", max_tokens=50):
    # gpt-4 is a chat model, so use the ChatCompletion endpoint
    # (pre-1.0 openai SDK interface)
    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
        n=1,
        temperature=0.7,
    )
    return response.choices[0].message["content"].strip()
```
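One way to implement that translation (a sketch, not part of interpret-text or the OpenAI SDK) is to prompt the chat model to answer with a class label and map the free-text answer onto a smoothed probability vector. The names `label_to_proba` and `chat_predict_proba` are illustrative; `complete_fn` is any text-in/text-out function (e.g. the `openai_completion` wrapper above), which also keeps the sketch testable without an API key:

```python
import numpy as np

def label_to_proba(answer, classes=("positive", "negative"), eps=0.05):
    """Map a free-text label answer onto a smoothed probability vector."""
    answer = answer.lower()
    scores = np.full(len(classes), eps)  # small floor avoids zero probabilities
    for i, cls in enumerate(classes):
        if cls in answer:
            scores[i] = 1.0
    return scores / scores.sum()  # normalize so each row sums to 1

def chat_predict_proba(texts, complete_fn, classes=("positive", "negative")):
    """Build the (n_samples, n_classes) array that LIME's explain_instance expects."""
    prompt = ("Classify the sentiment of the following text as "
              + " or ".join(classes) + ". Answer with one word.\n\n{}")
    return np.vstack([label_to_proba(complete_fn(prompt.format(t)), classes)
                      for t in texts])
```

With the real model you would pass the wrapper, e.g. `lambda texts: chat_predict_proba(texts, openai_completion)`, as the prediction function to LIME.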

3. Create a prediction wrapper for the LIME explainer. For text models, LIME works by slightly altering the input text and observing how the resulting completions change.

```python
from lime.lime_text import LimeTextExplainer
import numpy as np

class OpenAIWrapper:
    def __init__(self, model="gpt-4"):
        self.model = model

    def predict_proba(self, texts):
        # Simulate probabilities (for demonstration).
        # In practice, replace this with real probability scoring.
        probas = []
        for text in texts:
            completion = openai_completion(text, model=self.model)
            # Dummy scoring: length-based parity (you'll need a real classifier)
            probas.append([len(completion) % 2, (len(completion) + 1) % 2])
        return np.array(probas)

# Initialize the LIME explainer and the OpenAI model wrapper
explainer = LimeTextExplainer(class_names=['positive', 'negative'])
model_wrapper = OpenAIWrapper(model="gpt-4")

# Example text input
text = "The weather today is"

# Generate and print the explanation
exp = explainer.explain_instance(text, model_wrapper.predict_proba, num_features=10)
print(exp.as_list())
```
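For intuition about what `explain_instance` does under the hood, here is a minimal, self-contained sketch of LIME's core idea: randomly mask words, score each perturbed text, and fit a weighted linear model whose coefficients rank word importance. This is simplified (real LIME uses an exponential kernel over cosine distance plus feature selection), and `lime_sketch`/`score_fn` are hypothetical names for illustration:

```python
import numpy as np

def lime_sketch(words, score_fn, n_samples=200, seed=0):
    """Toy LIME: perturb by masking words, then fit a weighted linear surrogate."""
    rng = np.random.default_rng(seed)
    # Each row is a binary mask: 1 = keep the word, 0 = drop it
    masks = rng.integers(0, 2, size=(n_samples, len(words)))
    masks[0] = 1  # keep the unperturbed text in the sample
    # Score every perturbed text with the black-box model
    ys = np.array([score_fn([w for w, m in zip(words, mask) if m])
                   for mask in masks])
    # Weight samples by proximity to the original (more words kept = closer)
    dists = 1 - masks.mean(axis=1)
    weights = np.exp(-(dists ** 2) / 0.25)
    # Weighted least squares: coefficients approximate each word's contribution
    sw = np.sqrt(weights)
    X = masks.astype(float)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], ys * sw, rcond=None)
    return dict(zip(words, coef))
```

The returned dictionary plays the role of `exp.as_list()`: words whose removal changes the score most get the largest coefficients.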


Finally, interpret the results returned by `explain_instance`.
Hope this gives you a rough idea; open to your thoughts on this.
Thanks