marcotcr / lime

Lime: Explaining the predictions of any machine learning classifier
BSD 2-Clause "Simplified" License

Repeated words in limetext #623

Closed: palatos closed this issue 3 years ago

palatos commented 3 years ago

I noticed that the lime text explanations seem to give the same weight to every occurrence of a word in a sentence. For instance, in the sentence "I'm positive the result was positive", both occurrences of "positive" get the same weight. Is this the intended behavior?

Isn't it possible that different occurrences of a word in a sentence carry different meanings depending on the context?

marcotcr commented 3 years ago

The bow argument in LimeTextExplainer controls this behavior. From the docstring:

    bow: if True (bag of words), will perturb input data by removing
        all occurrences of individual words or characters.
        Explanations will be in terms of these words. Otherwise, will
        explain in terms of word-positions, so that a word may be
        important the first time it appears and unimportant the second.
        Only set to false if the classifier uses word order in some way
        (bigrams, etc), or if you set char_level=True.
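
To make the difference concrete, here is a minimal sketch of the two perturbation modes the docstring describes, applied to the sentence from the question. It is not lime's implementation: the helpers `features` and `remove_feature` are hypothetical names introduced for illustration, and the whitespace tokenizer is a simplifying assumption.

```python
def features(text, bow=True):
    """List the units an explanation would be expressed in.

    bow=True  -> one feature per distinct word (all occurrences share it).
    bow=False -> one feature per (word, position) pair.
    Hypothetical helper; tokenization by whitespace is an assumption.
    """
    tokens = text.split()
    if bow:
        distinct = []
        for token in tokens:
            if token not in distinct:
                distinct.append(token)
        return distinct
    return [(token, i) for i, token in enumerate(tokens)]


def remove_feature(text, feature, bow=True):
    """Return the perturbed text with one feature switched off."""
    tokens = text.split()
    if bow:
        # Bag of words: drop every occurrence of the word at once.
        kept = [t for t in tokens if t != feature]
    else:
        # Word positions: drop only the occurrence at that index.
        _, position = feature
        kept = [t for i, t in enumerate(tokens) if i != position]
    return " ".join(kept)


sentence = "I'm positive the result was positive"

# With bow=True there is a single "positive" feature, so it can only
# receive a single weight.
print(features(sentence, bow=True))
print(remove_feature(sentence, "positive", bow=True))

# With bow=False each occurrence is its own feature and can be
# perturbed (and therefore weighted) independently.
print(features(sentence, bow=False))
print(remove_feature(sentence, ("positive", 1), bow=False))
print(remove_feature(sentence, ("positive", 5), bow=False))
```

In lime itself, the switch is the `bow` argument quoted above, so constructing the explainer as `LimeTextExplainer(bow=False)` should yield separate weights for the two occurrences of "positive".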