marcotcr / lime

Lime: Explaining the predictions of any machine learning classifier
BSD 2-Clause "Simplified" License

Why lime text explainer does not return Class Label? #510

Closed SharathChandra-AV closed 3 years ago

SharathChandra-AV commented 3 years ago

Hello Marco, I saw your article about LIME. It's fantastic... I trained a few ML.NET classifiers on text data. However, when I try to use LimeTextExplainer, it always throws some linkage-related errors:

explainer = lt.LimeTextExplainer(class_names=set(df['Label']))

Can you please let me know whether ML.NET classifiers are compatible with the LIME explainers? Thanks, Sharath
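A side note on the `class_names` argument in the snippet above: LIME's documentation describes it as a list of class names, ordered the way the classifier orders its probability columns, so an unordered `set` is a likely source of trouble. Below is a minimal sketch, assuming the classifier's columns follow sorted label order (scikit-learn's `classes_` does; whether NimbusML does is an assumption to verify). `ordered_class_names` is a hypothetical helper, and the `labels` list stands in for `df['Label']`.

```python
def ordered_class_names(labels):
    """Return the distinct labels as a sorted list of strings.

    Assumption: the classifier emits probability columns in sorted label
    order. If it does not, reorder this list to match the model.
    """
    return [str(label) for label in sorted(set(labels))]


# Stand-in for df['Label'] (hypothetical data).
labels = ["Commenting", "Non-Commenting", "Commenting", "Non-Commenting"]
class_names = ordered_class_names(labels)
print(class_names)
```

The resulting list can then be passed as `class_names=class_names` when constructing `LimeTextExplainer`.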

marcotcr commented 3 years ago

I know nothing about ML.NET, sorry. You should be able to run LIME on regular python though...

SharathChandra-AV commented 3 years ago

Oh! My bad. I didn't provide full details. I used NimbusML, a Python module that is a wrapper on top of ML.NET classifiers. Below is some sample code:

pipe = Npipe([
    NGramFeaturizer(word_feature_extractor=Ngram(weighting = 'Tf')),
    PcaTransformer(rank = 100),
    FastTreesBinaryClassifier(number_of_trees=200,), # nimbusml learner
])
X = df[textcol]
Y = df[targetcol]
pipe.fit(X, Y)

Once this classifier is trained, I use its predicted probabilities to get a local explanation for a sample text. That is where LIME fails to extract the explanation and throws errors. Printing the classifier's probabilities on the test set works correctly, though. So I just want to check whether LIME can work with any kind of classifier, since by definition it is MODEL AGNOSTIC. Please let me know...

marcotcr commented 3 years ago

It should work on any classifier. What error do you get? And what output do you get if you run the following?

pipe(['John is a man', 'John is not a man'])

SharathChandra-AV commented 3 years ago

I think it failed for both a single string and a list of strings. Below are the last lines of the errors I got when I ran the code:

For a sample document:

sample = 'John is a man'
explainer = lt.LimeTextExplainer()
Obj_le = explainer.explain_instance(doc, clf.predict_proba)
features = Obj_le.as_list()

BridgeRuntimeError: Error: *** System.InvalidOperationException: 'Incompatible features column type: 'Single' vs 'Vector<Single, 100>''

For a list of sample documents:

samples = ['John is a man','John is a simple man']
explainer = lt.LimeTextExplainer()
Obj_le = explainer.explain_instance(samples, pipe_mlnet.predict_proba)
features = Obj_le.as_list()

TypeError: expected string or bytes-like object
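For context on the two calls above: `explain_instance` takes one raw string, and LIME then calls the prediction function itself with a list of perturbed strings. If the model wants a different input container, or returns a single score per text, a thin adapter can sit in between. The sketch below is one way to do that under stated assumptions: `make_lime_classifier_fn`, `to_model_input`, and `fake_positive_score` are hypothetical names, the fake model stands in for the real pipeline, and treating a 1-D output as P(positive class) is an assumption that may not hold for NimbusML.

```python
import numpy as np


def make_lime_classifier_fn(predict_proba, to_model_input=list):
    """Wrap predict_proba so it accepts a list of strings and returns (d, k)."""

    def classifier_fn(texts):
        raw = predict_proba(to_model_input(texts))
        probs = np.asarray(raw, dtype=float)
        if probs.ndim == 1:
            # Assumption: a 1-D output is the positive-class probability.
            probs = np.column_stack([1.0 - probs, probs])
        return probs

    return classifier_fn


def fake_positive_score(batch):
    # Hypothetical stand-in model: one score per input text.
    return [0.9 if "simple" in text else 0.4 for text in batch]


classifier_fn = make_lime_classifier_fn(fake_positive_score)
out = classifier_fn(['John is a man', 'John is a simple man'])
print(out.shape)
```

If the model expects a pandas object, `to_model_input` could be something like `lambda texts: pd.DataFrame({textcol: list(texts)})`; whether that resolves the column-type error above is untested here. The explainer would then be called with a single string, for example `explainer.explain_instance(sample, classifier_fn)`.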

marcotcr commented 3 years ago

What output do you get if you run the following?

pipe(['John is a man', 'John is not a man'])
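The check requested above can be made a bit more explicit. LIME's documentation describes `classifier_fn` as a function that takes a list of d strings and returns a (d, k) array of prediction probabilities, where k is the number of classes. A minimal sketch of a sanity check for that contract follows; `check_classifier_fn` and `toy_predict_proba` are hypothetical names, and the toy function only stands in for the real model.

```python
import numpy as np


def check_classifier_fn(classifier_fn, texts):
    """Return the (d, k) probability array, or raise a readable error."""
    if isinstance(texts, str):
        raise TypeError("pass a list of strings, not a single string")
    texts = list(texts)
    probs = np.asarray(classifier_fn(texts), dtype=float)
    if probs.ndim != 2 or probs.shape[0] != len(texts):
        raise ValueError(f"expected shape ({len(texts)}, k), got {probs.shape}")
    if not np.allclose(probs.sum(axis=1), 1.0, atol=1e-6):
        raise ValueError("each row of probabilities should sum to 1")
    return probs


def toy_predict_proba(texts):
    # Hypothetical stand-in for the real model's predict_proba.
    p = np.array([0.2 if " not " in f" {t} " else 0.8 for t in texts])
    return np.column_stack([1.0 - p, p])


probs = check_classifier_fn(
    toy_predict_proba, ['John is a man', 'John is not a man']
)
print(probs.shape)  # (2, 2)
```

Running the same check against the real prediction function, outside of LIME, should show whether the failure comes from the model's input handling or from the explainer.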

iqbalfarz commented 3 years ago

Hi @SharathChandra-AV, I was getting the same "TypeError: expected string or bytes-like object" error. Let me explain what my problem was. I used TfidfVectorizer to vectorize my text. Here is the code:

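from sklearn.pipeline import make_pipeline

# Assumed intent: a scikit-learn pipeline (a bare tuple has no predict_proba)
lr_pipeline = make_pipeline(tfidfVectorizer, lr_model)
query = pd.Series(" A person commented on me")
lr_explainer = LimeTextExplainer(class_names=['Non-Commenting','Commenting'], bow=True)
lr_explained = lr_explainer.explain_instance(query, lr_pipeline.predict_proba)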

and I was getting the error quoted above. The problem is that the LIME explainer needs a raw string to interpret, and I was passing a Series. So just pass query[0] ([0] returns the raw string). I hope this helps.

lr_explained = lr_explainer.explain_instance(query[0], lr_pipeline.predict_proba)
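To guard against this mistake in general, the input can be coerced to a raw string before it reaches the explainer. This is a minimal sketch, not part of LIME: `as_raw_string` is a hypothetical helper that accepts a string or any one-element container of strings (a list, a tuple, or a pandas Series, which yields its values when iterated).

```python
def as_raw_string(query):
    """Return query as a plain str, unwrapping a one-element container."""
    if isinstance(query, str):
        return query
    if isinstance(query, bytes):
        return query.decode("utf-8")
    items = list(query)  # iterating a pandas Series yields its values
    if len(items) != 1 or not isinstance(items[0], str):
        raise TypeError(
            "expected a single string; explain one document at a time"
        )
    return items[0]


text = as_raw_string([" A person commented on me"])
print(repr(text))
```

With this helper the call would read `lr_explainer.explain_instance(as_raw_string(query), lr_pipeline.predict_proba)`. When indexing a Series directly, `query.iloc[0]` is the position-based form and does not depend on the Series having a label `0`.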