TeamHG-Memex / eli5

A library for debugging/inspecting machine learning classifiers and explaining their predictions
http://eli5.readthedocs.io
MIT License

Binary classification: weights are shown only for one class #378

Closed eloukas closed 4 years ago

eloukas commented 4 years ago

I just tried the baseline tutorial, keeping only 2 of the 4 categories.

When I render my explain_weights result, I only get weights for one class, as you see below.

[screenshot: Selection_032]

My code looks like this:

from sklearn.datasets import fetch_20newsgroups

categories = ['alt.atheism', 'soc.religion.christian']
              # 'comp.graphics', 'sci.med']
twenty_train = fetch_20newsgroups(
    subset='train',
    categories=categories,
    shuffle=True,
    random_state=42
)
twenty_test = fetch_20newsgroups(
    subset='test',
    categories=categories,
    shuffle=True,
    random_state=42
)

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

vec = CountVectorizer()
clf = LogisticRegression()
pipe = make_pipeline(vec, clf)
pipe.fit(twenty_train.data, twenty_train.target)

import eli5

weights_explanation = eli5.explain_weights(
    clf,
    top=30,
    target_names=twenty_test.target_names
)

# Save to disk
with open('linear_weights.html', 'w') as file:
    file.write(eli5.format_as_html(weights_explanation))

I am using scikit-learn==0.21.3 and eli5==0.10.1.

I have the same problem when running this with another dataset too. Any ideas?

Edit: Does it maybe just mean that features with green (positive) weights push the prediction towards y=1, while features with red (negative) weights push it away from y=1 and therefore towards y=0?
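A minimal sketch of why that reading would hold for a binary logistic regression, using only the standard library. The feature names, weights, and bias are made up for illustration and are not taken from the model above. The point is that P(y=0) = 1 - P(y=1), so the "weights for y=0" are just the y=1 weights with the sign flipped, and a second table would carry no extra information.

```python
import math


def sigmoid(z):
    """Logistic function mapping a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights for three made-up features (illustration only).
weights = {"church": 1.2, "atheism": -0.9, "the": 0.05}
bias = 0.1


def score(counts, w, b):
    """Linear score: bias plus the weighted sum of feature counts."""
    return b + sum(w[term] * count for term, count in counts.items())


def proba_y1(counts):
    """P(y=1): sigmoid of the score under the model's weights."""
    return sigmoid(score(counts, weights, bias))


def proba_y0(counts):
    """P(y=0): the same model with every weight and the bias negated."""
    negated = {term: -value for term, value in weights.items()}
    return sigmoid(score(counts, negated, -bias))


doc = {"church": 2, "atheism": 1, "the": 5}
p1 = proba_y1(doc)
p0 = proba_y0(doc)

# The two probabilities are complementary, so one weight table describes both classes.
print(p1, p0, p1 + p0)
```

Under these assumptions, a positive (green) weight raises the score for y=1 and lowers it for y=0 by the same amount, and a negative (red) weight does the opposite.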

eloukas commented 4 years ago

Closing: I guess this is the expected behavior for binary classifiers, according to https://stackoverflow.com/questions/51659523/eli5-show-weights-with-two-labels.
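For anyone landing here later, a small check that seems consistent with this: a binary scikit-learn LogisticRegression stores a single row of coefficients, which refers to the second entry of classes_. The toy documents and labels below are invented for illustration, and the sketch does not call eli5 itself.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny made-up corpus: label 0 stands in for one category, label 1 for the other.
docs = [
    "atheist argument about god",
    "atheist debate on religion",
    "church service on sunday",
    "church prayer and bible",
]
labels = [0, 0, 1, 1]

vec = CountVectorizer()
X = vec.fit_transform(docs)
clf = LogisticRegression().fit(X, labels)

# Binary case: one row of weights, not one row per class.
coef_shape = clf.coef_.shape
positive_class = clf.classes_[1]

# Look up weights by term via the fitted vocabulary (term -> column index).
church_weight = clf.coef_[0][vec.vocabulary_["church"]]
atheist_weight = clf.coef_[0][vec.vocabulary_["atheist"]]

print(coef_shape, positive_class)
print("church:", church_weight, "atheist:", atheist_weight)
```

In this sketch the term seen only in label-1 documents gets a positive weight and the term seen only in label-0 documents gets a negative one, which matches the green/red reading of the single y=1 table.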