DistrictDataLabs / yellowbrick

Visual analysis and diagnostic tools to facilitate machine learning model selection.
http://www.scikit-yb.org/
Apache License 2.0

Issue with classification_report function when there are missing labels during scoring #1246

Closed admo1 closed 2 years ago

admo1 commented 2 years ago

Describe the bug

Yellowbrick's classification report (the ClassificationReport visualizer) crashes with a KeyError in score() when a label seen during fit is missing from the data passed to score().

To Reproduce

import numpy as np
from sklearn.linear_model import LogisticRegression
from yellowbrick.classifier.classification_report import ClassificationReport

X_train = np.array([[1, 2], [1, 2], [1, 2]])
y_train = np.array([0, 1, 2])

X_test = np.array([[1, 2], [1, 2]])
y_test = np.array([0, 1])

viz = ClassificationReport(LogisticRegression())
viz.fit(X_train, y_train)
viz.score(X_test, y_test)
viz.show()

Expected behavior

The classification report should not crash. It should generate a valid plot that shows a value of 0 for any label not present in the test data.
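
For illustration, the zero-filled result described above can be produced with scikit-learn directly. This is a minimal sketch, not Yellowbrick's implementation or an agreed fix: it assumes the full label set from training is passed through the `labels` argument of `precision_recall_fscore_support`, and the predictions `y_pred` are hard-coded for the example rather than taken from the model in the report.

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# Labels seen during fit (same as y_train in the report above).
all_labels = [0, 1, 2]

# Test labels from the report; label 2 is absent.
y_test = np.array([0, 1])

# Assumption: illustrative predictions, not actual model output.
y_pred = np.array([0, 1])

# Passing labels= forces one entry per known class, and
# zero_division=0 reports 0 (without a warning) for classes that
# have no true or predicted samples.
precision, recall, f1, support = precision_recall_fscore_support(
    y_test, y_pred, labels=all_labels, zero_division=0
)

for cls, p, r, f, s in zip(all_labels, precision, recall, f1, support):
    print(f"class {cls}: precision={p:.1f} recall={r:.1f} f1={f:.1f} support={s}")
```

Each returned array has one entry per label in `all_labels`, so a lookup for class 2 finds a 0 instead of a missing key.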

Traceback

Traceback (most recent call last):
  File ".\test.py", line 13, in <module>
    viz.score(X_test, y_test)
  File "C:\Users\*\AppData\Local\Programs\Anaconda3\envs\*\lib\site-packages\yellowbrick\classifier\classification_report.py", line 210, in score
    self.draw()
  File "C:\Users\*\AppData\Local\Programs\Anaconda3\envs\*\lib\site-packages\yellowbrick\classifier\classification_report.py", line 223, in draw
    cr_display[idx, jdx] = self.scores_[metric][cls]
KeyError: 2
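
The failing lookup `self.scores_[metric][cls]` suggests the per-class scores are stored in dictionaries keyed by label. Assuming that layout, the sketch below shows one way the display grid could tolerate a missing label. The names `scores`, `classes`, `metrics`, and `build_display` are hypothetical and do not come from Yellowbrick.

```python
import numpy as np

# Labels seen during fit; label 2 never appears in the scored data.
classes = [0, 1, 2]
metrics = ["precision", "recall", "f1"]

# Hypothetical per-class scores that only cover labels 0 and 1.
scores = {
    "precision": {0: 1.0, 1: 1.0},
    "recall": {0: 1.0, 1: 1.0},
    "f1": {0: 1.0, 1: 1.0},
}


def build_display(scores, classes, metrics):
    """Fill a (classes x metrics) grid, using 0.0 for labels with no score."""
    display = np.zeros((len(classes), len(metrics)))
    for idx, cls in enumerate(classes):
        for jdx, metric in enumerate(metrics):
            # dict.get avoids the KeyError raised by scores[metric][cls].
            display[idx, jdx] = scores[metric].get(cls, 0.0)
    return display


display = build_display(scores, classes, metrics)
print(display)
```

With a plain `scores[metric][cls]` lookup, the same loop would raise `KeyError: 2` on the third class, matching the traceback above.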

Desktop