nan score for StackingClassifier due to 'scoring' argument in cross_val_score

kemaldahha commented 1 year ago

Hi, I am trying to run the code below (Example 1 from the StackingClassifier documentation):

from sklearn import datasets
iris = datasets.load_iris()
X, y = iris.data[:, 1:3], iris.target

from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB 
from sklearn.ensemble import RandomForestClassifier
from mlxtend.classifier import StackingClassifier
import numpy as np
import warnings

warnings.simplefilter('ignore')

clf1 = KNeighborsClassifier(n_neighbors=1)
clf2 = RandomForestClassifier(random_state=1)
clf3 = GaussianNB()
lr = LogisticRegression()
sclf = StackingClassifier(classifiers=[clf1, clf2, clf3], 
                          meta_classifier=lr)

print('3-fold cross validation:\n')

for clf, label in zip([clf1, clf2, clf3, sclf], 
                      ['KNN', 
                       'Random Forest', 
                       'Naive Bayes',
                       'StackingClassifier']):

    scores = model_selection.cross_val_score(clf, X, y, 
                                              cv=3, scoring='accuracy')
    print("Accuracy: %0.2f (+/- %0.2f) [%s]" 
          % (scores.mean(), scores.std(), label))

I get the following output:

3-fold cross validation:

Accuracy: 0.91 (+/- 0.01) [KNN]
Accuracy: 0.95 (+/- 0.01) [Random Forest]
Accuracy: 0.91 (+/- 0.02) [Naive Bayes]
Accuracy: nan (+/- nan) [StackingClassifier]

I expected the score for StackingClassifier to be a number, as in:

3-fold cross validation:

Accuracy: 0.91 (+/- 0.01) [KNN]
Accuracy: 0.95 (+/- 0.01) [Random Forest]
Accuracy: 0.91 (+/- 0.02) [Naive Bayes]
Accuracy: 0.95 (+/- 0.02) [StackingClassifier]

When I re-enable warnings by commenting out warnings.simplefilter('ignore'), I get the output below (truncated, since the same warning is repeated several times):

3-fold cross validation:

Accuracy: 0.91 (+/- 0.01) [KNN]
Accuracy: 0.95 (+/- 0.01) [Random Forest]
Accuracy: 0.91 (+/- 0.02) [Naive Bayes]
c:\projects\machine-learning-matt-harrison\env\lib\site-packages\sklearn\model_selection\_validation.py:842: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details: 
Traceback (most recent call last):
  File "c:\projects\machine-learning-matt-harrison\env\lib\site-packages\sklearn\metrics\_scorer.py", line 136, in __call__
    score = scorer._score(
  File "c:\projects\machine-learning-matt-harrison\env\lib\site-packages\sklearn\metrics\_scorer.py", line 353, in _score
    y_pred = method_caller(estimator, "predict", X)
  File "c:\projects\machine-learning-matt-harrison\env\lib\site-packages\sklearn\metrics\_scorer.py", line 86, in _cached_call
    result, _ = _get_response_values(
  File "c:\projects\machine-learning-matt-harrison\env\lib\site-packages\sklearn\utils\_response.py", line 74, in _get_response_values
    classes = estimator.classes_
AttributeError: 'StackingClassifier' object has no attribute 'classes_'

The problem seems to be related to the scoring argument in model_selection.cross_val_score(clf, X, y, cv=3, scoring='accuracy'). If I remove that argument, the default scoring is used (accuracy, I think), and I get the expected output, which matches the example in the documentation:

3-fold cross validation:

Accuracy: 0.91 (+/- 0.01) [KNN]
Accuracy: 0.95 (+/- 0.01) [Random Forest]
Accuracy: 0.91 (+/- 0.02) [Naive Bayes]
Accuracy: 0.95 (+/- 0.02) [StackingClassifier]

However, I would like to use other scoring metrics as well (e.g. roc_auc). For those I have to pass the scoring argument explicitly, and then I get the nan score for StackingClassifier again.

I already checked issues #423 and #426, which mention a similar warning/error (AttributeError: 'StackingClassifier' object has no attribute 'classes_'), but I couldn't figure it out based on those issues.
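A possible workaround can be sketched as follows. The traceback shows the failure inside sklearn's built-in scorer, which reads estimator.classes_ before calling predict. The scoring argument also accepts a plain callable with the signature (estimator, X, y), and such a callable only uses what it calls itself. The sketch below is an illustration under assumptions, not the fix from the PR: NoClassesClassifier and accuracy_callable are hypothetical names, and the wrapper is only a stand-in for an estimator that never sets classes_. It has not been checked against mlxtend 0.22.0 itself.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier


class NoClassesClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical stand-in: wraps a classifier but never sets `classes_`."""

    def __init__(self, base_estimator=None):
        self.base_estimator = base_estimator

    def fit(self, X, y):
        # Fit a fresh copy of the wrapped model; `classes_` is deliberately not set.
        self.model_ = clone(self.base_estimator).fit(X, y)
        return self

    def predict(self, X):
        return self.model_.predict(X)


def accuracy_callable(estimator, X, y):
    """Callable scorer with the (estimator, X, y) signature; uses only predict()."""
    return accuracy_score(y, estimator.predict(X))


iris = load_iris()
X, y = iris.data[:, 1:3], iris.target

clf = NoClassesClassifier(base_estimator=KNeighborsClassifier(n_neighbors=1))
scores = cross_val_score(clf, X, y, cv=3, scoring=accuracy_callable)

print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std()))
print("Any nan folds:", bool(np.isnan(scores).any()))
```

The same pattern should extend to other metrics that only need predict(). Metrics that need class probabilities would have to call predict_proba inside the callable, which assumes the estimator provides it.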

I am using:

Python 3.10.0
scikit-learn==1.3.0
mlxtend==0.22.0

rasbt commented 1 year ago

Thanks for the note! I can confirm that I see this issue in sklearn 1.3.0 as well (but not in 1.2.2). I just submitted a PR via #1060 to fix that.

kemaldahha commented 1 year ago

I came across a lecture by @rasbt. Based on his explanation, a StackingClassifier has been included in sklearn. I adjusted the code to use the sklearn version of StackingClassifier:

from sklearn import datasets
iris = datasets.load_iris()
X, y = iris.data[:, 1:3], iris.target

from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB 
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
# from mlxtend.classifier import StackingClassifier
import numpy as np
import warnings

warnings.simplefilter('ignore')

clf1 = KNeighborsClassifier(n_neighbors=1)
clf2 = RandomForestClassifier(random_state=1)
clf3 = GaussianNB()

estimators = [("clf1", clf1),
              ("clf2", clf2),
              ("clf3", clf3)]

lr = LogisticRegression()

sclf = StackingClassifier(estimators=estimators, 
                          final_estimator=lr)

print('3-fold cross validation:\n')

for clf, label in zip([clf1, clf2, clf3, sclf], 
                      ['KNN', 
                       'Random Forest', 
                       'Naive Bayes',
                       'StackingClassifier']):

    scores = model_selection.cross_val_score(clf, X, y, cv=3, scoring="accuracy")
    print("Accuracy: %0.2f (+/- %0.2f) [%s]" 
          % (scores.mean(), scores.std(), label))

Now I get output more in line with what I expect, though not exactly the same as in the mlxtend StackingClassifier documentation (Example 1):

3-fold cross validation:

Accuracy: 0.91 (+/- 0.01) [KNN]
Accuracy: 0.95 (+/- 0.01) [Random Forest]
Accuracy: 0.91 (+/- 0.02) [Naive Bayes]
Accuracy: 0.93 (+/- 0.02) [StackingClassifier]

Perhaps sklearn's StackingClassifier implementation is different from mlxtend's.

Should we still use mlxtend's StackingClassifier, or is it deprecated in favor of sklearn's implementation?
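For the roc_auc use case mentioned earlier, here is a minimal sketch with sklearn's StackingClassifier. One assumption to flag: iris has three classes, and the plain "roc_auc" scorer is meant for binary targets, so this sketch uses the "roc_auc_ovr" (one-vs-rest) scoring string instead. Treat the choice of averaging scheme as illustrative rather than a recommendation.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data[:, 1:3], iris.target

estimators = [("clf1", KNeighborsClassifier(n_neighbors=1)),
              ("clf2", RandomForestClassifier(random_state=1)),
              ("clf3", GaussianNB())]

sclf = StackingClassifier(estimators=estimators,
                          final_estimator=LogisticRegression())

# "roc_auc_ovr" scores class probabilities one-vs-rest, so it needs
# predict_proba, which the stack exposes through LogisticRegression.
scores = cross_val_score(sclf, X, y, cv=3, scoring="roc_auc_ovr")

print("ROC AUC (OvR): %0.2f (+/- %0.2f) [StackingClassifier]"
      % (scores.mean(), scores.std()))
print("Any nan folds:", bool(np.isnan(scores).any()))
```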

kemaldahha commented 1 year ago

> Thanks for the note! I can confirm that I see this issue in sklearn 1.3.0 as well (but not in 1.2.2). I just submitted a PR via #1060 to fix that.

Thanks for the reply. I posted my second comment before I read your reply, apologies.

rasbt / mlxtend

nan score for StackingClassifier due to 'scoring' argument in cross_val_score #1059