rasbt / mlxtend

A library of extension and helper modules for Python's data analysis and machine learning libraries.
https://rasbt.github.io/mlxtend/

Total Loss != Bias^2 + Variance for '0-1_loss' #964

Closed (ltbd78 closed this issue 2 years ago)

ltbd78 commented 2 years ago

Describe the bug

With loss='0-1_loss', the total expected loss returned by bias_variance_decomp does not equal Bias^2 + Variance.

Steps/Code to Reproduce

# RNG
import numpy as np
rng = np.random.default_rng(seed=2022)

# DATA
from sklearn import datasets
iris = datasets.load_iris()
X = iris['data']
y = iris['target']
mask = rng.random(len(X)) < .9
X_train = X[mask]
y_train = y[mask]
X_test = X[~mask]
y_test = y[~mask]

# PIPELINES
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.feature_selection import RFECV
model_default = SGDClassifier(loss='log_loss', penalty='none', fit_intercept=True)
model_l1 = SGDClassifier(loss='log_loss', penalty='elasticnet', alpha=0.1, l1_ratio=1.0, fit_intercept=True)
model_l2 = SGDClassifier(loss='log_loss', penalty='elasticnet', alpha=0.1, l1_ratio=0.0, fit_intercept=True)
pipe_default = make_pipeline(StandardScaler(), model_default)
pipe_rfe = make_pipeline(StandardScaler(), RFECV(model_default))
pipe_l1 = make_pipeline(StandardScaler(), model_l1)
pipe_l2 = make_pipeline(StandardScaler(), model_l2)
pipes = [('default', pipe_default), ('rfe', pipe_rfe), ('l1', pipe_l1), ('l2', pipe_l2)]

# MLXTEND
from mlxtend.evaluate import bias_variance_decomp
print('\t|\t Total \t\t|\t Bias^2 \t|\t Variance')
for name, pipe in pipes:
    avg_expected_loss, avg_bias, avg_var = bias_variance_decomp(
        pipe, X_train, y_train, X_test, y_test,
        loss='0-1_loss',
        random_seed=2022,
    )
    print(f'{name}\t|\t {avg_expected_loss:.4f} \t|\t {avg_bias:.4f} \t|\t {avg_var:.4f}')

Expected Results

I expected Total Loss = Bias^2 + Variance for every pipeline.
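
For context, the additive identity I have in mind is the one that holds for squared error. A minimal NumPy sketch with made-up predictions (the arrays and the names all_pred and mean_pred are illustrative assumptions, not mlxtend internals) checks that the average squared error over repeated fits splits exactly into squared bias plus variance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 3 test targets and 200 rounds of noisy, offset predictions.
y_true = np.array([1.0, 2.0, 3.0])
all_pred = y_true + 0.5 + rng.normal(scale=0.3, size=(200, 3))

# Average prediction per test point across rounds.
mean_pred = all_pred.mean(axis=0)

loss = np.mean((all_pred - y_true) ** 2)      # average squared error
bias_sq = np.mean((mean_pred - y_true) ** 2)  # squared bias of the average prediction
var = np.mean((all_pred - mean_pred) ** 2)    # spread of predictions around their mean

# For squared error the decomposition is exact (up to floating point).
print(np.isclose(loss, bias_sq + var))
```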

Actual Results

        |  Total   |  Bias^2  |  Variance
default |  0.0640  |  0.0667  |  0.0167
rfe     |  0.0627  |  0.0667  |  0.0133
l1      |  0.0600  |  0.0667  |  0.0173
l2      |  0.0657  |  0.0667  |  0.0163
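
To narrow down where the mismatch could come from, here is a small sketch of a 0-1 loss decomposition under assumed definitions: the main prediction is the most frequent label across rounds, bias is whether the main prediction is wrong, and variance is how often a round disagrees with the main prediction. Whether bias_variance_decomp uses exactly these definitions is an assumption on my part; the helper zero_one_decomp and the toy arrays are hypothetical.

```python
import numpy as np

def zero_one_decomp(all_pred, y_true):
    """Sketch of a 0-1 loss decomposition (assumed definitions).

    all_pred: int array of shape (n_rounds, n_samples) with predicted labels.
    y_true:   int array of shape (n_samples,) with true labels.
    """
    n_classes = int(max(all_pred.max(), y_true.max())) + 1
    # Count label votes per test point; result has shape (n_classes, n_samples).
    counts = np.apply_along_axis(np.bincount, 0, all_pred, minlength=n_classes)
    main_pred = counts.argmax(axis=0)      # most frequent label per test point
    loss = np.mean(all_pred != y_true)     # average misclassification rate
    bias = np.mean(main_pred != y_true)    # main prediction is wrong
    var = np.mean(all_pred != main_pred)   # disagreement with main prediction
    return loss, bias, var

# Toy data: 3 rounds, 2 test points, both predicted as 0, 0, 1.
y_true = np.array([0, 1])
all_pred = np.array([[0, 0],
                     [0, 0],
                     [1, 1]])

loss, bias, var = zero_one_decomp(all_pred, y_true)
print(loss, bias, var, bias + var)
```

With these definitions the toy data gives a loss of 0.5, a bias of 0.5, and a variance of about 0.333, so the three quantities do not add up. On the test point where the main prediction is wrong, disagreeing with it lowers the loss instead of raising it.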

Versions

MLxtend 0.21.0dev
macOS-10.16-x86_64-i386-64bit
Python 3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:50:38) [Clang 11.1.0 ]
Scikit-learn 1.1.2
NumPy 1.23.2
SciPy 1.9.0