chwang1991 opened this issue 5 years ago

Hi HunterMcGushion,
I am doing a multi-class classification task and I want to set sklearn.metrics.log_loss as the experiment metric, but I have run into trouble. The target has 4 labels, 0 to 3. When I run my code, it triggers a value error: ValueError: y_true and y_pred contain different number of classes 4, 2. Please provide the true labels explicitly through the labels argument. Classes found in y_true: [0 1 2 3]
If I set labels for the log_loss metric, metrics=dict(logloss=lambda y_true, y_pred: metrics.log_loss(y_true, y_pred, labels=[0,1,2,3])), it throws another error: ValueError: The number of classes in labels is different from that in y_pred. Classes found in labels: [0 1 2 3]
I checked the examples and previous issues like #90, and I wonder: have you tested log_loss for a multiclass task?
Thanks for opening this! Sorry I don't have a more helpful update, but I just wanted to say that I'm looking into this, and I do think there's a bug here. I'm making some regression tests using the Iris dataset, with a single label-encoded target column, so I'll probably ask you to try to reproduce some of my results when I'm further along in the bug hunt.
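For reference, a rough sketch of the setup just described (assuming sklearn's load_iris; this is not the actual regression-test code):

import pandas as pd
from sklearn.datasets import load_iris

# One DataFrame with a single label-encoded target column (species: 0, 1, 2)
data = load_iris()
iris_df = pd.DataFrame(data.data, columns=data.feature_names)
iris_df["species"] = data.target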
In the meantime, have you tried adjusting the do_predict_proba kwarg of your Environment? Are you expecting log_loss to be called with a single column of label-encoded predictions, or with four columns of class probabilities? I believe the former won't work, since log_loss automatically assumes a 1-dimensional y_pred to be binary...
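To illustrate that point, a minimal plain-sklearn sketch (independent of HyperparameterHunter): handing log_loss a 1-D array for a 4-class target fails, because a 1-D input is interpreted as positive-class probabilities for a binary problem.

import numpy as np
from sklearn.metrics import log_loss

y_true = np.array([0, 1, 2, 3])
label_encoded = np.array([0, 1, 2, 3])  # 1-D label-encoded "predictions", not probabilities

try:
    # Internally, a 1-D y_pred is expanded to two columns: [1 - p, p]
    log_loss(y_true, label_encoded)
except ValueError as err:
    print(err)  # y_true and y_pred contain different number of classes 4, 2. ...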
Like I said, I need to investigate some more, but I'd really appreciate it if you commented any of your findings here!
Edit: Thanks for looking for related issues, as well!
Thanks for your quick reply!
Sure, I tried setting do_predict_proba=True, but it didn't help. It seems that it refused to accept multi-column predictions for some reason.
I have to say log_loss is a bit special, because it requires a (n_samples, n_classes) y_pred, while the other examples you tested before, I guess, forced the input y_pred to be a single column.
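For comparison, a minimal sketch of the shape log_loss actually expects in this 4-class case: one probability column per class, with each row summing to 1.

import numpy as np
from sklearn.metrics import log_loss

y_true = [0, 1, 2, 3]
proba = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
])  # shape (n_samples, n_classes)
print(log_loss(y_true, proba))  # ~0.357, i.e. -log(0.7)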
Here is the code I used to test:
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics
from hyperparameter_hunter import Environment, CVExperiment, BayesianOptPro

# Make a toy 4-class dataset
x, y = make_classification(n_samples=1000, n_classes=4, n_informative=10)
train_df = pd.DataFrame(x, columns=range(x.shape[1]))
train_df["y"] = y

'''
TEST 1
metrics=["log_loss"]
do_predict_proba=False
ValueError: y_true and y_pred contain different number of classes 4, 2.
Please provide the true labels explicitly through the labels argument. Classes found in y_true: [0 1 2 3]
'''
env1 = Environment(
    train_dataset=train_df,
    results_path="HyperparameterHunterAssets",
    target_column="y",
    metrics=["log_loss"],
    do_predict_proba=False,
    cv_type="StratifiedKFold",
    cv_params=dict(n_splits=5, random_state=32),
    verbose=1,
)

'''
TEST 2
metrics=dict(logloss=lambda y_true, y_pred: metrics.log_loss(y_true, y_pred, labels=[0,1,2,3]))
do_predict_proba=False
ValueError: The number of classes in labels is different from that in y_pred.
Classes found in labels: [0 1 2 3]
'''
env2 = Environment(
    train_dataset=train_df,
    results_path="HyperparameterHunterAssets",
    target_column="y",
    metrics=dict(logloss=lambda y_true, y_pred: metrics.log_loss(y_true, y_pred, labels=[0,1,2,3])),
    do_predict_proba=False,
    cv_type="StratifiedKFold",
    cv_params=dict(n_splits=5, random_state=32),
    verbose=1,
)

'''
TEST 3
metrics=["log_loss"]
do_predict_proba=True
ValueError: Wrong number of items passed 4, placement implies 1
'''
env3 = Environment(
    train_dataset=train_df,
    results_path="HyperparameterHunterAssets",
    target_column="y",
    metrics=["log_loss"],
    do_predict_proba=True,
    cv_type="StratifiedKFold",
    cv_params=dict(n_splits=5, random_state=32),
    verbose=1,
)

'''
TEST 4
metrics=dict(logloss=lambda y_true, y_pred: metrics.log_loss(y_true, y_pred, labels=[0,1,2,3]))
do_predict_proba=True
ValueError: Wrong number of items passed 4, placement implies 1
'''
env4 = Environment(
    train_dataset=train_df,
    results_path="HyperparameterHunterAssets",
    target_column="y",
    metrics=dict(logloss=lambda y_true, y_pred: metrics.log_loss(y_true, y_pred, labels=[0,1,2,3])),
    do_predict_proba=True,
    cv_type="StratifiedKFold",
    cv_params=dict(n_splits=5, random_state=32),
    verbose=1,
)

experiment = CVExperiment(
    model_initializer=RandomForestClassifier,
    model_init_params=dict(n_estimators=10),
)
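As a sanity check, a sketch outside HyperparameterHunter (reusing x, y, and the imports above): the same model, data, and metric work once the full (n_samples, n_classes) probabilities reach log_loss directly, which suggests the failures above come from how predictions are passed and stored, not from the metric or the model.

from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Out-of-fold class probabilities, shape (1000, 4)
oof_proba = cross_val_predict(
    RandomForestClassifier(n_estimators=10),
    x, y,
    cv=StratifiedKFold(n_splits=5),
    method="predict_proba",
)
print(metrics.log_loss(y, oof_proba, labels=[0, 1, 2, 3]))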
@chwang1991,
Thanks for posting your sample code! It's very helpful! Sorry for the delay; I've been busy with other things lately. I'm looking at this issue again today, and I have to agree with you: log_loss does seem rather weird. Although I may just be thinking that because I haven't done much experimentation with other metrics.
Do you know of any other metrics that behave similarly or might cause other problems?
Also, do you think that another Environment kwarg might be necessary to clear up behavior in situations like this? do_predict_proba seems like half of the solution... But I'm thinking we need one kwarg to declare how predictions should be passed to metrics, then a second to declare how predictions should be saved in a situation like this. I'd love to hear your thoughts!
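Purely as an illustration of that proposed split, a hypothetical sketch (neither kwarg below exists in HyperparameterHunter; the names are invented):

env = Environment(
    train_dataset=train_df,
    results_path="HyperparameterHunterAssets",
    target_column="y",
    metrics=["log_loss"],
    metric_input="proba",        # hypothetical: pass (n_samples, n_classes) probabilities to metrics
    prediction_format="labels",  # hypothetical: save argmax class labels in results, independently
    cv_type="StratifiedKFold",
    cv_params=dict(n_splits=5, random_state=32),
)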
@HunterMcGushion
Sorry for my late reply; I've been a bit busy these days...
I think there are only two metrics that accept multi-column predicted probabilities: log_loss and hinge_loss. I think do_predict_proba is enough, as you already indicated in the documentation:
If True, it will call models.Model.model.predict_proba(), and the values in all columns will be used as the actual prediction values
I know that in most cases there is no need to take proba into consideration; even log_loss is mostly applied as a loss function rather than as a metric. But in my recent case I have to evaluate how much "confidence" the model has in its results so I can improve it.
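A small sketch of that last point: the two prediction sets below are equally accurate under argmax, but log_loss separates the confident model from the hesitant one.

import numpy as np
from sklearn.metrics import log_loss

y_true = [0, 1, 2, 3]
confident = np.array([
    [0.97, 0.01, 0.01, 0.01],
    [0.01, 0.97, 0.01, 0.01],
    [0.01, 0.01, 0.97, 0.01],
    [0.01, 0.01, 0.01, 0.97],
])
hesitant = np.array([
    [0.40, 0.20, 0.20, 0.20],
    [0.20, 0.40, 0.20, 0.20],
    [0.20, 0.20, 0.40, 0.20],
    [0.20, 0.20, 0.20, 0.40],
])
print(log_loss(y_true, confident))  # ~0.030
print(log_loss(y_true, hesitant))   # ~0.916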