Closed username725 closed 2 years ago
To perform nested cross-validation:
sklearn.model_selection.cross_val_score(automl, X, y, cv=2)
However, that requires AutoML to have a score() method available. Okay, let's explicitly give sklearn a scoring method:
sklearn.model_selection.cross_val_score(automl, X, y, scoring='roc_auc', cv=2)
This ends up in sklearn's _BaseScorer._select_proba_binary(), which requires classes_ to be a NumPy ndarray. AutoML explicitly converts these to a list, so there is an error.
Full example:
import numpy as np
import sklearn.model_selection  # import the submodule explicitly so the call below resolves
from flaml import AutoML

X = np.random.random(size=(10, 1))
y = np.random.choice([False, True], size=10)
automl = AutoML(time_budget=5)
sklearn.model_selection.cross_val_score(automl, X, y, scoring='roc_auc', cv=2)
Leads to error:
col_idx = np.flatnonzero(classes == pos_label)[0]
IndexError: index 0 is out of bounds for axis 0 with size 0
A workaround is to override classes_ to have it return an array:
class MyAutoML(AutoML):
    @property
    def classes_(self):
        return np.array(super().classes_)
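For anyone wondering why a list breaks here, this is a minimal sketch that only reproduces the comparison from the traceback with plain NumPy. It does not call sklearn or FLAML. The helper name positive_column is made up for illustration, and the class labels are assumed to be [False, True] as in the example above. With a list, `classes == pos_label` is an ordinary Python comparison that yields a single False, so np.flatnonzero finds nothing to index.

```python
import numpy as np


def positive_column(classes, pos_label):
    # Hypothetical helper mirroring the failing line from the traceback.
    return np.flatnonzero(classes == pos_label)[0]


classes_as_list = [False, True]               # classes_ as a plain list
classes_as_array = np.array(classes_as_list)  # what the workaround returns

try:
    positive_column(classes_as_list, True)
    list_result = "ok"
except IndexError as exc:
    # list == scalar is a single False, so flatnonzero returns an empty array
    list_result = f"IndexError: {exc}"

# ndarray == scalar compares elementwise, so the positive label is found
array_result = positive_column(classes_as_array, True)

print(list_result)
print(array_result)
```

The ndarray case returns the column index of the positive label, which is why wrapping classes_ in np.array() is enough to get past this line.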
Since a workaround was found, this isn't high priority, but I wonder:
- Does a decision_function() make sense for AutoML?
Not sure, because it is not applicable to all learners and tasks.
- Does a score() function make sense?
Yes, it makes sense. Would you like to add it?
- Compatibility reasons to .tolist() the .classes_?
We used this to make it work for automlbenchmark. Let me try converting it to np.array. If it works, we should make it compatible.
FLAML 0.9.6, scikit-learn 1.0.2
Thanks for the quick turnaround.
We can consider this issue closed, and I can open a PR for score() if I get a chance.
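As a rough idea of what a score() along these lines might look like, here is a minimal sketch. It is not FLAML's implementation. ScoreMixin and the StubEstimator used to exercise it are hypothetical names, and plain accuracy is an assumed default, chosen because that is what sklearn classifiers report from score().

```python
class ScoreMixin:
    """Hypothetical mixin: adds an accuracy-based score() to any
    estimator that already provides predict()."""

    def score(self, X, y):
        predictions = self.predict(X)
        # Fraction of predictions that match the true labels
        matches = sum(1 for p, t in zip(predictions, y) if p == t)
        return matches / len(y)


class StubEstimator(ScoreMixin):
    """Stand-in estimator for illustration only: always predicts True."""

    def predict(self, X):
        return [True for _ in X]


stub = StubEstimator()
accuracy = stub.score([[0.1], [0.2], [0.3], [0.4]], [True, False, True, True])
print(accuracy)
```

In principle the same mixin could be combined with AutoML in a subclass, much like the classes_ workaround, but that is untested here, and a regression task would need a different default metric.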