Error when using sklearn StratifiedKFold in Evaluator CV

Hi there,

First of all, thanks for providing this nice library. It's really helpful in our project!

We are using the Evaluator class to run a grid search, but our data needs stratification. We were happy to read in the documentation that the cv argument of the Evaluator class also accepts "a KFold class that obeys the Scikit-learn API". This would allow us to pass a sklearn.model_selection.StratifiedKFold instance and easily stratify our data in the cross-validation.
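To make concrete what we mean by a splitter that "obeys the Scikit-learn API", here is a minimal sketch that uses only scikit-learn and NumPy. The toy data and variable names are made up for illustration. It shows the split(X, y) interface we expected Evaluator to call, and that each test fold keeps the class ratio of the full data set.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical toy data: 12 samples, 8 of class 0 and 4 of class 1.
X = np.arange(12).reshape(-1, 1)
y = np.array([0] * 8 + [1] * 4)

skf = StratifiedKFold(n_splits=4)

# A scikit-learn splitter exposes split(X, y), which yields
# (train_indices, test_indices) pairs, one pair per fold.
test_class_counts = []
for train_idx, test_idx in skf.split(X, y):
    # Count how many samples of each class land in the test fold.
    counts = np.bincount(y[test_idx], minlength=2)
    test_class_counts.append(counts.tolist())

print(test_class_counts)
```

Note that StratifiedKFold stratifies on discrete class labels, so y has to be categorical (or binned) for the split itself to work.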

However, when implementing this, we get the following error:

[MLENS] backend: threading
Traceback (most recent call last):
  File "mlens_kfol_cv.py", line 29, in <module>
    n_iter=10
  File "/Users/user/.python-virtualenvs/some_env/lib/python3.7/site-packages/mlens/model_selection/model_selection.py", line 492, in fit
    self._fit(X, y, job)
  File "/Users/user/.python-virtualenvs/some_env/lib/python3.7/site-packages/mlens/model_selection/model_selection.py", line 180, in _fit
    manager.process(self, job, X, y)
  File "/Users/user/.python-virtualenvs/some_env/lib/python3.7/site-packages/mlens/parallel/backend.py", line 855, in process
    caller.indexer.fit(self.job.predict_in, self.job.targets, self.job.job)
  File "/Users/user/.python-virtualenvs/some_env/lib/python3.7/site-packages/mlens/index/fold.py", line 147, in fit
    check_full_index(n, self.folds, self.raise_on_exception)
  File "/Users/user/.python-virtualenvs/some_env/lib/python3.7/site-packages/mlens/index/_checks.py", line 19, in check_full_index
    "type(%s) was passed." % type(folds))
ValueError: 'folds' must be an integer. type(<class 'sklearn.model_selection._split.KFold'>) was passed.

The error seems to contradict the documentation of the Evaluator class.
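To illustrate why we read this as a contradiction, below is a hypothetical stand-in for the kind of check the traceback points at. This is not the mlens source; the function name check_folds_is_int and the DummySplitter class are our own inventions. It only sketches how an integer-only type check rejects any splitter object, regardless of whether that object implements split().

```python
class DummySplitter:
    """Hypothetical placeholder for a scikit-learn style splitter object."""

    def split(self, X, y=None):
        # A single trivial fold: all indices as train, none as test.
        yield list(range(len(X))), []


def check_folds_is_int(folds):
    """Assumed behaviour: accept only integers, reject everything else."""
    if not isinstance(folds, int):
        raise ValueError(
            "'folds' must be an integer. type(%s) was passed." % type(folds)
        )
    return folds


# An integer passes; a splitter object is rejected before split() is ever used.
accepted = check_folds_is_int(5)
try:
    check_folds_is_int(DummySplitter())
    message = None
except ValueError as exc:
    message = str(exc)

print(accepted)
print(message)
```

If the real check works along these lines, a splitter instance passed as cv can never reach the point where its split() method would be called.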

The error can be reproduced with the following (dummy) code:

import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import mean_absolute_error
from mlens.model_selection import Evaluator
from mlens.metrics import make_scorer
from sklearn.linear_model import Lasso
from scipy.stats import uniform

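# Lower MAE is better, so tell the scorer not to maximize it.
scorer = make_scorer(mean_absolute_error, greater_is_better=False)
estimators = [('lasso', Lasso())]
# Sample alpha from a uniform distribution (loc=1e-6, scale=1e-5).
param_dicts = {
    'lasso': {'alpha': uniform(1e-6, 1e-5)},
}

# Dummy data: 10 samples with a single feature.
x_train = np.random.rand(10, 1)
y_train = np.random.rand(10)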

evl = Evaluator(
    scorer,
    cv=StratifiedKFold(),
    verbose=5,
)
evl.fit(
    x_train, y_train,
    estimators=estimators,
    param_dicts=param_dicts,
    n_iter=10
)

We're using Python 3.7.6 with the following library versions:

mlens==0.2.3
scikit-learn==0.22.1
numpy==1.18.1
scipy==1.4.1

Do you have any insight into how to solve this?

flennerhag / mlens, issue #135