intel / scikit-learn-intelex

Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application
https://intel.github.io/scikit-learn-intelex/
Apache License 2.0

Crashing when running sklearnex in a GridSearchCV fit of a SVR model with TransformedTargetRegressor #1027

Open joseDorronsoro opened 2 years ago

joseDorronsoro commented 2 years ago

Describe the bug

Fitting a GridSearchCV object that wraps an SVR model inside a TransformedTargetRegressor crashes.
There is no problem, however, when fitting a GridSearchCV object with either a plain SVR model or a pipeline (scaler, SVR).

To Reproduce

  1. Setup:

Several setups were tried, all with the same result. One example:

python 3.8.13 h6244533_0
scikit-learn 0.24.2 py38hf11a4ad_2
scikit-learn-intelex 2021.5.0 py38haa95532_0

Other versions tried are scikit-learn 1.0.1, scikit-learn-intelex 2021.5.0.
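When comparing setups like the ones listed above, it can help to print the installed versions from inside the interpreter that actually runs the script. The following is a minimal sketch using only the standard library; the helper name `report_versions` is hypothetical and not part of the original report, and the distribution names are taken from the listing above.

```python
from importlib import metadata  # standard library, Python 3.8+
import platform


def report_versions(packages=("scikit-learn", "scikit-learn-intelex")):
    """Return {name: version or None} for Python and the given distributions."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output into a report makes it clear which scikit-learn and scikit-learn-intelex builds were active when the crash occurred.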

  2. Call

Minimal code to reproduce the crash:

`from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import cross_val_predict, KFold, GridSearchCV
from sklearn.svm import SVR

from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import load_boston

boston = load_boston()
x, y = boston['data'], boston['target']

n_folds = 2
kf = KFold(n_folds, shuffle=True, random_state=1)

l_C, l_gamma, l_epsilon = [1.], [1. / 200.], [0.5]

print(40 * '_', 'transf targ regr + svr + gridsearch cv fit')

svr = SVR()
y_transformer = MinMaxScaler()
inner_estimator = TransformedTargetRegressor(regressor=svr,
                                            transformer=y_transformer)

param_grid = {'regressor__C': l_C,
              'regressor__gamma': l_gamma,
              'regressor__epsilon': l_epsilon}

cv_estimator = GridSearchCV(inner_estimator, 
                            param_grid=param_grid, 
                            cv=kf, 
                            #refit=False,
                            n_jobs=1,
                            #return_train_score=True
                            )

cv_estimator.fit(x, y)`

Expected behavior

Run with plain python, the script finishes fitting the GridSearchCV object without error. Run with python -m sklearnex, it crashes.
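Running a script with `python -m sklearnex` patches scikit-learn for the whole run. The same patching can also be requested from inside the script with `patch_sklearn()`, which should be called before the scikit-learn estimators are imported. The sketch below is not from the original report: the helper `enable_intelex` is a hypothetical name, and it simply leaves stock scikit-learn in place when the extension is not installed.

```python
import importlib.util


def enable_intelex():
    """Patch scikit-learn with the Intel extension if it is installed.

    Returns True when patching was requested, False when sklearnex is absent.
    """
    if importlib.util.find_spec("sklearnex") is None:
        return False
    from sklearnex import patch_sklearn
    patch_sklearn()  # call before importing scikit-learn estimators
    return True


patched = enable_intelex()
print("sklearnex patching enabled:", patched)
# Import scikit-learn estimators only after this point, e.g.
# from sklearn.svm import SVR
```

Toggling the patch this way makes it easier to run the same reproduction script with and without the extension and compare the two outcomes.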

First a warning is issued:

`miniconda3/lib/python3.8/site-packages/sklearn/model_selection/_search.py:969: UserWarning: One or more of the test scores are non-finite: [nan]`

followed by a ValueError

`ValueError: Input model support vectors are empty`

It appears the scores cannot be computed because the model fit did not produce any support vectors.
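One possible reading of this, offered as an assumption that the thread does not confirm: TransformedTargetRegressor with MinMaxScaler rescales the training targets into [0, 1], and the grid uses epsilon = 0.5. A constant prediction of 0.5 is then within epsilon of every scaled training target, so an epsilon-insensitive fit can reach zero loss without needing any support vectors. The standard-library sketch below only checks that arithmetic on made-up target values; it does not call scikit-learn or the extension, and the helper names are hypothetical.

```python
def min_max_scale(values):
    """Rescale values linearly to [0, 1], as MinMaxScaler does by default."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def points_outside_tube(targets, prediction, epsilon):
    """Count targets farther than epsilon from a constant prediction."""
    return sum(1 for t in targets if abs(t - prediction) > epsilon)


# Hypothetical target values, standing in for one training fold.
y_train = [5.0, 13.8, 21.2, 24.0, 36.2, 50.0]
y_scaled = min_max_scale(y_train)

epsilon = 0.5  # value used in the report's parameter grid
outside = points_outside_tube(y_scaled, prediction=0.5, epsilon=epsilon)
print("targets outside the epsilon tube:", outside)
```

If this reading is right, a smaller epsilon (for example 0.1) would leave some scaled targets outside the tube and force the fit to keep support vectors, which would be a quick way to test the hypothesis.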

Output/Screenshots

warnings.warn(
/home/jdorrons/miniconda3/lib/python3.8/site-packages/sklearn/model_selection/_validation.py:770: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:

Traceback (most recent call last):

File "/home/jdorrons/miniconda3/lib/python3.8/site-packages/sklearn/model_selection/_validation.py", line 761, in _score
scores = scorer(estimator, X_test, y_test)

File "/home/jdorrons/miniconda3/lib/python3.8/site-packages/sklearn/metrics/_scorer.py", line 418, in _passthrough_scorer
return estimator.score(*args, **kwargs)

File "/home/jdorrons/miniconda3/lib/python3.8/site-packages/sklearn/base.py", line 705, in score
y_pred = self.predict(X)

File "/home/jdorrons/miniconda3/lib/python3.8/site-packages/sklearn/compose/target.py", line 274, in predict
pred = self.regressor.predict(X, **predict_params)

File "/home/jdorrons/miniconda3/lib/python3.8/site-packages/sklearnex/_device_offload.py", line 176, in wrapper
result = func(self, *args, **kwargs)

File "/home/jdorrons/miniconda3/lib/python3.8/site-packages/sklearnex/svm/svr.py", line 46, in predict
return dispatch(self, 'svm.SVR.predict', {

File "/home/jdorrons/miniconda3/lib/python3.8/site-packages/sklearnex/_device_offload.py", line 153, in dispatch
return branches[backend](obj, *hostargs, **hostkwargs, queue=q)

File "/home/jdorrons/miniconda3/lib/python3.8/site-packages/sklearnex/svm/svr.py", line 79, in _onedal_predict
return self._onedal_estimator.predict(X, queue=queue)

File "/home/jdorrons/miniconda3/lib/python3.8/site-packages/onedal/svm/svm.py", line 354, in predict
y = super()._predict(X, _backend.svm.regression, queue)

File "/home/jdorrons/miniconda3/lib/python3.8/site-packages/onedal/svm/svm.py", line 275, in _predict
result = module.infer(policy, params, model, to_table(X))

ValueError: Input model support vectors are empty

Environment:

Checked on Windows 11 and Ubuntu.

dguijo commented 2 years ago

Hey! I've opened an issue about a similar problem with GridSearchCV and SVR: #1046. I believe the crash comes from that combination, so it may not be a problem with TransformedTargetRegressor; at least, that is what my findings suggest. Thanks both in advance!

Best, David.