dask / dask-ml

Scalable Machine Learning with Dask
http://ml.dask.org
BSD 3-Clause "New" or "Revised" License

Error with dask-searchcv GridSearchCV #539

Closed FritzPeleke closed 5 years ago

FritzPeleke commented 5 years ago

Hello, when I try to run GridSearchCV with sklearn on my laptop, I get a parallelism error for n_jobs greater than 1. So I decided to turn to dask-searchcv, but I can't get best_params_; I get a new error instead. Below are my error and the code.

Error: ImportError: cannot import name 'DeprecationDict' from 'sklearn.utils.deprecation'

code:

    import numpy as np
    import pandas as pd
    import matplotlib
    import matplotlib.pyplot as plt
    from sklearn.datasets import fetch_openml
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import cross_val_predict, cross_val_score
    # from sklearn.model_selection import GridSearchCV
    from dask_searchcv import GridSearchCV

    np.random.seed(42)
    mnist_data = fetch_openml('mnist_784')
    data = mnist_data['data'].astype(np.int64)
    label = mnist_data['target'].astype(np.int64)

    # creating test data and shuffling data
    x_train = data[:60000]
    x_test = data[60000:]
    y_train = label[:60000]
    y_test = label[60000:]  # test labels come from label, not data
    shuffler = np.random.permutation(len(x_train))
    x_train, y_train = x_train[shuffler], y_train[shuffler]
    knn_clf = KNeighborsClassifier()
    classifier = knn_clf.fit(x_train, y_train)
    param = {'n_neighbors': [3, 4, 5], 'weights': ['uniform', 'distance']}

    # using dask-searchcv
    search = GridSearchCV(knn_clf, param, cv=3, n_jobs=-1)
    search.fit(x_train, y_train)
    print(search.best_params_)

TomAugspurger commented 5 years ago

What version of dask-ml?

FritzPeleke commented 5 years ago

version 1.0.0
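
For reference, installed versions can be read from Python rather than guessed. The following is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is hypothetical and not part of dask-ml.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the packages relevant to this issue (names as published on PyPI).
for package in ("dask-ml", "dask-searchcv", "scikit-learn"):
    print(package, installed_version(package) or "not installed")
```

Listing dask-searchcv alongside dask-ml makes it visible when both are installed at once.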

TomAugspurger commented 5 years ago

What's the full traceback?

Your error message doesn't make sense for dask-ml 1.0.0. We define it in https://github.com/dask/dask-ml/blob/0db2e5fbf2dfab1a3bd8f924d002b7bae96f1e76/dask_ml/model_selection/utils.py#L103
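
A full traceback can be captured as text with the standard library's `traceback` module, which makes it easy to paste into an issue. This is only a sketch: `capture_traceback` and `failing_import` are hypothetical helpers, and the failing import merely imitates the shape of the reported ImportError.

```python
import traceback


def capture_traceback(func, *args, **kwargs):
    """Call func; return the formatted traceback string if it raises, else None."""
    try:
        func(*args, **kwargs)
    except Exception:
        return traceback.format_exc()
    return None


def failing_import():
    # Imitates the reported failure: importing a name that does not exist.
    from os import surely_missing_name  # noqa: F401


report = capture_traceback(failing_import)
print(report)
```

The traceback shows which module triggered the failing import, which is what pins the error to a specific package.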

FritzPeleke commented 5 years ago

Hey @TomAugspurger, I found the issue. When I import GridSearchCV from dask-ml, it works fine. The error came from importing it from dask_searchcv, which no longer seems to work. I took that import out, and the code produced an output. Thanks :)

TomAugspurger commented 5 years ago

dask-searchcv was folded into dask-ml a while ago.
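
In practice this means importing `GridSearchCV` from `dask_ml.model_selection` instead of `dask_searchcv`. The sketch below shows one way to pick a provider defensively. The helper `resolve_grid_search` and the fallback to scikit-learn are assumptions made for illustration, not something dask-ml prescribes.

```python
import importlib


def resolve_grid_search(candidates=("dask_ml.model_selection",
                                    "sklearn.model_selection")):
    """Return (module_name, GridSearchCV class) from the first usable candidate.

    Returns (None, None) when no candidate module can be imported or none
    of them exposes a GridSearchCV attribute.
    """
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Covers missing packages as well as broken ones, such as the
            # "cannot import name 'DeprecationDict'" failure reported above.
            continue
        cls = getattr(module, "GridSearchCV", None)
        if cls is not None:
            return name, cls
    return None, None


name, GridSearchCV = resolve_grid_search()
print(name or "no GridSearchCV provider installed")
```

With dask-ml installed, the equivalent one-line import is `from dask_ml.model_selection import GridSearchCV`.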


FritzPeleke commented 5 years ago

I had no idea. Sorry for the trouble.

TomAugspurger commented 5 years ago

No worries.