ageron / handson-ml

⛔️ DEPRECATED – See https://github.com/ageron/handson-ml3 instead.
Apache License 2.0
25.12k stars · 12.91k forks

Chapter 3 : Exercise 1 - MNIST Classifier with 97% accuracy - Could not pickle the task to send it to the workers. #658

Open a-d14 opened 2 years ago

a-d14 commented 2 years ago

Ran this code for the first exercise (note: KNeighborsClassifier must be imported) -

from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

param_grid = [{'weights': ["uniform", "distance"], 'n_neighbors': [3, 4, 5]}]
knn_clf = KNeighborsClassifier()
grid_search = GridSearchCV(knn_clf, param_grid, cv=5, verbose=3, n_jobs=-1)
grid_search.fit(X_train, y_train)

I got this error - Could not pickle the task to send it to the workers.

How do I resolve this?

rafayqayyum commented 1 year ago

I was unable to reproduce this error. Try updating scikit-learn and joblib: pip install --upgrade scikit-learn joblib

or, if you have conda installed: conda update scikit-learn joblib
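Before upgrading, it can help to confirm which versions are actually installed in the environment that runs the notebook (old joblib versions had pickling bugs in the loky backend). A minimal check, assuming a standard Python 3.8+ install:

```python
# Print the installed versions of scikit-learn and joblib.
# If either is old, upgrade it as described above.
from importlib.metadata import version, PackageNotFoundError

for pkg in ("scikit-learn", "joblib"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")
```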

GeoSegun commented 1 year ago

This error usually occurs when you are trying to pass an object to the workers for parallel processing that cannot be pickled. Here, it seems like the KNeighborsClassifier() object is the one causing the issue.

One solution is to define a function that creates and returns the KNeighborsClassifier() object, and then pass this function to GridSearchCV() instead of the object itself.

from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

def create_knn():
    return KNeighborsClassifier()

param_grid = [{'weights': ["uniform", "distance"], 'n_neighbors': [3, 4, 5]}]

knn_clf = create_knn()

grid_search = GridSearchCV(knn_clf, param_grid, cv=5, verbose=3, n_jobs=-1)

grid_search.fit(X_train, y_train)

This should solve the error you are facing.
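For reference, here is a fully self-contained version of the same search that runs cleanly on a healthy install. It uses scikit-learn's bundled digits dataset as a small stand-in for MNIST (an assumption for illustration; the original exercise uses the full MNIST data):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Small 8x8 digits dataset: quick to fit, same API as MNIST.
X_train, y_train = load_digits(return_X_y=True)

param_grid = [{"weights": ["uniform", "distance"], "n_neighbors": [3, 4, 5]}]

grid_search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, n_jobs=-1)
grid_search.fit(X_train, y_train)
print(grid_search.best_params_)
```

If this snippet works but the original code fails, the problem is environment- or data-size-related rather than a bug in the code itself.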

maciejskorski commented 1 year ago

There is nothing wrong with this code on small data.

This error also occurs as a manifestation of running out of space: joblib writes the workers' data to temporary memmapped files, and the error appears when the device backing those files cannot accommodate the data for all workers. The traceback then looks as follows:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/joblib/externals/loky/backend/queues.py", line 125, in _feed
    obj_ = dumps(obj, reducers=reducers)
  File "/opt/conda/lib/python3.10/site-packages/joblib/externals/loky/backend/reduction.py", line 211, in dumps
    dump(obj, buf, reducers=reducers, protocol=protocol)
  File "/opt/conda/lib/python3.10/site-packages/joblib/externals/loky/backend/reduction.py", line 204, in dump
    _LokyPickler(file, reducers=reducers, protocol=protocol).dump(obj)
  File "/opt/conda/lib/python3.10/site-packages/joblib/externals/cloudpickle/cloudpickle_fast.py", line 632, in dump
    return Pickler.dump(self, obj)
  File "/opt/conda/lib/python3.10/site-packages/joblib/_memmapping_reducer.py", line 446, in __call__
    for dumped_filename in dump(a, filename):
  File "/opt/conda/lib/python3.10/site-packages/joblib/numpy_pickle.py", line 553, in dump
    NumpyPickler(f, protocol=protocol).dump(value)
  File "/opt/conda/lib/python3.10/pickle.py", line 487, in dump
    self.save(obj)
  File "/opt/conda/lib/python3.10/site-packages/joblib/numpy_pickle.py", line 352, in save
    wrapper.write_array(obj, self)
  File "/opt/conda/lib/python3.10/site-packages/joblib/numpy_pickle.py", line 134, in write_array
    pickler.file_handle.write(chunk.tobytes('C'))
OSError: [Errno 28] No space left on device
"""

The above exception was the direct cause of the following exception:

PicklingError                             Traceback (most recent call last)
...
PicklingError: Could not pickle the task to send it to the workers
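If lack of space is the cause, the usual mitigations are reducing the number of workers, tightening GridSearchCV's pre_dispatch parameter so fewer task batches are materialized at once, and pointing joblib's temp folder (the JOBLIB_TEMP_FOLDER environment variable) at a volume with free space. A sketch, using the digits dataset for illustration and "/tmp" as an assumed location with free space:

```python
import os

# Redirect joblib's memmapping temp files to a volume with free space.
# Must be set before the parallel run starts.
os.environ["JOBLIB_TEMP_FOLDER"] = "/tmp"

from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X_train, y_train = load_digits(return_X_y=True)
param_grid = [{"weights": ["uniform", "distance"], "n_neighbors": [3, 4, 5]}]

# Fewer workers and a bounded pre_dispatch limit how much data
# is pickled and held for the workers at any one time.
grid_search = GridSearchCV(
    KNeighborsClassifier(), param_grid,
    cv=5, n_jobs=2, pre_dispatch="2*n_jobs",
)
grid_search.fit(X_train, y_train)
print(grid_search.best_params_)
```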