EpistasisLab / tpot

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
http://epistasislab.github.io/tpot/
GNU Lesser General Public License v3.0

RuntimeError: The task could not be sent to the workers as it is too large for `send_bytes`. #788

Closed adrpino closed 6 years ago

adrpino commented 6 years ago

I am trying to run the optimization pipeline on a dataset with 80k rows and 4k columns, and calling .fit() fails with the error below.

Process to reproduce the issue

Calling .fit() with the big dataset.
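
A hypothetical minimal reproduction (the original call is not shown in the issue; the array shapes and TPOT settings below are assumptions based on the description above):

import numpy as np
from tpot import TPOTClassifier

# stand-in for the ~80k x 4k dataset described above (~2.5 GB as float64)
X = np.random.rand(80_000, 4_000)
y = np.random.randint(0, 2, size=80_000)

tpot = TPOTClassifier(generations=5, population_size=20, n_jobs=-1, verbosity=2)
tpot.fit(X, y)  # with the default joblib/loky backend this raises the RuntimeError below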

Expected result

The optimization runs to completion and .fit() returns without errors.

Current result

.fit() crashes with the RuntimeError traceback shown below.

_RemoteTraceback                          Traceback (most recent call last)
_RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/adrian/miniconda3/envs/ml/lib/python3.6/site-packages/sklearn/externals/joblib/externals/loky/backend/queues.py", line 157, in _feed
    send_bytes(obj_)
  File "/home/adrian/miniconda3/envs/ml/lib/python3.6/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/adrian/miniconda3/envs/ml/lib/python3.6/multiprocessing/connection.py", line 393, in _send_bytes
    header = struct.pack("!i", n)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
"""

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
~/miniconda3/envs/ml/lib/python3.6/site-packages/tpot/base.py in fit(self, features, target, sample_weight, groups)
    660                     verbose=self.verbosity,
--> 661                     per_generation_function=self._check_periodic_pipeline
    662                 )

~/miniconda3/envs/ml/lib/python3.6/site-packages/tpot/gp_deap.py in eaMuPlusLambda(population, toolbox, mu, lambda_, cxpb, mutpb, ngen, pbar, stats, halloffame, verbose, per_generation_function)
    229 
--> 230     fitnesses = toolbox.evaluate(invalid_ind)
    231     for ind, fit in zip(invalid_ind, fitnesses):

~/miniconda3/envs/ml/lib/python3.6/site-packages/tpot/base.py in _evaluate_individuals(self, individuals, features, target, sample_weight, groups)
   1238                         delayed(partial_wrapped_cross_val_score)(sklearn_pipeline=sklearn_pipeline)
-> 1239                         for sklearn_pipeline in sklearn_pipeline_list[chunk_idx:chunk_idx + chunk_size])
   1240                     # update pbar

~/miniconda3/envs/ml/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in __call__(self, iterable)
    995             with self._backend.retrieval_context():
--> 996                 self.retrieve()
    997             # Make sure that we get a last message telling us we are done

~/miniconda3/envs/ml/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in retrieve(self)
    898                 if getattr(self._backend, 'supports_timeout', False):
--> 899                     self._output.extend(job.get(timeout=self.timeout))
    900                 else:

~/miniconda3/envs/ml/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py in wrap_future_result(future, timeout)
    516         try:
--> 517             return future.result(timeout=timeout)
    518         except LokyTimeoutError:

~/miniconda3/envs/ml/lib/python3.6/concurrent/futures/_base.py in result(self, timeout)
    431             elif self._state == FINISHED:
--> 432                 return self.__get_result()
    433             else:

~/miniconda3/envs/ml/lib/python3.6/concurrent/futures/_base.py in __get_result(self)
    383         if self._exception:
--> 384             raise self._exception
    385         else:

RuntimeError: The task could not be sent to the workers as it is too large for `send_bytes`.
weixuanfu commented 6 years ago

How about using dask? I think the dataset is too large for joblib.
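
For reference, a minimal sketch of what that could look like, assuming dask and distributed are installed and that this TPOT version exposes the use_dask flag (the cluster sizes and TPOT settings here are placeholders, and a small synthetic dataset stands in for the real one):

from dask.distributed import Client
from sklearn.datasets import make_classification
from tpot import TPOTClassifier

# small synthetic stand-in; the real data would be the 80k x 4k matrix
X, y = make_classification(n_samples=1000, n_features=50, random_state=0)

client = Client(n_workers=4, threads_per_worker=1)  # local Dask cluster

tpot = TPOTClassifier(
    generations=5,
    population_size=20,
    use_dask=True,   # route pipeline evaluation through Dask instead of joblib/loky
    verbosity=2,
    random_state=0,
)
tpot.fit(X, y)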

adrpino commented 6 years ago

Dask solves the issue, but it uses more and more RAM until it crashes.
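
One possible mitigation, not suggested in this thread, is to cap each Dask worker's memory when creating the local cluster via dask.distributed's memory_limit option, so workers spill or restart instead of exhausting the machine's RAM (the numbers below are placeholders):

from dask.distributed import Client

# limit each worker's memory; tune to the machine's actual RAM
client = Client(n_workers=2, threads_per_worker=1, memory_limit="8GB")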

shuttle1987 commented 5 years ago

See this: https://bugs.python.org/issue17560
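
That report describes the underlying limitation: multiprocessing's legacy wire format packs the message length as a signed 32-bit int, so any single pickled task over ~2 GiB overflows, which is exactly the struct.error in the traceback above. A minimal illustration:

import struct

n = 2**31  # one byte past the 2 GiB limit
try:
    struct.pack("!i", n)  # same call as in multiprocessing/connection.py _send_bytes
except struct.error as e:
    print(e)  # 'i' format requires -2147483648 <= number <= 2147483647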