sb-ai-lab / LightAutoML

Fast and customizable framework for automatic ML model creation (AutoML)
https://developers.sber.ru/portal/products/lightautoml
Apache License 2.0

Serialization-related exceptions when using cpu_limit > 1 #17

Closed: unconverged closed this issue 1 year ago

unconverged commented 2 years ago

🐛 Bug

Hi! I'm trying out LAMA with the simplest notebook, Tutorial_1_basics.ipynb, in Google Colab. It's a stock Colab environment without any pre-configuration.

The execution fails when I set cpu_limit to something greater than 1:

TabularAutoML(
    cpu_limit = 2,
)

with the following exception:

---------------------------------------------------------------------------
_RemoteTraceback                          Traceback (most recent call last)
_RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/joblib/externals/loky/process_executor.py", line 407, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "/usr/lib/python3.7/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'new_block' on <module 'pandas.core.internals.blocks' from '/usr/local/lib/python3.7/dist-packages/pandas/core/internals/blocks.py'>
"""

The above exception was the direct cause of the following exception:

BrokenProcessPool                         Traceback (most recent call last)
<timed exec> in <module>

/usr/local/lib/python3.7/dist-packages/lightautoml/automl/presets/tabular_presets.py in fit_predict(self, train_data, roles, train_features, cv_iter, valid_data, valid_features, log_file, verbose)
    547             data, _ = read_data(valid_data, valid_features, self.cpu_limit, self.read_csv_params)
    548 
--> 549         oof_pred = super().fit_predict(train, roles=roles, cv_iter=cv_iter, valid_data=valid_data, verbose=verbose)
    550 
    551         return cast(NumpyDataset, oof_pred)

10 frames
/usr/lib/python3.7/concurrent/futures/_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

If I set cpu_limit=1, this exception is not thrown and the execution completes successfully.
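The AttributeError in the remote traceback is the generic error pickle raises when a payload refers to a name that the receiving process's copy of a module does not define. The sketch below reproduces that mechanism with the standard library only. It is an illustration, not LightAutoML or pandas code: `fake_blocks` is a hypothetical stand-in module, and the only name borrowed from the traceback is `new_block`.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a library module; "fake_blocks" is not a real package.
fake_blocks = types.ModuleType("fake_blocks")
sys.modules["fake_blocks"] = fake_blocks


def new_block():
    return "block"


# Make the function look as if it were defined at the top level of fake_blocks.
new_block.__module__ = "fake_blocks"
new_block.__qualname__ = "new_block"
fake_blocks.new_block = new_block

# Functions are pickled by reference: only the module and attribute names are stored.
payload = pickle.dumps(new_block)

# Simulate a receiving process whose copy of the module lacks that attribute.
del fake_blocks.new_block

error = None
try:
    pickle.loads(payload)
except AttributeError as exc:
    error = exc

print(type(error).__name__, error)
```

With cpu_limit=1 the failing step is apparently never reached, which would fit the observation above that the error disappears, though the thread does not confirm where exactly the worker processes are created.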

To Reproduce

Steps to reproduce the behavior:

  1. Open Tutorial_1_basics.ipynb
  2. Run all of the cells needed to configure the task
  3. Run the cell with code oof_pred = automl.fit_predict(tr_data, roles = roles, verbose = 1)

Expected behavior

We expect the execution to complete without an error.

alexmryzhkov commented 1 year ago

Hi @IvaYan,

Could you please try running !pip install -U pandas after installing LightAutoML?

Alex

unconverged commented 1 year ago

Hi! This has indeed solved the problem, although it appears that nothing was actually installed.
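One possible explanation, which this thread does not confirm, is that the pandas version already imported in the notebook kernel differed from the version on disk that freshly started worker processes import. A minimal diagnostic sketch for checking this follows. The helper names `loaded_and_installed` and `is_mismatch` are hypothetical, and `importlib.metadata` requires Python 3.8 or newer, so it would not run unchanged on the Python 3.7 environment shown in the traceback.

```python
import sys
from importlib import metadata


def loaded_and_installed(dist_name, module_name=None):
    """Return (version imported in this process, version installed on disk).

    Either element is None when it cannot be determined. The module is only
    looked up in sys.modules, never imported, so the check does not change
    what the current process has loaded.
    """
    module = sys.modules.get(module_name or dist_name)
    loaded = getattr(module, "__version__", None)
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        installed = None
    return loaded, installed


def is_mismatch(loaded, installed):
    """True only when both versions are known and they differ."""
    return loaded is not None and installed is not None and loaded != installed


loaded, installed = loaded_and_installed("pandas")
print("loaded:", loaded, "installed:", installed, "mismatch:", is_mismatch(loaded, installed))
```

If the two versions differ, restarting the runtime after installation so that the kernel re-imports pandas from disk is a common remedy, assuming the mismatch is in fact the cause.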

alexmryzhkov commented 1 year ago

@IvaYan, sounds great.

We released a new version of LightAutoML (0.3.7) yesterday evening with several bug fixes and a new CV tutorial; it also fixes some dependency problems.

So can we close the issue for now, since everything works fine?

Alex