mars-project / mars

Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
https://mars-project.readthedocs.io
Apache License 2.0

[BUG] lightgbm TypeError: a bytes-like object is required, not 'list' #3348

Closed zhongchun closed 1 year ago

zhongchun commented 1 year ago

Describe the bug Calling lightgbm.LGBMClassifier.fit on a 3-node Mars cluster raises TypeError: a bytes-like object is required, not 'list'.

To Reproduce To help us reproduce this bug, please provide the information below:

  1. Your Python version: python 3.7.9
  2. The version of Mars you use: 0.10.0
  3. Versions of crucial packages, such as numpy, scipy and pandas: numpy 1.21.6, pandas 1.3.5, lightgbm 3.32
  4. Full stack of the error.
  5. Minimized code to reproduce the error.

Items 4 and 5 are as follows. I launched a 3-node Mars cluster with 1 supervisor and 3 workers. The supervisor and one worker run on the same node; the other 2 workers run on two separate nodes. Breast_cancer_data.csv is from https://www.kaggle.com/code/prashant111/lightgbm-classifier-in-python/input

In [1]: import pandas as pd
   ...: import mars.dataframe as md
   ...:
   ...: import mars
   ...: session = mars.new_session("http://6.0.55.34:9008")
   ...:
   ...: df = pd.read_csv("./Breast_cancer_data.csv")
   ...: mdf = md.DataFrame(data=df, chunk_size=300)
   ...: X = mdf[['mean_radius','mean_texture','mean_perimeter','mean_area','mean_smoothness']]
   ...: y = mdf['diagnosis']
   ...:
   ...: from mars.learn.contrib import lightgbm as lgb
   ...:
   ...: gbm = lgb.LGBMClassifier(importance_type='gain')
   ...: gbm.fit(X, y)
Metric is not initialized, please call `init_metrics()` before using metrics.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100.00/100 [00:00<00:00, 705.12it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100.00/100 [00:03<00:00, 28.08it/s]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-dfd6546be0ac> in <module>
     13
     14 gbm = lgb.LGBMClassifier(importance_type='gain')
---> 15 gbm.fit(X, y)

~/.pyenv/versions/3.7.9/envs/gmars379/lib/python3.7/site-packages/mars/learn/contrib/lightgbm/classifier.py in fit(self, X, y, sample_weight, init_score, eval_set, eval_sample_weight, eval_init_score, session, run_kwargs, **kwargs)
     53                 session=session,
     54                 run_kwargs=run_kwargs,
---> 55                 **kwargs
     56             )
     57

~/.pyenv/versions/3.7.9/envs/gmars379/lib/python3.7/site-packages/mars/learn/contrib/lightgbm/_train.py in train(params, train_set, eval_sets, **kwargs)
    454     ret = op().execute(session=session, **run_kwargs).fetch(session=session)
    455
--> 456     bst = pickle.loads(ret)
    457     evals_result.update(bst.evals_result_ or {})
    458     return bst

TypeError: a bytes-like object is required, not 'list'
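The traceback shows `pickle.loads(ret)` receiving a list instead of bytes. The sketch below reproduces that error message outside Mars and shows one possible defensive way to unpickle such a result. It rests on an assumption the traceback does not confirm: that on a multi-worker cluster the fetched result is a list of pickled payloads (possibly with empty entries). `loads_model` is a hypothetical helper, not part of Mars, and the dictionary is only a stand-in for a trained booster.

```python
import pickle


def loads_model(ret):
    """Hypothetical helper: unpickle `ret` whether it is bytes or a list of bytes."""
    if isinstance(ret, (bytes, bytearray, memoryview)):
        return pickle.loads(ret)
    if isinstance(ret, (list, tuple)):
        # Assumption: each non-empty element is one pickled payload,
        # and any one of them holds the trained model.
        payloads = [r for r in ret if r]
        if not payloads:
            raise ValueError("no pickled payload found in result")
        return pickle.loads(payloads[0])
    raise TypeError(f"unexpected result type: {type(ret).__name__}")


# Stand-in for the pickled booster returned by training.
payload = pickle.dumps({"model": "stand-in"})

# Passing a list straight to pickle.loads raises the TypeError from the report.
try:
    pickle.loads([payload])
except TypeError as exc:
    error_message = str(exc)

# The helper accepts both a single payload and a list of payloads.
restored = loads_model([payload, b""])
```

If the assumption holds, the same check could be applied to `ret` in `_train.py` before unpickling; whether taking the first non-empty payload is the right choice depends on what each worker actually returns.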
