ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Model_type: GBM- "ValueError: num_actors parameter set to 0 - Missing RayParams in lightgbm_ray call" #3812

Open rishabr-aizencorp opened 7 months ago

rishabr-aizencorp commented 7 months ago

While training a model with model_type gbm (LightGBM), I get the following error: "ValueError: The num_actors parameter is set to 0. Please always specify the number of distributed actors you want to use. FIX THIS by passing a RayParams(num_actors=X) argument to your call to lightgbm_ray."

Installation procedure: !pip install ludwig[tree,distributed]==0.8.2

Text file with the complete error message: light_gbm_error.txt

I was able to train the model with the local backend, but training failed with the above error on a Ray cluster. Below is the model config used for training:

    "dl_config": {
        "input_features": [
            { "name": "I1", "type": "number" },
            { "name": "I2", "type": "number" },
            { "name": "I3", "type": "number" },
            { "name": "C1", "type": "category" },
            { "name": "C2", "type": "category" },
            { "name": "C3", "type": "category" },
            { "name": "C4", "type": "category" },
            { "name": "C5", "type": "category" }
        ],
        "output_features": [
            { "name": "Label", "type": "binary" }
        ],
        "model_type": "gbm",
        "trainer": {
            "num_boost_round": 150,
            "early_stop": 30,
            "learning_rate": 0.001,
            "boosting_type": "gbdt",
            "num_leaves": 82
        }
    }

rishabr-aizencorp commented 7 months ago

We were able to start the training after setting "num_workers": 2 in the backend section, as shown below:

    "backend": {"type": "ray", "trainer": {"num_workers": 2, "resources_per_worker": {"CPU": 1}}}
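For anyone hitting the same error, here is a minimal sketch of that workaround as a Python dict. The helper name with_ray_backend and its default values are hypothetical and used only for illustration; the backend keys are the ones quoted in this comment. It only builds the config and does not import Ludwig itself.

```python
import copy
import json


def with_ray_backend(config, num_workers=2, cpus_per_worker=1):
    """Return a copy of a Ludwig-style config dict with an explicit Ray backend.

    Hypothetical helper for illustration: it sets the same keys quoted in
    this thread so that the number of workers is never left unset.
    """
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    updated["backend"] = {
        "type": "ray",
        "trainer": {
            "num_workers": num_workers,
            "resources_per_worker": {"CPU": cpus_per_worker},
        },
    }
    return updated


# Trimmed version of the config from the original report.
base_config = {
    "input_features": [
        {"name": "I1", "type": "number"},
        {"name": "C1", "type": "category"},
    ],
    "output_features": [{"name": "Label", "type": "binary"}],
    "model_type": "gbm",
    "trainer": {"num_boost_round": 150, "learning_rate": 0.001},
}

ray_config = with_ray_backend(base_config)
print(json.dumps(ray_config["backend"], indent=2))
```

The resulting dict should be usable wherever the original config was passed, for example as the config argument of LudwigModel, though that has not been verified against 0.8.2 here.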

However, during prediction with ludwig[distributed,tree], we get the following exception:

    (ServeReplica:recomd.recomd_model_nw_1_serve:prediction pid=1283) from sklearn.ensemble import (
    (ServeReplica:recomd.recomd_model_nw_1_serve:prediction pid=1283) ImportError: cannot import name 'HistGradientBoostingClassifier' from 'sklearn.ensemble' (/opt/foresight_venv/lib/python3.8/site-packages/sklearn/ensemble/__init__.py)

The complete error message - lightgbm_prediction_error.txt

The requirements for ludwig[distributed,tree]==0.8.2 - ludwig_distributed_tree.txt

arnavgarg1 commented 6 months ago

Hi @rishabr-aizencorp! Thanks for flagging the issue and for the requirements.txt file. It does seem peculiar that it raises an ImportError, given that:

  1. You are on scikit-learn 0.23.2
  2. HistGradientBoostingClassifier is supported as of scikit-learn 0.21
  3. Here are the docs in scikit-learn 0.23.2 that should logically support this import: https://scikit-learn.org/0.23/modules/generated/sklearn.ensemble.HistGradientBoostingClassifier.html#sklearn.ensemble.HistGradientBoostingClassifier

I think we may need a few days to reproduce the issue and come back with a solution.
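One possible explanation, not confirmed in this thread: in scikit-learn releases before 1.0, HistGradientBoostingClassifier was experimental and could only be imported from sklearn.ensemble after first importing sklearn.experimental.enable_hist_gradient_boosting. The sketch below is a diagnostic to run inside the serving environment. It reports which scikit-learn is actually loaded and whether the direct import works; the function name and the status strings are hypothetical.

```python
import importlib


def check_hist_gradient_boosting():
    """Report how (or whether) HistGradientBoostingClassifier can be imported."""
    try:
        sklearn = importlib.import_module("sklearn")
    except ImportError:
        return {"status": "sklearn-missing", "version": None, "path": None}

    # Version and install path show which environment the replica really uses.
    info = {"version": sklearn.__version__, "path": sklearn.__file__}

    try:
        from sklearn.ensemble import HistGradientBoostingClassifier  # noqa: F401
        info["status"] = "direct-import-ok"
        return info
    except ImportError:
        pass

    try:
        # Older releases expose the estimator only after this opt-in import.
        importlib.import_module(
            "sklearn.experimental.enable_hist_gradient_boosting"
        )
        from sklearn.ensemble import HistGradientBoostingClassifier  # noqa: F401
        info["status"] = "needs-experimental-flag"
    except ImportError:
        info["status"] = "unavailable"
    return info


result = check_hist_gradient_boosting()
print(result)
```

If the status comes back as needs-experimental-flag, the installed scikit-learn is likely older than the code performing the import expects, and upgrading scikit-learn in the serving environment would be the thing to try.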