AxeldeRomblay / MLBox

MLBox is a powerful Automated Machine Learning python library.
https://mlbox.readthedocs.io/en/latest/

XGBoost is not #93

Closed edanweis closed 4 years ago

edanweis commented 4 years ago

MLBox throws an error when I use XGBoost as the estimator for a regression task:

ValueError: Pipeline cannot be set with these parameters. Check the name of your stages.

I have followed several tutorials that use XGBoost, without success. LightGBM, however, works fine. I have tried removing different hyperparameters from the search space, but I still get the error.

Is there an up-to-date params dict I can use as a guide?

Operating System: Windows 10, Ubuntu 18.04

Local Setup: MLBox 0.8.2, Anaconda Python 3.7, XGBoost 0.90, Scikit-learn 0.21.3

Example

import pandas as pd
from sklearn.model_selection import train_test_split
import xgboost as xgb
import mlbox as mlb  # the calls below refer to the package as `mlb`

data=mlb.preprocessing.Reader(sep=",").train_test_split(["data.csv","data_test.csv"],'independent_variable')
data=mlb.preprocessing.Drift_thresholder().fit_transform(data)
space_xgb = {
'ne__numerical_strategy'    :{"search":"choice",
                              "space":[0,'mean','median','most_frequent']},
'ne__categorical_strategy'  :{"search":"choice",
                              "space":["None"]},
'ce__strategy'              :{"search":"choice",
                              "space":['label_encoding','entity_embedding','dummification']},
'fs__strategy'              :{"search":"choice",
                              "space":['l1','variance','rf_feature_importance']},
'fs__threshold'             :{"search":"uniform",
                              "space":[0.01,0.6]},
'est__strategy'             :{"search":"choice",
                              "space":["XGBoost"]},
'est__max_depth'            :{"search":"choice",
                              "space":[3,4,5,6,7]},
'est__learning_rate'        :{"search":"uniform",
                              "space":[0.01,0.1]},
'est__subsample'            :{"search":"uniform",
                              "space":[0.4,0.9]},
'est__reg_alpha'            :{"search":"uniform",
                              "space":[0,10]},
'est__reg_lambda'           :{"search":"uniform",
                              "space":[0,10]},
'est__n_estimators'         :{"search":"choice",
                              "space":[1000,1250,1500]}
}

best_xgb=mlb.optimisation.Optimiser(scoring="r2",n_folds=5).optimise(space_xgb,data)
mlb.prediction.Predictor().fit_predict(best_xgb,data)

AxeldeRomblay commented 4 years ago

Hello @edanweis, XGBoost is no longer supported in MLBox. It was too complicated to install, so I decided to drop it! You can use LightGBM (the default choice) instead.
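
For anyone landing here with the same error, here is a minimal sketch of switching the search space from the issue over to LightGBM. The helper name `use_lightgbm` is hypothetical and not part of MLBox, and the space is shortened to a few keys. It assumes the `est__` hyperparameters shown (`max_depth`, `learning_rate`, `subsample`, `n_estimators`) are forwarded to the LightGBM estimator unchanged, so check the MLBox docs for your version before relying on it.

```python
import copy


def use_lightgbm(space):
    """Return a copy of an MLBox search space with the estimator set to LightGBM.

    Hypothetical helper for illustration; the input dict is left unmodified.
    """
    new_space = copy.deepcopy(space)
    new_space["est__strategy"] = {"search": "choice", "space": ["LightGBM"]}
    return new_space


# Shortened version of the search space from the original post.
space_xgb = {
    "est__strategy": {"search": "choice", "space": ["XGBoost"]},
    "est__max_depth": {"search": "choice", "space": [3, 4, 5, 6, 7]},
    "est__learning_rate": {"search": "uniform", "space": [0.01, 0.1]},
    "est__subsample": {"search": "uniform", "space": [0.4, 0.9]},
    "est__n_estimators": {"search": "choice", "space": [1000, 1250, 1500]},
}

space_lgb = use_lightgbm(space_xgb)
print(space_lgb["est__strategy"])
```

The resulting `space_lgb` can then be passed to `Optimiser(...).optimise(space_lgb, data)` in place of `space_xgb`, exactly as in the original example.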