ClimbsRocks / auto_ml

[UNMAINTAINED] Automated machine learning for analytics & production
http://auto-ml.readthedocs.io
MIT License

Choosing best model from multiple models #318

Closed (samching closed this issue 7 years ago)

samching commented 7 years ago

A silly question, but it seems to me that auto_ml trains only the relevant GradientBoosting model by default. For it to train and choose from multiple models, do we just specify the models we want it to try in the model_names param?

Thanks

From the API:

model_names (list of strings) – [default- relevant ‘GradientBoosting’] Which model(s) to try. Includes many scikit-learn models, deep learning with Keras/TensorFlow, and Microsoft’s LightGBM. Currently available options from scikit-learn are [‘ARDRegression’, ‘AdaBoostClassifier’, ‘AdaBoostRegressor’, ‘BayesianRidge’, ‘ElasticNet’, ‘ExtraTreesClassifier’, ‘ExtraTreesRegressor’, ‘GradientBoostingClassifier’, ‘GradientBoostingRegressor’, ‘Lasso’, ‘LassoLars’, ‘LinearRegression’, ‘LogisticRegression’, ‘MiniBatchKMeans’, ‘OrthogonalMatchingPursuit’, ‘PassiveAggressiveClassifier’, ‘PassiveAggressiveRegressor’, ‘Perceptron’, ‘RANSACRegressor’, ‘RandomForestClassifier’, ‘RandomForestRegressor’, ‘Ridge’, ‘RidgeClassifier’, ‘SGDClassifier’, ‘SGDRegressor’]. If you have installed XGBoost, LightGBM, or Keras, you can also include [‘DeepLearningClassifier’, ‘DeepLearningRegressor’, ‘LGBMClassifier’, ‘LGBMRegressor’, ‘XGBClassifier’, ‘XGBRegressor’]. By default we choose scikit-learn’s ‘GradientBoostingRegressor’ or ‘GradientBoostingClassifier’, or if XGBoost is installed, ‘XGBRegressor’ or ‘XGBClassifier’.

ClimbsRocks commented 7 years ago

yep, that's exactly what model_names is for! let me know if any of that is unclear.

so, if you want to compare a bunch of models, you can do something like this:

ml_predictor.train(data, model_names=['LGBMRegressor', 'XGBRegressor', 'DeepLearningRegressor', 'LinearRegression', 'RandomForestRegressor'])

we'll then run cross-validation to choose the best model.
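for context, here's a fuller sketch of how that call fits into a typical workflow. this is a minimal, illustrative setup (the DataFrame and column names are made up, and it assumes LightGBM and XGBoost are installed so those model names are available):

import pandas as pd
from auto_ml import Predictor

# hypothetical training data with a numeric target column named 'price'
df_train = pd.DataFrame({
    'sqft': [850, 1200, 1500, 2000, 2400],
    'bedrooms': [2, 3, 3, 4, 4],
    'price': [200000, 310000, 340000, 450000, 500000]
})

# tell auto_ml which column is the output to predict
column_descriptions = {'price': 'output'}

ml_predictor = Predictor(type_of_estimator='regressor',
                         column_descriptions=column_descriptions)

# each listed model is trained and cross-validated; the best one is kept
ml_predictor.train(df_train,
                   model_names=['LGBMRegressor', 'XGBRegressor',
                                'RandomForestRegressor', 'LinearRegression'])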

in general, auto_ml tries to do a lot of this for you: it minimizes the amount of setting-tweaking you have to do by default, but still lets you customize most things by passing in a few more params. for instance, training_params={'learning_rate': 0.2, 'n_estimators': 500} will set those as parameters for the model being trained.
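for example, something like this (assuming XGBoost is installed; the hyperparameter values here are just illustrative):

# pass model hyperparameters through to the underlying estimator
ml_predictor.train(data,
                   model_names=['XGBRegressor'],
                   training_params={'learning_rate': 0.2, 'n_estimators': 500})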

let me know if you have any other questions as you use this more! i'm going to close this issue, but please keep opening new ones with more questions!