Closed tmontana closed 4 years ago
By default, mode=Explain is used, which probably overwrites the tuning_mode, but it shouldn't: it should only set a value if the user hasn't provided one. So it is a bug introduced during refactoring. For now, please set mode=Perform, which should give you about 10 models.
You mean changing tuning_mode? I tried, but it still only runs 1 model. I was able to get more models with this:
automl.set_params(start_random_models=5, hill_climbing_steps=3, top_models_to_improve=3)
Please try:

    model_types = ["Xgboost"]
    automl = AutoML(
        results_path="experiment_name",
        tuning_mode="Normal",
        total_time_limit=600 * 10,
        model_time_limit=600,
        algorithms=model_types,
        train_ensemble=True,
        explain_level=0,
        stack_models=False,
        validation_strategy={
            "validation_type": "kfold",
            "k_folds": 3,
            "shuffle": False,
            "stratify": True,
        },
        mode="Perform",
    )
    automl.fit(X, y)
I've added the mode parameter. This setup will also try to create new features for you (golden features) and will do feature selection. To disable golden features and feature selection, please add golden_features=False, features_selection=False. But please give them a try: if you have version 0.7.1 installed you should get a nice printout of your golden features. I hope you will be excited about this feature!
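For reference, both flags could be added to the earlier setup like this. This is only a sketch: the golden_features and features_selection parameter names are taken from the discussion above, so verify them against your installed version's signature before relying on them.

```python
# Keyword arguments extending the earlier AutoML setup.
# golden_features / features_selection names come from this thread;
# double-check them against your installed mljar-supervised version.
automl_kwargs = {
    "results_path": "experiment_name",
    "mode": "Perform",
    "algorithms": ["Xgboost"],
    "golden_features": False,      # disable golden-feature generation
    "features_selection": False,   # disable automatic feature selection
}

# from supervised import AutoML  # requires mljar-supervised installed
# automl = AutoML(**automl_kwargs)
# automl.fit(X, y)
print(sorted(k for k, v in automl_kwargs.items() if v is False))
```

Keeping the options in a dict like this also makes it easy to toggle the two features independently between experiments.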
OK, will test and report back. Did the API also change for predict?
preds = automl.predict(X_test) used to return a DataFrame with probabilities for each class (in binary classification). Now it returns only predictions, and automl.predict_proba() returns a numpy array. Is that the expected behavior?
Looks like I should be using automl.predict_all now?
thanks
Yes, predict_all is the way to go. The changes were made to be scikit-learn compatible:

- predict() returns labels for classification or predictions for regression,
- predict_proba() returns probabilities for classification,
- predict_all() returns probabilities and labels for classification; for regression it will throw an exception (but maybe this should be changed).
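To make the described contract concrete, here is a tiny stand-in class for the binary-classification case. This is not mljar's actual implementation, just an illustration of the return types discussed above (the real library returns numpy arrays and DataFrames rather than plain lists and dicts):

```python
class FakeBinaryModel:
    """Illustrates the return-type contract described above (not mljar's code).

    Each input value x is treated as the probability of the positive class.
    """

    classes_ = ["no", "yes"]

    def predict(self, X):
        # Labels only (for classification).
        return ["yes" if x > 0.5 else "no" for x in X]

    def predict_proba(self, X):
        # Probabilities only: one row per sample, one column per class.
        return [[1 - x, x] for x in X]

    def predict_all(self, X):
        # Probabilities and labels together (a DataFrame in the real library).
        return [
            {"proba_no": 1 - x, "proba_yes": x,
             "label": "yes" if x > 0.5 else "no"}
            for x in X
        ]


model = FakeBinaryModel()
print(model.predict([0.2, 0.9]))  # ['no', 'yes']
```

The split mirrors scikit-learn's convention, where predict and predict_proba are separate methods with distinct return shapes.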
Maybe throw a warning instead.
thanks
I think it may be better to just run the prediction even for regression. It would duplicate predict(), but it would return a DataFrame instead of a numpy array.
new features seem great. good results so far. Nice to see the library being so actively updated. cheers
OK, I looked closer at this issue. I removed the tuning_mode parameter; its usage was ambiguous. The start_random_models, hill_climbing_steps, and top_models_to_improve parameters can be set manually if the built-in modes are not enough (for advanced users).

Testing now. Please note a small typo in the docs: features_selection --> feature_selection
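For advanced use, the manual knobs named above could be combined like this. The parameter names come from this thread and the values are hypothetical; check them against your installed version before use.

```python
# Advanced tuning parameters named in this thread (hypothetical values).
tuning = {
    "start_random_models": 5,    # models tried with random hyperparameters first
    "hill_climbing_steps": 3,    # rounds of local hyperparameter refinement
    "top_models_to_improve": 3,  # best models carried into each refinement round
}

# from supervised import AutoML  # requires mljar-supervised installed
# automl = AutoML(mode="Perform", **tuning)
# automl.fit(X, y)
print(tuning)
```

Raising any of the three values should grow the search, which matches the earlier observation that set_params with these arguments produced more than one model.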
Hi. Not sure if the new behavior is due to a change in the API or a bug, but the following code used to generate 9 models + ensemble. Now it only trains one (1_Default_Xgboost) and stops.
Thanks,