Open LishengSun opened 7 years ago
estim = HyperoptEstimator(classifier=svc('my_est'), algo=tpe.suggest, ...)
multi_clf = OneVsRestClassifier(estim)
or possibly build a function in hyperopt-sklearn to handle this kind of classification:
estim = HyperoptEstimator(classifier=one_vs_rest('my_multi_clf', clf=svc('my_est')), algo=tpe.suggest, ...)
The main difference between these two implementations is the first one will allow different parameters to be used for each of the individual classifiers, including different choices of the classifier itself. The second implementation will be more restricted, but should be a lot faster.
use_partial_fit and trial_timeout can definitely help with this. The use_partial_fit flag adds some checks to see if the current evaluation is unlikely to perform better than the current best; if so, it stops early and moves on to the next point, which can save a lot of time. Not all classifiers in sklearn support partial fit, so in those cases some options you can try are training in parallel, reducing the training data, lowering the timeout, or shrinking the search space.
Thank you for your reply!
1) Are you going to include this function soon? Actually I am building an AutoML benchmark and would like to include hyperopt. We have different tasks (binary classification, regression and multiclass/multilabel classification). Maybe I can help with this in my spare time if you need.
2) I don't quite understand what trial_timeout does. Does it output a model when it times out, even if the solution has not converged?
Thank you in advance!
trial_timeout is the maximum amount of time each evaluation is given to complete. For example, if trial_timeout is set to 300 seconds, and max_evals is set to 10, then the total search process will run for a maximum of 50 minutes. If an individual trial times out, it will report a failure with no model output unless use_partial_fit is also set to True. This flag allows non-converged solutions to be returned when the time is up (Note: not all classifiers in sklearn support this, but hyperopt-sklearn will do a check and use it if it can). If you set trial_timeout to None it will default to Infinity.
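The 50-minute figure is just trial_timeout multiplied by max_evals. A minimal sketch of that arithmetic follows; max_search_seconds is a hypothetical helper written for illustration, not part of hyperopt-sklearn, and it ignores any overhead between trials.

```python
def max_search_seconds(trial_timeout, max_evals):
    """Upper bound on total evaluation time for a search.

    Hypothetical helper, not a hyperopt-sklearn function.
    A trial_timeout of None means no per-trial limit, so the bound is infinite.
    """
    if trial_timeout is None:
        return float("inf")
    return trial_timeout * max_evals


# 300 seconds per trial, 10 trials -> 3000 seconds, i.e. 50 minutes
budget_seconds = max_search_seconds(300, 10)
print(budget_seconds / 60, "minutes at most")
```

This is only a worst-case bound: trials that finish early make the real search shorter.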
I've put together some multiclass functions in a new branch. I haven't done much testing, but they seem to work, at least on this example I'll post below. Feel free to try them out. Help and suggestions are always welcome :)
from hpsklearn import HyperoptEstimator, svc, one_vs_rest, one_vs_one, output_code
from hyperopt import tpe
from sklearn import datasets
import numpy as np
iris = datasets.load_iris()
X, y = iris.data, iris.target
test_size = int(0.2 * len(y))
np.random.seed(13)
indices = np.random.permutation(len(X))
X_train = X[indices[:-test_size]]
y_train = y[indices[:-test_size]]
X_test = X[indices[-test_size:]]
y_test = y[indices[-test_size:]]
# These will default to search the classifiers in the 'any_classifier' space
#clf = one_vs_rest('clf')
#clf = one_vs_one('clf')
#clf = output_code('clf')
# This is how you choose a specific classifier to use
clf = one_vs_rest('clf', estimator=svc('my_est'))
estim = HyperoptEstimator(classifier=clf, preprocessing=[], algo=tpe.suggest, trial_timeout=120, max_evals=10)
estim.fit(X_train, y_train)
print(estim.trials.results)
print('Score:', estim.score(X_test, y_test))
print(estim.best_model())
Thank you very much!
I will give the new branch a try ASAP.
Please forgive the basic question: how should I apply hyperopt-sklearn to multi-class target prediction?
My y_train.shape is (1000, 5).
from hpsklearn import HyperoptEstimator, any_classifier
from hyperopt import tpe
estim = HyperoptEstimator( classifier=any_classifier('clf'),
algo=tpe.suggest, trial_timeout=300)
estim.fit( x_train, y_train )
It gives this error:
ValueError: bad input shape (956, 5)
@potholiday For multi-label classification you need to use the One-vs-Rest classifier.
from hpsklearn import HyperoptEstimator, one_vs_rest
from hyperopt import tpe
estim = HyperoptEstimator( classifier=one_vs_rest('clf'),
algo=tpe.suggest, trial_timeout=300)
estim.fit( x_train, y_train )
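If the 2-D target is really a one-hot encoding of a single class per sample, rather than true multi-label data, another option is to collapse it back to a 1-D label vector before fitting. This is a sketch under that assumption; is_one_hot and one_hot_to_labels are hypothetical helpers, not part of hyperopt-sklearn.

```python
import numpy as np


def is_one_hot(y):
    """True if y is a 2-D 0/1 matrix with exactly one 1 per row."""
    y = np.asarray(y)
    if y.ndim != 2:
        return False
    only_binary = np.isin(y, (0, 1)).all()
    one_per_row = (y.sum(axis=1) == 1).all()
    return bool(only_binary and one_per_row)


def one_hot_to_labels(y):
    """Map each one-hot row to the index of its single 1."""
    return np.asarray(y).argmax(axis=1)


# Tiny illustrative target, not data from this thread
y_onehot = np.array([[1, 0, 0],
                     [0, 0, 1],
                     [0, 1, 0]])

if is_one_hot(y_onehot):
    y_labels = one_hot_to_labels(y_onehot)  # shape (3,), usable as a multiclass target
    print(y_labels)
```

Rows with several 1s (genuine multi-label targets) fail the check and should keep the One-vs-Rest route.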
Thanks for the quick reply. I got this output for iris data set (output variable one-hot encoded)
print(estim.trials.results)
[{'status': 'ok', 'loss': 0.04166666666666663, 'duration': 0.11149907112121582, 'loss_variance': 0.00056240219092331724}, {'status': 'ok', 'loss': 0.33333333333333337, 'duration': 0.11178112030029297, 'loss_variance': 0.0031298904538341159}, {'status': 'ok', 'loss': 0.08333333333333337, 'duration': 0.0271151065826416, 'loss_variance': 0.0010758998435054779}, {'status': 'ok', 'loss': 0.11111111111111116, 'duration': 0.2485649585723877, 'loss_variance': 0.0013910624239262743}, {'status': 'ok', 'loss': 0.02777777777777779, 'duration': 1.0009288787841797, 'loss_variance': 0.00038036863154234063}, {'status': 'ok', 'loss': 0.02777777777777779, 'duration': 26.054779052734375, 'loss_variance': 0.00038036863154234063}, {'status': 'ok', 'loss': 0.33333333333333337, 'duration': 0.020457983016967773, 'loss_variance': 0.0031298904538341159}, {'status': 'ok', 'loss': 0.08333333333333337, 'duration': 12.955389022827148, 'loss_variance': 0.0010758998435054779}, {'status': 'ok', 'loss': 0.02777777777777779, 'duration': 0.09855389595031738, 'loss_variance': 0.00038036863154234063}, {'status': 'ok', 'loss': 0.09722222222222221, 'duration': 0.32352304458618164, 'loss_variance': 0.0012361980525126062}]
print(estim.score(X_test, y_test))
0.93333333333333
print(estim.best_model())
{'learner': OneVsRestClassifier(estimator=AdaBoostClassifier(algorithm='SAMME.R', base_estimator=None,
learning_rate=0.00561843035481, n_estimators=162, random_state=0),
n_jobs=1), 'preprocs': (), 'ex_preprocs': ()}
Looking at the print(estim.best_model()) result, what does it mean? How should I save the best model?
I am not from a computational background, so please forgive me if this is a stupid question. How does this module really work? To find the best model, does it do an exhaustive search over models and their parameters, or does it have some guided way, like backprop in neural networks, of reaching the best model and its parameters?
I forgot to mention one important thing: for multi-class prediction, is accuracy the best way to estimate the score? Shouldn't we use AUC, ROC, or some other metric to measure the output?
@potholiday the output of estim.best_model() contains the trained model with the best parameter setting, along with any preprocessing that goes along with it. This is found by exploring the parameter space based on the search algorithm (the algo parameter) used. It's impossible to do an exhaustive search in a continuous space, but the algorithm can spend its time in more promising areas. To use the model, you can do something like this:
model = estim.best_model()['learner']
From there you can do anything you want with the model, such as using it for prediction, saving it to a file, etc. If you want a metric besides accuracy that is certainly possible (and for multilabel that often makes more sense).
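# some stuff you can do
import pickle
from sklearn.metrics import roc_auc_score
pred = model.predict(X_test)
print(roc_auc_score(y_test, pred))
# my_custom_metric is a placeholder for any metric function you define yourself
print(my_custom_metric(y_test, pred))
with open("my_model.pkl", "wb") as f:
    pickle.dump(model, f)
#etc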
Hi @bjkomer, still talking about the configuration found by the method: is there a way to access not only the best model (using "estim.best_model()"), but also all the models that were tried? Is there a way to access the configuration and the test score of each candidate model that was trained? Thank you for your kindness!
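One possible starting point is the estim.trials.results list printed earlier in this thread, which holds one dict per evaluated candidate. The sketch below ranks such a list by loss; the three entries are rounded copies of the first three results shown above, rank_trials is a hypothetical helper, and it assumes that a lower loss is better. The hyperparameters of each trial are not in this list; hyperopt's Trials object is expected to keep them per trial in estim.trials.trials, which is worth checking against your installed version.

```python
# Rounded copies of the first three entries of estim.trials.results shown above
results = [
    {"status": "ok", "loss": 0.0417, "duration": 0.111},
    {"status": "ok", "loss": 0.3333, "duration": 0.112},
    {"status": "ok", "loss": 0.0833, "duration": 0.027},
]


def rank_trials(results):
    """Return (trial_index, result) pairs for successful trials, best loss first.

    Hypothetical helper; assumes each result has 'status' and 'loss' keys.
    """
    successful = [(i, r) for i, r in enumerate(results) if r.get("status") == "ok"]
    return sorted(successful, key=lambda pair: pair[1]["loss"])


for index, result in rank_trials(results):
    print(f"trial {index}: loss={result['loss']:.4f}, duration={result['duration']:.3f}s")
```

With a fitted estimator you would pass estim.trials.results instead of the hand-copied list.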
Hi,
We are trying to use hyperopt-sklearn for multilabel classification. However, we are not able to get good performance using hyperopt-sklearn. A simple logistic regression algorithm through scikit-learn performs much better. Is there any insight as to why this might be happening? This is how we are creating our estimators:
hpsklearn.components.one_vs_rest('my_multi_svc', estimator=hpsklearn.components.svc('my_svc')), hpsklearn.components.one_vs_rest('my_multi_liblinear_svc', estimator=hpsklearn.components.liblinear_svc('my_liblinear_svc')), hpsklearn.components.one_vs_rest('my_multi_svc_linear', estimator=hpsklearn.components.svc_linear('my_svc_linear'))
Thanks in advance!
Hi,
I have 2 questions: 1) I would like to know if hyperopt works for multiclass / multilabel classification? For example, something like:
estimator = HyperoptEstimator(classifier=OneVsRest(svc('my_est')), algo=tpe.suggest, preprocessing=[], use_partial_fit=True, trial_timeout=timeout)
2) I found that hyperopt is quite slow when the training data is large. I think the parameter 'use_partial_fit' might speed up the fitting process, am I right? Is this the best practice to tell hyperopt not to train on the entire training data when it is too large?
Thank you in advance!