zhangjunli177 opened 5 years ago
Unfortunately, it is not possible to pass custom default values for a specific classifier, and we do not plan to add this ourselves in the near future. You can only restrict the search space to XGB and let Bayesian optimization do the job.
A hacky workaround would be to subclass the XGB class in `autosklearn.pipeline.components.classification.xgradient_boosting`, change the default values of the hyperparameters, and then use only this new classifier.
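A minimal sketch of that subclassing pattern. To stay self-contained it uses a dict-based stand-in for the real XGB component (the actual class lives in `autosklearn.pipeline.components.classification.xgradient_boosting` and builds a `ConfigSpace.ConfigurationSpace`); the hyperparameter names and values here are illustrative assumptions, not the component's real search space:

```python
class XGradientBoostingStandIn:
    """Stand-in for the real auto-sklearn XGB component (illustrative only)."""

    @staticmethod
    def get_hyperparameter_search_space():
        # The real method returns a ConfigSpace.ConfigurationSpace; a plain
        # dict keeps this sketch self-contained.
        return {
            "learning_rate": {"min": 0.01, "max": 1.0, "default": 0.1},
            "max_depth": {"min": 1, "max": 10, "default": 3},
        }


class MyXGB(XGradientBoostingStandIn):
    """Subclass that changes only the defaults and is then used in place
    of the stock component."""

    @staticmethod
    def get_hyperparameter_search_space():
        space = XGradientBoostingStandIn.get_hyperparameter_search_space()
        space["learning_rate"]["default"] = 0.05  # assumed tuned default
        space["max_depth"]["default"] = 6         # assumed tuned default
        return space
```

With the real component, you would then register the subclass (auto-sklearn's extension examples use an `add_classifier`-style registration hook) and restrict the search space so only the new classifier is considered.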
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs for the next 7 days. Thank you for your contributions.
@mfeurer I can work on this, which would allow passing default values as well as overriding the ranges and choices of hyperparameters without needing to subclass a new classifier or override the package. Let me know if this is still within auto-sklearn's scope; I can start by sharing the design and implementation details.
Thanks for picking up on this @rabsr. Let's attack this in two stages:
Thinking about this again: for passing in the default values, we could create an example similar to the random search one and the successive halving one. Instead of replacing the SMAC object with the ROAR object (as in the first example), or changing the arguments to the SMAC object (as in the second example), we would prepend the suggested new hyperparameter settings to the `metalearning_configurations` that are passed in and forward them to SMAC. What do you think of this? I'd also be happy to hear your suggestions.
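The prepending step above could look like the following sketch. Configurations are plain dicts here for self-containment; in auto-sklearn they would be `ConfigSpace.Configuration` objects, and the function name and `limit` parameter are assumptions of this sketch:

```python
def initial_configurations(user_configs, metalearning_configs, limit=25):
    """Prepend user-suggested configurations to the metalearning ones,
    dropping exact duplicates, so the user's suggestions are evaluated
    first by SMAC."""
    seen, merged = set(), []
    for cfg in list(user_configs) + list(metalearning_configs):
        key = tuple(sorted(cfg.items()))
        if key not in seen:
            seen.add(key)
            merged.append(cfg)
    return merged[:limit]
```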
I was mainly focused on overriding hyperparameter values for the algorithms, but I can also start by adding an example, as suggested, for setting up configurations for an initial baseline model.
I am considering the following approach to passing ranges and defaults for hyperparameters:
```python
override_params = {
    'classifier': {
        'sgd': {
            'loss': {
                'choices': ['hinge', 'log'],
                'default': 'hinge'
            },
            'penalty': {
                'choices': ['l1'],
            },
            'alpha': {
                'min': 0.005,
                'max': 0.1,
                'default': 0.01
            },
            'l1_ratio': {
                'min': 0.01,
                'max': 0.01
            }
        },
        'random_forest': {...},
    },
    'feature_preprocessor': {
        'pca': {...},
        'polynomial': {...}
    }
}

automl = autosklearn.classification.AutoSklearnClassifier(
    override_hyperparams=override_params
)
```
All the validation (range checks, hyperparameter types, converting a categorical/numerical hyperparameter to Constant/Unparametrized depending on the input, and conditional and forbidden clauses) can be handled internally. Let me know your thoughts on this.
For example, I have an XGB model whose predictions are not too bad. I plan to use it as a baseline model and use its hyperparameters as the initial values for auto-sklearn while restricting the search to XGB. Is that possible? There is a parameter called `initial_configurations_via_metalearning`, but that does not seem to fit my purpose.