Closed kienerj closed 3 years ago
What arguments are you running TuneSearchCV with?
self.tune_search = TuneSearchCV(
    clf,
    param_distributions=self.params,
    n_trials=iterations,
    early_stopping=True,  # uses Async HyperBand if set to True
    max_iters=10,
    search_optimization="optuna",
    cv=5,
    scoring=self.scorer,
    mode=self.metric_mode,
)
The scorer in this case is just "accuracy".
And the parameters are:
self.params = {
    "classification__n_estimators": tune.qrandint(self.min_rounds, self.max_rounds, 10),
    "classification__max_depth": tune.randint(self.min_max_depth, self.max_max_depth),
    "classification__min_child_weight": tune.randint(1, 4),
    "classification__subsample": tune.quniform(0.5, 1.0, 0.1),
    "classification__eta": tune.qloguniform(self.min_learning_rate, self.max_learning_rate, 1e-3),
}
Does it work with early stopping set to False?
Actually, it now also works with the same settings as above. Of course, after a restart the internal CV split will be different, which in my opinion is the cause of the issue. Right now I can't reproduce the issue even with early_stopping=True.
Can you set random seeds to some constant value?
I can, but it seems like a chicken-and-egg problem: first I need a seed that triggers the issue.
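One way around the chicken-and-egg problem is to search for a seed outside of the tuning run. The sketch below is only an illustration under stated assumptions: it uses plain scikit-learn, a hypothetical `VarianceThreshold` step stands in for whatever feature-dropping step the real pipeline has (the thread does not show it), and the helper names `fold_feature_counts` and `find_triggering_seed` are made up for this example. It tries seeded, shuffled splits until one produces a different feature count in at least one fold.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.model_selection import StratifiedKFold


def fold_feature_counts(X, y, cv):
    """Count the features that survive selection when fitting on each training fold."""
    counts = []
    for train_idx, _ in cv.split(X, y):
        # Assumption: VarianceThreshold stands in for the real selection step.
        selector = VarianceThreshold().fit(X[train_idx])
        counts.append(int(selector.get_support().sum()))
    return counts


def find_triggering_seed(X, y, n_splits=5, max_seed=100):
    """Return the first seed whose split yields unequal feature counts across folds."""
    for seed in range(max_seed):
        cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
        counts = fold_feature_counts(X, y, cv)
        if len(set(counts)) > 1:
            return seed, counts
    return None, None


# Toy data (hypothetical): three dense features plus one feature that is
# non-zero in only two rows, so it becomes constant in a training fold
# whenever both of those rows land in the same test fold.
rng = np.random.RandomState(0)
X = rng.normal(size=(20, 3))
rare = np.zeros(20)
rare[[0, 2]] = 1.0
X = np.column_stack([X, rare])
y = np.arange(20) % 2

seed, counts = find_triggering_seed(X, y)
print(seed, counts)
```

Once such a seed is found, the same `StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)` object could be passed as `cv` instead of `cv=5`, assuming the search class accepts a scikit-learn splitter the way scikit-learn's own search classes do. That would make the split reproducible across restarts.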
I'll close this for now, @kienerj. Feel free to reopen if the issue persists.
I have a scikit-learn pipeline I want to use for parameter optimization:
xgb = an xgboost classifier.
When I run this code I get an error:
Without the pipeline, just the model, it works fine.
It is clear that the pipeline may return a different number of features per CV split. I suspect that somewhere in the code the feature count is assumed to be constant. Or, if a feature is removed for a fold in the training set, does it then stay removed when that fold becomes the test set?
I say this because the exact same pipeline works perfectly fine when run with Optuna directly, without Ray. So it seems some form of optimization in ray/tune-sklearn leads to this problem.
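To make the varying feature count concrete, here is a minimal sketch in plain scikit-learn. It is not the pipeline from this issue: the `select` step with `VarianceThreshold`, the `DecisionTreeClassifier` standing in for the XGBoost model, and the toy data are all assumptions chosen so the effect is deterministic. Only the step name `classification` is taken from the parameter names above.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Toy data (hypothetical): three dense features plus one feature that is
# non-zero only in the first 10 rows.
rng = np.random.RandomState(0)
X = rng.normal(size=(50, 3))
rare = np.zeros(50)
rare[:10] = np.arange(1, 11)
X = np.column_stack([X, rare])
y = np.arange(50) % 2

# Unshuffled 5-fold split: the first test fold is exactly rows 0-9, so the
# rare feature is constant (all zeros) in that fold's training data.
cv = KFold(n_splits=5)


def features_kept_per_fold(X, cv):
    """Fit the selector on each training fold and count the surviving features."""
    counts = []
    for train_idx, _ in cv.split(X):
        selector = VarianceThreshold().fit(X[train_idx])
        counts.append(int(selector.get_support().sum()))
    return counts


counts = features_kept_per_fold(X, cv)

# Plain scikit-learn refits the whole pipeline per fold, so the test fold is
# transformed with the selector fitted on the matching training fold.
pipe = Pipeline([
    ("select", VarianceThreshold()),  # assumed stand-in for the real step
    ("classification", DecisionTreeClassifier(random_state=0)),
])
scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")
print(counts, scores.shape)
```

Here the first fold trains on three features and the others on four, yet cross-validation completes because each fold uses its own fitted selector. Any code path that reuses a model or a feature count across folds would break on data like this, which is the kind of assumption suspected above.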
Any advice or idea how to solve this?