ray-project / tune-sklearn

A drop-in replacement for Scikit-Learn’s GridSearchCV / RandomizedSearchCV -- but with cutting edge hyperparameter tuning techniques.
https://docs.ray.io/en/master/tune/api_docs/sklearn.html
Apache License 2.0

Is it possible to save all models when doing TuneSearchCV or equivalent? #280

Open 8bit-pixies opened 1 year ago

8bit-pixies commented 1 year ago

Hello,

Is it possible to save the fitted model for every parameter choice, not just the best one?

e.g.

import lightgbm as lgb
from tune_sklearn import TuneSearchCV

model = lgb.LGBMClassifier()
param_dists = {...}

gs = TuneSearchCV(
    model,
    param_dists,
    scoring="accuracy",
    local_dir="experiment_results",
)  # maybe an option here?

gs.fit(X_train, y_train)

I would then expect to find each model artifact under experiment_results/_Trainable*/model.joblib, or something similar.

I can understand this not being the default option when dealing with large models, but when working with small tabular models this would be very useful.

I can try doing a PR as well if someone can point me in the right direction.

I'm guessing this is more to do with a limitation of wrapping around scikit-learn than tune-sklearn though - is this correct?


If the above isn't possible, maybe something sensible like:

# gs.checkpoint_save()?
joblib.dump(gs.best_estimator_, "<path/to/appropriate_checkpoint_derived_from_the_best_estimator_info>/model.joblib")

That would make a world of difference: we could run fit and save the artifact in the right place, rather than figuring it out manually.
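
For reference, here is a rough sketch of the manual workaround I have in mind. It assumes the fitted search object exposes scikit-learn-style `estimator` and `cv_results_["params"]` attributes. I use scikit-learn's GridSearchCV with LogisticRegression as a stand-in so the snippet runs without tune-sklearn or LightGBM. The helper name `save_all_candidates` and the `candidate_NNN.joblib` file naming are made up for illustration and are not part of any library.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV


def save_all_candidates(search, X, y, out_dir):
    """Refit one estimator per tried parameter set and dump each to disk.

    Hypothetical helper: assumes `search` is already fitted and exposes
    scikit-learn-style `estimator` and `cv_results_["params"]`.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, params in enumerate(search.cv_results_["params"]):
        # Fresh, unfitted copy of the base estimator with this trial's parameters.
        model = clone(search.estimator).set_params(**params)
        model.fit(X, y)
        path = out_dir / f"candidate_{i:03d}.joblib"  # made-up naming scheme
        joblib.dump(model, path)
        paths.append(path)
    return paths


# Stand-in data and search object (not TuneSearchCV) so the sketch is self-contained.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
search = GridSearchCV(LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}, cv=3)
search.fit(X, y)

out_dir = tempfile.mkdtemp()
paths = save_all_candidates(search, X, y, out_dir)
```

This costs one extra fit per candidate, and the saved models are refit on the full training data rather than being the per-fold models from the search itself. That is why a built-in option would be nicer.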