ray-project / tune-sklearn

A drop-in replacement for Scikit-Learn's GridSearchCV / RandomizedSearchCV, but with cutting-edge hyperparameter tuning techniques.
https://docs.ray.io/en/master/tune/api_docs/sklearn.html
Apache License 2.0

[feature request] sklearn cross_val_score #179

Open r0f1 opened 3 years ago

r0f1 commented 3 years ago

Hi, oftentimes I want to compare different classifiers on the same dataset, and I find myself writing code that looks like this:

import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

list_of_feature_list = [...]  # could look like this: [[0,1,2], [2,4,5], [0,1,2,3,4,5]]

for feature_list in list_of_feature_list:
    X = Xbig[feature_list]  # features
    y = ...                 # target

    model = LogisticRegression()
    scores = cross_val_score(model, X, y)
    dummy = DummyClassifier(strategy="most_frequent")
    dummys = np.mean(cross_val_score(dummy, X, y))

    print(scores)
    print(dummys)

I was wondering if Ray could be used to speed up this process. More specifically: can I use Ray to do cross-validation in place of cross_val_score()? If not, I think that would be a useful feature to add.

Thank you.

Yard1 commented 3 years ago

I think you could just use grid search with a single possible combination of parameters to essentially get the equivalent of sklearn's CV on Ray.
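
A minimal sketch of that idea, assuming the X and y from the snippet above and that TuneGridSearchCV exposes cv_results_ the same way sklearn's GridSearchCV does (tune-sklearn is documented as a drop-in replacement); with a single-entry param_grid the "search" degenerates into plain k-fold cross-validation of one model, with the folds evaluated through Ray:

from tune_sklearn import TuneGridSearchCV
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
# One value per parameter -> exactly one candidate, so fitting this
# "search" is just 5-fold cross-validation of that single model on Ray.
param_grid = {"C": [1.0]}

search = TuneGridSearchCV(model, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

# Mean CV score of the single candidate, analogous to
# np.mean(cross_val_score(model, X, y)) in the original snippet.
print(search.cv_results_["mean_test_score"])
print(search.best_score_)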