probabl-ai / skore

Skore lets you "Own Your Data Science." It provides a user-friendly interface to track and visualize your modeling results, and perform evaluation of your machine learning models with scikit-learn.
https://probabl-ai.github.io/skore/
MIT License

The cross-validate example with hyperparameter tuning looks like an anti-pattern #821

Closed glemaitre closed 2 days ago

glemaitre commented 3 days ago

I was looking at this example and the following block:

from sklearn import datasets
from sklearn.linear_model import Lasso
import skore

diabetes = datasets.load_diabetes()
X = diabetes.data[:150]
y = diabetes.target[:150]
lasso = Lasso()

for alpha in [0.5, 1, 2]:
    cv_results = skore.cross_validate(
        Lasso(alpha=alpha), X, y, cv=5, project=my_project
    )

The fact that we do a hyperparameter search here looks really weird to me. I would rather ask people to use a RandomizedSearchCV or a GridSearchCV, so I'm not sure the example is relevant anymore.
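For comparison, the same search written with GridSearchCV (a sketch of the suggested alternative; plain scikit-learn, no skore tracking) evaluates every candidate on the same splits and records the hyperparameters itself:

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV

X, y = load_diabetes(return_X_y=True)
X, y = X[:150], y[:150]

# GridSearchCV scores every candidate on the same CV splits and keeps
# the per-parameter results in cv_results_, so nothing is tracked by hand.
search = GridSearchCV(Lasso(), param_grid={"alpha": [0.5, 1, 2]}, cv=5)
search.fit(X, y)
```

Here search.cv_results_["param_alpha"] holds the alpha tried for each row of results, which is exactly the bookkeeping the manual loop leaves to the user.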

One issue you end up with in the current example is that the generic cross_validate is not intended to track hyperparameters: as a user, I'll need to store those myself, in the order of computation, as well.

Another point is about data splitting: since no random state is set, each parameter is evaluated on potentially different splits. The SearchCV will report results on consistent splits even when the random state is not set (if I'm not mistaken).
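One way to pin the splits in the loop-based version (a sketch, using plain scikit-learn's cross_validate rather than skore's) is to pass a single splitter instance with a fixed random_state, so every alpha is scored on exactly the same folds:

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold, cross_validate

X, y = load_diabetes(return_X_y=True)
X, y = X[:150], y[:150]

# A shared splitter with a fixed random_state guarantees that each alpha
# is evaluated on the same folds, as a SearchCV would do internally.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
for alpha in [0.5, 1, 2]:
    cv_results = cross_validate(Lasso(alpha=alpha), X, y, cv=cv)
```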

A more natural example would be: I have a model whose hyperparameters are already set, I get a fresh batch of data, and I want to check whether the statistical performance has drifted. That looks like a more appealing and realistic use case than the current example. Edit: for this point, it means that somehow some feature engineering in the preprocessing stage gets invalidated and the crafted model will start to underperform.
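That monitoring use case could be sketched like this (assumptions: plain scikit-learn's cross_validate stands in for skore's tracked version, and the slices of the diabetes data play the role of successive fresh batches):

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_validate

X, y = load_diabetes(return_X_y=True)

# Model with hyperparameters fixed once, e.g. via an earlier search.
model = Lasso(alpha=0.5)

# Treat each slice as a fresh batch of incoming data; re-running the same
# cross-validation per batch lets us watch for drift in the scores.
batch_means = []
for batch in (slice(0, 150), slice(150, 300), slice(300, 442)):
    cv_results = cross_validate(model, X[batch], y[batch], cv=5)
    batch_means.append(cv_results["test_score"].mean())
```

Historizing these runs over time, rather than over hyperparameter values, matches the drift-monitoring scenario described above.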

NB: I would put the import statement before the from imports; I think this is something isort would do.
NB2: lasso = Lasso() is unused.
NB3: you can load X and y directly with X, y = datasets.load_diabetes(return_X_y=True).
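Putting the three NBs together, the snippet might be cleaned up as follows (a sketch; scikit-learn's own cross_validate stands in for skore.cross_validate so the block is self-contained, and the import-ordering remark would apply to the dropped import skore line):

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_validate

# NB3: load X and y directly; NB2: no unused `lasso = Lasso()` line.
X, y = load_diabetes(return_X_y=True)
X, y = X[:150], y[:150]

for alpha in [0.5, 1, 2]:
    cv_results = cross_validate(Lasso(alpha=alpha), X, y, cv=5)
```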

sylvaincom commented 3 days ago

Hi @glemaitre, thanks for your message, agreed

Yeah, I had noted that cross-validation historization often amounts to a grid search, but I could not find a more compelling example, since it would not make sense to historize several cross-validations from different estimators... So I thought the cross-validation historization feature is useful when users write several cross-validations in a draft notebook while iterating, and we end up presenting them in a nicer display that amounts to a grid search (though good practice would have been for the users to do a grid search from the start).

cc @MarieS-WiMLDS