probabl-ai / skore

Skore lets you "Own Your Data Science." It provides a user-friendly interface to track and visualize your modeling results, and perform evaluation of your machine learning models with scikit-learn.
https://probabl-ai.github.io/skore/
MIT License

cross_validate's signature, location, and purpose #731

Open adrinjalali opened 3 days ago

adrinjalali commented 3 days ago

Right now the signature is

cross_validate(*args, project, **kwargs)

This doesn't play well with auto-complete, IDEs, or inspection. As a user, it would be much nicer if the signature matched scikit-learn's.
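To make the introspection problem concrete, here is a minimal sketch. `reference_cross_validate` is a hypothetical stand-in with an abbreviated scikit-learn-style signature, not the real function, and `explicit_cross_validate` is only one possible shape for a fix.

```python
import inspect


def reference_cross_validate(estimator, X, y=None, *, scoring=None, cv=None):
    """Hypothetical stand-in with an abbreviated scikit-learn-style signature."""
    return {"estimator": estimator, "scoring": scoring, "cv": cv}


def opaque_cross_validate(*args, project, **kwargs):
    """Mirrors the current shape: arguments are forwarded blindly."""
    return reference_cross_validate(*args, **kwargs)


def explicit_cross_validate(
    estimator, X, y=None, *, scoring=None, cv=None, project=None
):
    """Spells out each parameter so IDEs and inspect can see them."""
    return reference_cross_validate(estimator, X, y, scoring=scoring, cv=cv)


# What a tool relying on introspection gets to see for each variant.
opaque_params = list(inspect.signature(opaque_cross_validate).parameters)
explicit_params = list(inspect.signature(explicit_cross_validate).parameters)

print(opaque_params)    # ['args', 'project', 'kwargs']
print(explicit_params)  # ['estimator', 'X', 'y', 'scoring', 'cv', 'project']
```

With the opaque variant, an editor can only suggest `project`; every other parameter is hidden behind `*args` and `**kwargs`.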

Another nice-to-have is the import location. As a user, the nicest thing would be to change only the top-level import, from:

from sklearn.model_selection import cross_validate

to

from skore.model_selection import cross_validate

However, I wonder whether that's a good pattern, since we'd then basically need to patch a bunch of sklearn functions to add things to them or change their default values.
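As a rough illustration of what such patching could look like, here is a sketch of a drop-in wrapper that changes one default. `upstream_cross_validate` is a hypothetical stand-in for the sklearn function, and using `functools.wraps` is an assumption about how one might do this, not how skore does it.

```python
import functools
import inspect


def upstream_cross_validate(estimator, X, y=None, *, cv=None, return_estimator=False):
    """Hypothetical stand-in for the upstream function (abbreviated signature)."""
    return {"cv": cv, "return_estimator": return_estimator}


@functools.wraps(upstream_cross_validate)
def cross_validate(*args, **kwargs):
    # Change one default while leaving explicit user choices alone.
    kwargs.setdefault("return_estimator", True)
    return upstream_cross_validate(*args, **kwargs)


result = cross_validate("model", [[0], [1]], [0, 1])

# inspect follows __wrapped__, so tools see the upstream signature...
signature = inspect.signature(cross_validate)
visible_params = list(signature.parameters)
# ...including the upstream default, which no longer matches the behaviour.
reported_default = signature.parameters["return_estimator"].default

print(result["return_estimator"])  # True
print(reported_default)            # False
```

The wrapper keeps auto-complete working, but the default it reports is the upstream one rather than the patched one, and every wrapped function needs the same treatment.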

To me, it would be much more natural to take the output of sklearn's cross_validate, the estimator, and the data, and then generate what we need to show the user, instead of computing all of that on the fly in a custom implementation of cross_validate.

So in reality, I would remove this implementation, and have something like:

from sklearn.model_selection import cross_validate
# return_estimator and return_indices keep the fitted models and the fold indices.
cv_results = cross_validate(..., return_estimator=True, return_indices=True)
# Proposed usage: store the data and the CV output under their own names.
project.put(X=X, y=y)
project.put(cv_results=cv_results)
# We could generate from stored objects (names here), or actual values.
project.generate_scores_from(X="X", y="y", cv_results="cv_results")
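A minimal sketch of how that post-hoc flow might behave is below. The `Project` class, its keyword-based `put`, and `generate_scores_from` are all hypothetical and are not skore's actual API. The `cv_results` dict is hand-written to mimic the `test_score` and `indices` keys that sklearn's cross_validate returns with return_indices=True.

```python
from statistics import mean, stdev


class Project:
    """Hypothetical in-memory store; not skore's actual Project API."""

    def __init__(self):
        self._items = {}

    def put(self, **items):
        # Store each keyword argument under its own name.
        self._items.update(items)

    def generate_scores_from(self, *, X, y, cv_results):
        # The arguments are the names of previously stored objects.
        features = self._items[X]
        targets = self._items[y]
        results = self._items[cv_results]
        if len(features) != len(targets):
            raise ValueError("X and y must have the same number of rows")
        scores = [float(score) for score in results["test_score"]]
        return {
            "n_samples": len(features),
            "n_folds": len(scores),
            "mean_test_score": mean(scores),
            "std_test_score": stdev(scores) if len(scores) > 1 else 0.0,
            "test_fold_sizes": [len(idx) for idx in results["indices"]["test"]],
        }


# Hand-written stand-in for a 3-fold cross_validate result (assumed values).
cv_results = {
    "test_score": [0.5, 0.75, 1.0],
    "indices": {
        "train": [[2, 3, 4, 5], [0, 1, 4, 5], [0, 1, 2, 3]],
        "test": [[0, 1], [2, 3], [4, 5]],
    },
}
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 1, 1, 0, 1]

project = Project()
project.put(X=X, y=y, cv_results=cv_results)
summary = project.generate_scores_from(X="X", y="y", cv_results="cv_results")
print(summary)
```

Nothing here re-runs the cross-validation: the summary is derived purely from stored objects, so sklearn's own cross_validate stays untouched.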

cc @koaning @glemaitre

adrinjalali commented 2 days ago

The docs also don't really explain what this function does: https://probabl-ai.github.io/skore/latest/generated/skore.cross_validate.html