probabl-ai / skore

Skore lets you "Own Your Data Science." It provides a user-friendly interface to track and visualize your modeling results, and perform evaluation of your machine learning models with scikit-learn.
https://probabl-ai.github.io/skore/
MIT License

feat: Implement basic `train_test_split` #690

Closed. augustebaum closed this 2 hours ago

augustebaum commented 1 week ago

This PR implements the `train_test_split` function, a wrapper around scikit-learn's `train_test_split`.

For now, warnings are only emitted with `warnings.warn`; nothing is saved to a Project, as that part of the design still needs work.

To validate the technical design, one warning is implemented: `HighClassImbalanceWarning`. It inherits from the built-in `Warning` class so that it can be passed directly to `warnings.warn`.

Because we expect warnings to take a wide range of arguments, we pack all the relevant information into a `kwargs` dict inside `train_test_split`, and let each warning's `check` static method accept `**kwargs` in addition to its required positional arguments.
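
To make the design above concrete, here is a minimal sketch of the pattern, not the code in this PR. The imbalance rule, the `THRESHOLD` value, the `CHECKS` list and the `run_checks` helper are all assumptions made for illustration, and the call to scikit-learn's splitter is left out so that only the warning mechanism is shown.

```python
import warnings
from collections import Counter


class HighClassImbalanceWarning(Warning):
    """Illustrative warning category for a heavily imbalanced target."""

    # Hypothetical threshold: warn when the minority/majority count ratio
    # falls below this value. The real rule may differ.
    THRESHOLD = 0.5

    @staticmethod
    def check(y, **kwargs):
        """Return a message if `y` looks imbalanced, otherwise None.

        Extra keyword arguments are accepted and ignored, so every check
        can be called with the same packed dict.
        """
        if y is None or len(y) < 2:
            return None
        counts = Counter(y)
        if len(counts) < 2:
            return None
        ratio = min(counts.values()) / max(counts.values())
        if ratio < HighClassImbalanceWarning.THRESHOLD:
            return f"High class imbalance in y: class counts are {dict(counts)}."
        return None


# Hypothetical registry of warning classes to run on each split.
CHECKS = [HighClassImbalanceWarning]


def run_checks(X, y=None, **split_options):
    """Pack the call's information into one dict and run every check."""
    kwargs = dict(X=X, y=y, **split_options)
    for warning_class in CHECKS:
        message = warning_class.check(**kwargs)
        if message is not None:
            # The class is a Warning subclass, so it doubles as the category.
            warnings.warn(message, category=warning_class, stacklevel=2)
```

Under these assumptions, `run_checks([[1]] * 4, [0, 1, 1, 1])` emits one `HighClassImbalanceWarning`, while a balanced target such as `[0, 1, 0, 1]` emits nothing. Since the warnings go through the standard `warnings` machinery, tests can collect them with `warnings.catch_warnings(record=True)`.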

Closes https://github.com/probabl-ai/skore/issues/683


Todo:

sylvaincom commented 4 days ago

Thanks! I ran into some issues:

from skore.sklearn.train_test_split import train_test_split
X = [[1]] * 4
y = [0, 1, 1, 1]
train_test_split(X, y)

https://github.com/user-attachments/assets/80845d61-8dfc-40d4-8307-84b4701e4c80