Closed augustebaum closed 2 hours ago
Thanks, I got some issues:
from skore.sklearn.train_test_split import train_test_split
from skore import train_test_split
X = [[1]] * 4
y = [0, 1, 1, 1]
train_test_split(X, y)
https://github.com/user-attachments/assets/80845d61-8dfc-40d4-8307-84b4701e4c80
This PR implements the `train_test_split` function, which wraps the sklearn function of the same name.

The warnings are only emitted using `warnings.warn` for now; nothing is saved to a Project, as that part of the design still needs work.

In order to validate the technical design, one warning is implemented: `HighClassImbalanceWarning`. This inherits from the built-in `Warning` class so that it can be used directly in `warnings.warn`.

Because we expect warnings to take a wide range of arguments, we pack all the relevant information into a `kwargs` dict inside `train_test_split`, and let the `check` static method accept `**kwargs` as well as the necessary positional arguments.

Closes https://github.com/probabl-ai/skore/issues/683
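For reviewers who want the shape of the design at a glance, here is a minimal sketch of the pattern described above, using only the standard library. It is not the code in this PR: the `run_checks` helper, the `MSG` text, the `threshold` parameter and the majority-share criterion are all assumptions made for illustration.

```python
import warnings
from collections import Counter


class HighClassImbalanceWarning(Warning):
    """Illustrative stand-in for the warning described in this PR."""

    # Hypothetical message text.
    MSG = "The target looks highly imbalanced."

    @staticmethod
    def check(y, threshold=0.8, **kwargs):
        """Return True if the warning should be emitted.

        `threshold` and the majority-share criterion are assumptions.
        Unused keys in **kwargs are ignored, so every warning class can
        be called with the same dict.
        """
        if y is None or len(y) == 0:
            return False
        majority_share = max(Counter(y).values()) / len(y)
        return majority_share >= threshold


def run_checks(y, **split_options):
    """Hypothetical helper: pack everything into one dict, run each check."""
    kwargs = dict(y=y, **split_options)
    emitted = []
    for warning_class in (HighClassImbalanceWarning,):
        if warning_class.check(**kwargs):
            # A Warning subclass can be passed directly as the category.
            warnings.warn(warning_class.MSG, category=warning_class, stacklevel=2)
            emitted.append(warning_class)
    return emitted
```

Calling `run_checks([0] * 9 + [1], test_size=0.2)` passes `test_size` through to `check`, which ignores it. That is the point of accepting `**kwargs`: a new warning can read extra keys without changing the call site.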
Todo:
- `train_test_split(X, test_size=...)`; previously we took `arrays[-1]` as `y`, but now we only do this if there are at least 2 arrays
- `_find_ml_task`, as a function of the same name already exists in `cross_validate.py`?
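The rule in the first item can be sketched as follows. The helper name `infer_target` is hypothetical and only illustrates the "at least 2 arrays" condition; the PR may structure this differently.

```python
def infer_target(*arrays):
    """Hypothetical helper: guess which positional array is the target y."""
    # With a single array (e.g. only X) there is no target to inspect,
    # so the last array is treated as y only when at least two are given.
    if len(arrays) >= 2:
        return arrays[-1]
    return None
```

With this rule, `infer_target(X)` yields `None`, so checks that need `y` can be skipped instead of mistaking `X` for the target.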