AxeldeRomblay / MLBox

MLBox is a powerful Automated Machine Learning python library.
https://mlbox.readthedocs.io/en/latest/
Other
1.49k stars 274 forks

Number of test samples : 1 #103

Closed: alexnix closed this issue 4 years ago

alexnix commented 4 years ago

I use a train (12k rows) and a test (4k rows) file. I read them like this:

    from mlbox.preprocessing import Reader

    paths = ["Fields_train.csv", "Fields_test.csv"]
    target_name = "price"
    rd = Reader(sep=',')
    df = rd.train_test_split(paths, target_name)

But the output says there is only one test sample. This then causes the error "Only one class present in y_true. ROC AUC score is not defined in that case." when running:

    from mlbox.preprocessing import Drift_thresholder

    dft = Drift_thresholder()
    df = dft.fit_transform(df)

PS: I am experimenting with a regression task...

AxeldeRomblay commented 4 years ago

Hello @alexnix,

The test set is detected as the set where the target is missing (either the column is absent or its values are missing). If you still want to predict on this specific set, you have to remove the "price" column from it.

Hope it helps!

alexnix commented 4 years ago

Indeed, this was the issue: it worked once I removed the price column from the test data set. Thanks!
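
For readers hitting the same problem, the fix discussed in this thread (removing the target column from the test file before reading it) could be sketched roughly as follows. This is only an illustration using the Python standard library, not part of MLBox: the helper name `drop_target_column` and the output file name in the usage note are hypothetical.

```python
import csv


def drop_target_column(src_path, dst_path, target_name, sep=","):
    """Copy a CSV file to dst_path, leaving out the target column.

    Hypothetical helper, not part of MLBox. Returns the column names kept.
    """
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src, delimiter=sep)
        # Keep every column except the target; no-op if it is already absent.
        kept = [name for name in (reader.fieldnames or []) if name != target_name]
        # extrasaction="ignore" silently skips the dropped target field.
        writer = csv.DictWriter(
            dst, fieldnames=kept, delimiter=sep, extrasaction="ignore"
        )
        writer.writeheader()
        for row in reader:
            writer.writerow(row)
    return kept
```

Assuming the file names from the original post, one might call `drop_target_column("Fields_test.csv", "Fields_test_no_target.csv", "price")` and then pass `["Fields_train.csv", "Fields_test_no_target.csv"]` as `paths` to `rd.train_test_split`.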