Closed Apollo0801 closed 2 years ago
Because of undersampling.
We first split the entire set into train and test at lines 54 to 65. The ratio of train and test should be around 8:2.
At line 68, if `under_sampling_train` is set to `True`, we balance the train set by undersampling.
That is the reason why the final train set is smaller than the test set.
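To make the effect concrete, here is a minimal sketch of the same two steps on made-up data. It is not the repository's code: the function name `split_then_undersample`, the 5% positive rate, and the use of plain `random` instead of the project's own split are all assumptions for illustration.

```python
import random

def split_then_undersample(labels, test_size=0.2, seed=0):
    """Hypothetical helper: 8:2 split, then undersample the train part only."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)

    # Step 1: plain split; the test set keeps its original class imbalance.
    n_test = int(round(len(indices) * test_size))
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]

    # Step 2: group the train indices by class label.
    by_class = {}
    for i in train_idx:
        by_class.setdefault(labels[i], []).append(i)

    # Keep only as many samples per class as the rarest class has.
    n_min = min(len(members) for members in by_class.values())
    balanced_idx = []
    for members in by_class.values():
        balanced_idx.extend(rng.sample(members, n_min))

    return train_idx, balanced_idx, test_idx

# Toy data (assumed): 1000 samples, only 50 of them positive.
labels = [1] * 50 + [0] * 950
train_idx, balanced_idx, test_idx = split_then_undersample(labels)

# 800 train samples before balancing, 200 test samples,
# and at most 100 train samples after balancing (2 x the positives in train).
print(len(train_idx), len(balanced_idx), len(test_idx))
```

With a rare positive class, the balanced train set can hold at most twice the number of positives that landed in the train split, so it ends up smaller than the untouched test set even though the initial split was 8:2.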
Thank you very much for your answer.
Why is `test_size = 0.2` set in _create_train_testset.py, yet the resulting training-to-test ratio is not 8:2? Moreover, the test set is much larger than the training set.