When I split my input into training and holdout sets, how can I be sure that the two sets are balanced? Currently I am creating box plots, for both training and holdout, of the major variables that I think will be important, to see whether the two sets have similar characteristics. Is there any other visual or statistical measure to check the "balancedness" of the data set?
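To make concrete what kind of numeric check could sit next to the box plots, here is a minimal sketch using only the Python standard library. It computes two common summaries per variable: the standardized mean difference (difference in means divided by a pooled standard deviation) and the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical CDFs). The function names, the column name `x`, and the simulated data are all hypothetical and only for illustration; this is one possible approach, not an established answer to the question.

```python
import random
import statistics
from bisect import bisect_right


def standardized_mean_difference(a, b):
    """Difference in means divided by the pooled standard deviation."""
    diff = statistics.mean(a) - statistics.mean(b)
    pooled_sd = ((statistics.variance(a) + statistics.variance(b)) / 2) ** 0.5
    if pooled_sd == 0:
        # Both samples are constant: identical means are perfectly balanced,
        # different means are maximally unbalanced.
        return 0.0 if diff == 0 else float("inf")
    return diff / pooled_sd


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    gap = 0.0
    # The largest gap is always attained at one of the observed values.
    for x in sa + sb:
        cdf_a = bisect_right(sa, x) / len(sa)
        cdf_b = bisect_right(sb, x) / len(sb)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def balance_report(train, holdout):
    """train/holdout: dicts mapping a column name to a list of numeric values."""
    return {
        col: {
            "smd": standardized_mean_difference(train[col], holdout[col]),
            "ks": ks_statistic(train[col], holdout[col]),
        }
        for col in train
    }


# Hypothetical data: one numeric variable, split at random 80/20.
rng = random.Random(0)
values = [rng.gauss(50, 10) for _ in range(1000)]
rng.shuffle(values)
train = {"x": values[:800]}
holdout = {"x": values[800:]}

report = balance_report(train, holdout)
print(report)
```

Values of both statistics close to zero would suggest that a variable is distributed similarly in the two sets. An absolute standardized mean difference below roughly 0.1 is a frequently quoted rule of thumb, but it is a convention rather than a guarantee. If SciPy is available, `scipy.stats.ks_2samp` computes the same KS statistic and also returns a p-value.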