Issue: Enhancement Request for stratify_y=True in Regression Tasks
Description:
Currently, stratify_y=True works as expected for binary classification, where the target takes a small set of discrete values (typically 0 or 1). With a regression target, where the values are continuous, the same parameter raises the following exception:
ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.
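This occurs because stratification is designed for categorical targets, not continuous ones: each distinct value of y is treated as its own class, and with a continuous target nearly every value is unique, so most classes contain a single sample.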
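A minimal reproduction sketch follows. It assumes that stratify_y=True ends up passing y to scikit-learn's train_test_split as stratify=y, which this issue does not confirm; the data is synthetic and the variable names are illustrative only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = rng.normal(size=100)  # continuous target: every value is unique

# Assumption: stratify_y=True is equivalent to stratify=y here.
try:
    train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)
    error_message = None
except ValueError as exc:
    # Each unique y value is counted as a class with a single member.
    error_message = str(exc)

print(error_message)
```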
Proposed Enhancement:
For regression tasks where continuous target values are provided, it would be helpful to either:
Raise a clearer exception: Provide a more informative error message that tells users what to do next. For example:
ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2. Bin your continuous y values into categories using pandas.qcut (for quantiles) or pandas.cut (for custom bins) to ensure the same distribution of bins among split sets of data.
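The binning approach suggested in that message could look like the sketch below. It uses scikit-learn's train_test_split directly rather than the stratify_y parameter, since the internals of that parameter are not described here; the number of bins (n_bins = 5) and the synthetic data are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = rng.normal(size=200)

# Bin the continuous target into quantile-based categories.
# labels=False returns integer bin codes; duplicates="drop" avoids
# an error when repeated values produce identical bin edges.
n_bins = 5  # arbitrary choice for this sketch
y_bins = pd.qcut(y, q=n_bins, labels=False, duplicates="drop")

# Stratify on the bin codes, but keep the original continuous y as the target.
X_train, X_test, y_train, y_test, bins_train, bins_test = train_test_split(
    X, y, y_bins, test_size=0.2, stratify=y_bins, random_state=0
)

# Each bin should be represented about equally in the test set.
print(np.bincount(bins_test))
```

pandas.cut can be substituted for pandas.qcut when fixed or custom bin edges are preferred over quantiles. In either case every bin still needs at least two samples for the stratified split to succeed.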