uclamii / model_tuner

A library to tune the hyperparameters of common ML models. Supports calibration and custom pipelines.
Apache License 2.0

Stratification Fix for Boolean Ambiguity Error #35

Closed lshpaner closed 1 month ago

lshpaner commented 1 month ago

Summary:

This PR reintroduces the is not None check in the following conditional:

if stratify_cols is not None and stratify_y:

Motivation: Without this check, stratification fails whenever stratify_cols is an array-like object with multiple elements, because the conditional tries to evaluate its truth value. The following error is raised:

The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Details: The error arises when stratify_cols is a NumPy array with more than one element or a Pandas DataFrame/Series. These types raise a ValueError when used in a boolean context (for example, if stratify_cols and stratify_y:), because it is unclear whether the result should be True when any element is truthy or only when all elements are. The is not None check tests only whether stratify_cols was supplied, so its contents are never evaluated for truthiness.
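The difference between the two conditionals can be sketched as follows. This is a minimal illustration, not the model_tuner code: AmbiguousTable is a hypothetical stand-in for a DataFrame or multi-element array (used so the snippet runs without NumPy or pandas installed), and should_stratify_unsafe / should_stratify are illustrative names, with stratify_y assumed to be a plain boolean flag.

```python
class AmbiguousTable:
    """Hypothetical stand-in for a DataFrame or multi-element array."""

    def __init__(self, rows):
        self.rows = rows

    def __bool__(self):
        # NumPy arrays (size > 1) and pandas objects raise ValueError
        # from __bool__ in the same way.
        raise ValueError(
            "The truth value of an array with more than one element "
            "is ambiguous. Use a.any() or a.all()"
        )


def should_stratify_unsafe(stratify_cols, stratify_y):
    # `and` calls stratify_cols.__bool__(), which raises for array-likes.
    return bool(stratify_cols and stratify_y)


def should_stratify(stratify_cols, stratify_y):
    # The identity check never asks stratify_cols for a truth value.
    return stratify_cols is not None and bool(stratify_y)


cols = AmbiguousTable([("F", 1), ("M", 2)])

try:
    should_stratify_unsafe(cols, True)
    unsafe_raised = False
except ValueError:
    unsafe_raised = True

print(unsafe_raised)                 # True
print(should_stratify(cols, True))   # True
print(should_stratify(None, True))   # False
print(should_stratify(cols, False))  # False
```

Note that the unsafe version still works when stratify_cols is None, which is likely why the bug only shows up once stratification columns are actually passed.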

Impact: Stratification no longer fails with the truth value ambiguity error when stratify_cols is array-like. The conditional now checks only whether stratify_cols was provided, which makes stratified splitting more robust and avoids interruptions caused by ambiguous boolean evaluation.

elemets commented 1 month ago

Should be resolved in the other branch.