Implement error handling for hypothesis_predictor_core_n()

Implementation Summary

The function hypothesis_predictor_core_n() conducts hypothesis tests on numerical data grouped by a categorical variable. It chooses between parametric and non-parametric tests based on whether the data are normally distributed and whether variances are equal across groups. Supported tests include t-tests, Mann-Whitney U tests, ANOVA, and Kruskal-Wallis tests. The error handling validates the type and value of every input, so that invalid input fails with a descriptive exception.
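The selection logic described above can be sketched roughly as follows. This is a minimal illustration, not the function's actual code: the helper name choose_test(), the n_groups parameter, and the exact mapping (two groups versus more than two, with a parametric test only when both flags are true) are assumptions made for the example.

```python
def choose_test(n_groups: int, normality_bool: bool, equal_variances_bool: bool) -> str:
    """Return the name of the test to run (hypothetical helper, for illustration only)."""
    if n_groups < 2:
        raise ValueError("At least two groups are needed for a comparison.")

    # Assumption: a parametric test is used only when both assumptions hold.
    parametric = normality_bool and equal_variances_bool

    if n_groups == 2:
        return "independent samples t-test" if parametric else "Mann-Whitney U test"
    return "one-way ANOVA" if parametric else "Kruskal-Wallis test"


if __name__ == "__main__":
    print(choose_test(2, True, True))
    print(choose_test(3, True, False))
```

Because the two flags steer the choice of test directly, a wrongly typed flag would silently change which test runs, which is why they are validated strictly.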

Detailed Error Handling Breakdown

Type Validations

DataFrame Type Check
- Ensures that df is a pandas DataFrame. The rest of the function relies on DataFrame methods, so any other type is rejected up front.

# Reject anything that is not a DataFrame before DataFrame methods are used.
if not isinstance(df, pd.DataFrame):
    raise TypeError("hypothesis_predictor_core_n(): The 'df' parameter must be a pandas DataFrame.")

String Type Checks for Variables
- Checks that both target_variable and grouping_variable are strings, as they are expected to specify column names within the DataFrame.

# Both variables are column names, so each must be a string.
if not isinstance(target_variable, str):
    raise TypeError("hypothesis_predictor_core_n(): The 'target_variable' must be a string.")
if not isinstance(grouping_variable, str):
    raise TypeError("hypothesis_predictor_core_n(): The 'grouping_variable' must be a string.")

Boolean Type Checks for Viability Flags
- Ensures that normality_bool and equal_variances_bool are booleans. These flags determine whether a parametric or a non-parametric test is selected, so a value of the wrong type must not be accepted.

# Only genuine bool values are accepted for the two assumption flags.
if not isinstance(normality_bool, bool):
    raise TypeError("hypothesis_predictor_core_n(): The 'normality_bool' must be a boolean.")
if not isinstance(equal_variances_bool, bool):
    raise TypeError("hypothesis_predictor_core_n(): The 'equal_variances_bool' must be a boolean.")
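A consequence of checking with isinstance(..., bool) is that values which merely behave like booleans are rejected. The sketch below illustrates this with two hypothetical helpers, require_bool() and is_rejected(), which are not part of the function and exist only for the example.

```python
def require_bool(value, name):
    """Hypothetical helper mirroring the isinstance(..., bool) checks above."""
    if not isinstance(value, bool):
        raise TypeError(f"The '{name}' must be a boolean.")
    return value


def is_rejected(value):
    """Return True if require_bool() raises TypeError for this value."""
    try:
        require_bool(value, "normality_bool")
    except TypeError:
        return True
    return False


# Genuine booleans pass.
assert not is_rejected(True)
assert not is_rejected(False)

# bool subclasses int, but an int is not a bool, so 0 and 1 are rejected.
assert is_rejected(1)
assert is_rejected(0)

# Truthy or falsy stand-ins are rejected as well.
assert is_rejected("True")
assert is_rejected(None)
```

The same applies to numpy.bool_, which is not a subclass of bool. A flag computed from a NumPy comparison would therefore need to be wrapped in bool(...) before being passed in.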

Value Validations

Data Presence Check
- Checks that the DataFrame is not empty, ensuring that there are data available for analysis.

# An empty DataFrame leaves nothing to test.
if df.empty:
    raise ValueError("hypothesis_predictor_core_n(): The input DataFrame is empty.")

Column Existence Check
- Verifies the presence of target_variable and grouping_variable in the DataFrame. This step is crucial as these columns are needed for grouping and analysis.

# Both named columns must exist before they can be used for grouping and testing.
if target_variable not in df.columns:
    raise ValueError(f"hypothesis_predictor_core_n(): The target variable '{target_variable}' was not found in the DataFrame.")
if grouping_variable not in df.columns:
    raise ValueError(f"hypothesis_predictor_core_n(): The grouping variable '{grouping_variable}' was not found in the DataFrame.")

Data Type Check for Target Variable
- Ensures that the target_variable is numerical, which is necessary for the statistical tests that will be applied.

# Only one column is passed in, so the first entry of the result is its flag.
target_variable_is_numerical = evaluate_dtype(df, [target_variable], output='list_n')[0]
if not target_variable_is_numerical:
    raise ValueError(f"hypothesis_predictor_core_n(): The target variable '{target_variable}' must be a numerical variable.")

Data Type Check for Grouping Variable
- Ensures that the grouping_variable is categorical, as it is used to define groups for comparison.

# Only one column is passed in, so the first entry of the result is its flag.
grouping_variable_is_categorical = evaluate_dtype(df, [grouping_variable], output='list_c')[0]
if not grouping_variable_is_categorical:
    raise ValueError(f"hypothesis_predictor_core_n(): The grouping variable '{grouping_variable}' must be a categorical variable.")
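Taken together, the checks run types first and values second, so that each later check can rely on the earlier ones having passed. The sketch below condenses that ordering into one hypothetical helper, validate_inputs(). The helper name and the shortened messages are assumptions for illustration, and pandas.api.types.is_numeric_dtype is used only as a crude stand-in for evaluate_dtype(), whose actual classification rules are not described here. The boolean flag checks are left out for brevity.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def validate_inputs(df, target_variable, grouping_variable):
    """Hypothetical condensed validator; not the function's actual code."""
    # Type checks come first, so the value checks can safely use DataFrame methods.
    if not isinstance(df, pd.DataFrame):
        raise TypeError("The 'df' parameter must be a pandas DataFrame.")
    for name in (target_variable, grouping_variable):
        if not isinstance(name, str):
            raise TypeError("Variable names must be strings.")

    # Value checks: data present, columns present, dtypes suitable.
    if df.empty:
        raise ValueError("The input DataFrame is empty.")
    for name in (target_variable, grouping_variable):
        if name not in df.columns:
            raise ValueError(f"Column '{name}' was not found in the DataFrame.")

    # Stand-in for evaluate_dtype(): numeric dtype means numerical,
    # anything else is treated as categorical (an assumption).
    if not is_numeric_dtype(df[target_variable]):
        raise ValueError(f"'{target_variable}' must be a numerical variable.")
    if is_numeric_dtype(df[grouping_variable]):
        raise ValueError(f"'{grouping_variable}' must be a categorical variable.")


if __name__ == "__main__":
    data = pd.DataFrame({"height": [1.70, 1.82, 1.65], "team": ["a", "b", "a"]})
    validate_inputs(data, "height", "team")  # passes silently
    try:
        validate_inputs(data, "team", "height")  # roles swapped
    except ValueError as exc:
        print(f"Rejected: {exc}")
```

Note that df.empty is true only when the DataFrame has no rows or no columns. A DataFrame filled entirely with NaN values is not considered empty and would pass that particular check.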

ETA444 / datasafari