Implement error handling for predict_hypothesis()

Implementation Summary

The function predict_hypothesis() automates the selection and execution of hypothesis tests based on the characteristics of two input variables within a DataFrame. It determines whether a categorical or numerical hypothesis test is appropriate, checks the required assumptions, runs the test, and reports detailed results.

Detailed Error Handling Breakdown

Type Validations

DataFrame Type Check
- Ensures that the input df is a pandas DataFrame, which is essential for data manipulation and access throughout the function.

if not isinstance(df, pd.DataFrame):
    raise TypeError("predict_hypothesis(): The 'df' parameter must be a pandas DataFrame.")

String Type Checks for Variables
- Verifies that var1 and var2 are strings, as they are expected to reference column names in the DataFrame.

if not isinstance(var1, str) or not isinstance(var2, str):
    raise TypeError("predict_hypothesis(): The 'var1' and 'var2' parameters must be strings.")

String Type Checks for Method Parameters
- Ensures that normality_method, variance_method, and exact_tests_alternative are strings. These parameters dictate the methodology for evaluating assumptions and the direction of hypothesis tests.

if not isinstance(normality_method, str):
    raise TypeError("predict_hypothesis(): The 'normality_method' parameter must be a string.")
if not isinstance(variance_method, str):
    raise TypeError("predict_hypothesis(): The 'variance_method' parameter must be a string.")
if not isinstance(exact_tests_alternative, str):
    raise TypeError("predict_hypothesis(): The 'exact_tests_alternative' parameter must be a string.")

Integer Type Check for Sample Size Parameter
- Confirms that yates_min_sample_size is an integer, since it is used to decide whether Yates' correction is applied.

if not isinstance(yates_min_sample_size, int):
    raise TypeError("predict_hypothesis(): The 'yates_min_sample_size' parameter must be an integer.")
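The type checks above all follow the same pattern, so they could be folded into one helper. The sketch below is only an illustration: check_type is a hypothetical name, not part of datasafari. It also rejects booleans for integer parameters, because bool is a subclass of int in Python and isinstance(True, int) returns True, which means yates_min_sample_size=True would pass the plain check shown above.

```python
def check_type(name, value, expected_type, type_label):
    """Raise TypeError if value is not an instance of expected_type.

    Hypothetical helper for illustration; not part of datasafari.
    """
    # bool is a subclass of int, so reject it explicitly for integer parameters.
    is_bool_as_int = expected_type is int and isinstance(value, bool)
    if is_bool_as_int or not isinstance(value, expected_type):
        raise TypeError(
            f"predict_hypothesis(): The '{name}' parameter must be {type_label}."
        )


# These calls pass silently because the types match.
check_type('var1', 'age', str, 'a string')
check_type('yates_min_sample_size', 40, int, 'an integer')
```

Whether to reject booleans is a design choice; the original check accepts them, and True would then be treated as 1 by the later value check.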

Value Validations

Data Presence Check
- Checks that the DataFrame is not empty to ensure there is data available for hypothesis testing.

if df.empty:
    raise ValueError("predict_hypothesis(): The input DataFrame is empty.")

Method Parameter Validations
- Verifies that normality_method, variance_method, and exact_tests_alternative each match one of their valid options. The comparison is case-insensitive. This step ensures the function uses a supported assumption check and test configuration.

valid_normality_methods = ['shapiro', 'anderson', 'normaltest', 'lilliefors', 'consensus']
if normality_method.lower() not in valid_normality_methods:
    raise ValueError(f"predict_hypothesis(): Invalid 'normality_method' value. Expected one of {valid_normality_methods}, got '{normality_method}'.")

valid_variance_methods = ['levene', 'bartlett', 'fligner', 'consensus']
if variance_method.lower() not in valid_variance_methods:
    raise ValueError(f"predict_hypothesis(): Invalid 'variance_method' value. Expected one of {valid_variance_methods}, got '{variance_method}'.")

valid_alternatives = ['two-sided', 'less', 'greater']
if exact_tests_alternative.lower() not in valid_alternatives:
    raise ValueError(f"predict_hypothesis(): Invalid 'exact_tests_alternative' value. Expected one of {valid_alternatives}, got '{exact_tests_alternative}'.")
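The three option checks above compare a lowercased copy of the argument but do not show whether the lowercased value is kept for later use. A minimal sketch of one way to do both in a single step follows. It assumes the type checks have already run, so value is a string, and validate_choice is a hypothetical name rather than a datasafari function.

```python
def validate_choice(name, value, valid_options):
    """Return value lowercased if it is a valid option, else raise ValueError.

    Hypothetical helper for illustration; assumes value is already a str.
    """
    normalized = value.lower()
    if normalized not in valid_options:
        raise ValueError(
            f"predict_hypothesis(): Invalid '{name}' value. "
            f"Expected one of {valid_options}, got '{value}'."
        )
    return normalized


valid_variance_methods = ['levene', 'bartlett', 'fligner', 'consensus']
# 'Levene' is accepted and the normalized form 'levene' is kept for later use.
variance_method = validate_choice('variance_method', 'Levene', valid_variance_methods)
```

Returning the normalized value means later code can compare against the option lists directly without calling lower() again.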

Sample Size Check
- Ensures that yates_min_sample_size is greater than zero, a necessary condition for the application of Yates' correction.

if yates_min_sample_size < 1:
    raise ValueError("predict_hypothesis(): The 'yates_min_sample_size' must be at least 1.")

ETA444 / datasafari

Implement error handling for predict_hypothesis() #48

You can find the full code here