ETA444 / datasafari

DataSafari simplifies complex data science tasks into straightforward, powerful one-liners.
https://datasafari.dev
GNU General Public License v3.0

Implement error handling for evaluate_normality() #80

Closed by ETA444 6 months ago

ETA444 commented 6 months ago

Implementation Summary

evaluate_normality() validates its inputs before running any normality test. It checks the DataFrame, the parameter types, and the contents of the selected columns, and raises a TypeError or ValueError when a check fails. Every message is prefixed with the function name and states which parameter is at fault, so common input mistakes are easy to locate and fix.

Detailed Error Handling Breakdown

Data Validation

if not isinstance(df, pd.DataFrame):
    raise TypeError("evaluate_normality(): The 'df' parameter must be a pandas DataFrame.")
if target_variable not in df.columns:
    raise ValueError(f"evaluate_normality(): The target variable '{target_variable}' was not found in the DataFrame.")
if grouping_variable not in df.columns:
    raise ValueError(f"evaluate_normality(): The grouping variable '{grouping_variable}' was not found in the DataFrame.")
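
From the caller's side, these checks surface as exceptions that can be caught and inspected. The sketch below copies the three checks into a hypothetical helper, check_data (not part of DataSafari), so that it runs without the library installed. The sample DataFrame and its column names are made up for illustration.

```python
import pandas as pd


def check_data(df, target_variable, grouping_variable):
    """Hypothetical stand-in that mirrors the three data checks above."""
    if not isinstance(df, pd.DataFrame):
        raise TypeError("evaluate_normality(): The 'df' parameter must be a pandas DataFrame.")
    if target_variable not in df.columns:
        raise ValueError(f"evaluate_normality(): The target variable '{target_variable}' was not found in the DataFrame.")
    if grouping_variable not in df.columns:
        raise ValueError(f"evaluate_normality(): The grouping variable '{grouping_variable}' was not found in the DataFrame.")


# Made-up sample data, for illustration only.
sample = pd.DataFrame({"score": [1.2, 3.4, 2.2, 4.1], "group": ["a", "a", "b", "b"]})

check_data(sample, "score", "group")  # valid input: no exception is raised

errors = []
for args in [
    ({"score": [1.2]}, "score", "group"),  # a dict, not a DataFrame
    (sample, "height", "group"),           # target column is missing
    (sample, "score", "team"),             # grouping column is missing
]:
    try:
        check_data(*args)
    except (TypeError, ValueError) as exc:
        errors.append((type(exc).__name__, str(exc)))

for name, message in errors:
    print(f"{name}: {message}")
```

Because each message names the offending parameter and its value, the caller can usually fix the call without reading the function's source.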

Parameter Type Validation

if not isinstance(target_variable, str) or not isinstance(grouping_variable, str):
    raise TypeError("evaluate_normality(): The 'target_variable' and 'grouping_variable' parameters must be strings.")
if not isinstance(method, str):
    raise TypeError("evaluate_normality(): The 'method' parameter must be a string.")
if not isinstance(pipeline, bool):
    raise TypeError("evaluate_normality(): The 'pipeline' parameter must be a boolean.")
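
One consequence of checking with isinstance() is that near-miss values are rejected rather than coerced: an integer 1 is not accepted for 'pipeline', and a list of method names is not accepted for 'method'. The following is a minimal sketch using a hypothetical helper, check_param_types, that copies the checks above. The baseline argument values are illustrative and are not claimed to be the function's defaults.

```python
def check_param_types(target_variable, grouping_variable, method, pipeline):
    """Hypothetical stand-in that mirrors the parameter type checks above."""
    if not isinstance(target_variable, str) or not isinstance(grouping_variable, str):
        raise TypeError("evaluate_normality(): The 'target_variable' and 'grouping_variable' parameters must be strings.")
    if not isinstance(method, str):
        raise TypeError("evaluate_normality(): The 'method' parameter must be a string.")
    if not isinstance(pipeline, bool):
        raise TypeError("evaluate_normality(): The 'pipeline' parameter must be a boolean.")


def raises_type_error(**overrides):
    """Return True if the checks reject a valid baseline with the given overrides."""
    # Illustrative baseline values (assumed, not taken from the library).
    params = {
        "target_variable": "score",
        "grouping_variable": "group",
        "method": "consensus",
        "pipeline": False,
    }
    params.update(overrides)
    try:
        check_param_types(**params)
    except TypeError:
        return True
    return False


print(raises_type_error())                    # baseline passes the checks
print(raises_type_error(pipeline=1))          # an int is not a bool
print(raises_type_error(method=["shapiro"]))  # a list is not a str
print(raises_type_error(target_variable=0))   # a column position is not a name
```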

Content Validation

if df.empty:
    raise ValueError("evaluate_normality(): The input DataFrame is empty.")
# evaluate_dtype() is called with a one-column list; [0] picks the result for that column.
if not evaluate_dtype(df, [target_variable], output='list_n')[0]:
    raise ValueError(f"evaluate_normality(): The target variable '{target_variable}' must be a numerical variable.")
if not evaluate_dtype(df, [grouping_variable], output='list_c')[0]:
    raise ValueError(f"evaluate_normality(): The grouping variable '{grouping_variable}' must be a categorical variable.")
allowed_methods = ['shapiro', 'anderson', 'normaltest', 'lilliefors', 'consensus']
if method not in allowed_methods:
    raise ValueError(f"evaluate_normality(): The method '{method}' is not supported. Allowed methods are: {', '.join(allowed_methods)}.")
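
evaluate_dtype() is DataSafari's own helper, and its classification rules are not shown in this issue. To illustrate the content checks without it, the sketch below substitutes pandas' is_numeric_dtype for the numerical check. That substitution is an assumption and may classify some columns differently from evaluate_dtype(). The categorical check on the grouping variable is left out for the same reason. check_content, error_for, and the sample data are hypothetical.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

ALLOWED_METHODS = ['shapiro', 'anderson', 'normaltest', 'lilliefors', 'consensus']


def check_content(df, target_variable, method):
    """Hypothetical stand-in for part of the content checks above."""
    if df.empty:
        raise ValueError("evaluate_normality(): The input DataFrame is empty.")
    # Assumption: is_numeric_dtype approximates evaluate_dtype(..., output='list_n').
    if not is_numeric_dtype(df[target_variable]):
        raise ValueError(f"evaluate_normality(): The target variable '{target_variable}' must be a numerical variable.")
    if method not in ALLOWED_METHODS:
        raise ValueError(f"evaluate_normality(): The method '{method}' is not supported. Allowed methods are: {', '.join(ALLOWED_METHODS)}.")


def error_for(df, target_variable, method):
    """Return the error message the checks produce, or None if they pass."""
    try:
        check_content(df, target_variable, method)
    except ValueError as exc:
        return str(exc)
    return None


# Made-up sample data, for illustration only.
sample = pd.DataFrame({"score": [1.2, 3.4], "label": ["x", "y"], "group": ["a", "b"]})

print(error_for(sample, "score", "shapiro"))            # passes the checks
print(error_for(sample.iloc[0:0], "score", "shapiro"))  # no rows: empty DataFrame
print(error_for(sample, "label", "shapiro"))            # string column as target
print(error_for(sample, "score", "ks"))                 # method not in the allowed list
```

Listing the allowed methods inside the error message means a caller who mistypes a method name sees the valid options immediately.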
