Construct tests for hypothesis_predictor_core_n()

Summary of Unit Tests for `hypothesis_predictor_core_n()`

The function hypothesis_predictor_core_n() runs statistical hypothesis tests that compare a numerical target variable across the groups of a categorical grouping variable. It supports the independent-samples t-test, one-way ANOVA, the Mann-Whitney U test, and the Kruskal-Wallis H-test. Judging from the tests listed below, the choice depends on the number of groups and on the normality and equal-variance flags. The unit tests check that invalid input raises the expected errors and that each supported statistical test runs correctly.
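The choice among the four tests can be pictured as a small decision table. The sketch below is an illustration only: `choose_test_sketch` is a hypothetical helper, not part of datasafari, and its mapping is inferred from the test descriptions in this summary, so the real function may branch differently.

```python
def choose_test_sketch(n_groups: int, normality_bool: bool,
                       equal_variances_bool: bool) -> str:
    """Hypothetical helper: name the test suggested by the group count and flags."""
    if n_groups < 2:
        raise ValueError("At least two groups are needed for a comparison.")
    if n_groups == 2:
        # Assumption: the equal-variance flag is ignored for two groups,
        # because the summary does not say how unequal variances are handled.
        return "t-test" if normality_bool else "Mann-Whitney U"
    # More than two groups: ANOVA only when both assumptions hold.
    if normality_bool and equal_variances_bool:
        return "ANOVA"
    return "Kruskal-Wallis"
```

For example, `choose_test_sketch(2, False, True)` names the Mann-Whitney U test, which matches the two-group, non-normal case described in the functionality tests.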

Detailed Breakdown of Tests

Error-Handling Tests

Non-DataFrame Input:
- Confirms a TypeError is raised when the input is not a DataFrame.
Non-String Target Variable:
- Checks for a TypeError if the target variable is not a string identifier.
Non-String Grouping Variable:
- Ensures a TypeError is raised for non-string grouping variables.
Non-Boolean Normality and Equal Variances Flags:
- Validates that a TypeError is raised when normality_bool or equal_variances_bool is not a boolean.
Empty DataFrame:
- Tests that a ValueError is raised when the input DataFrame is empty.
Non-Existent Target or Grouping Variable:
- Ensures that a ValueError is raised when specified variables are not found in the DataFrame.
Non-Numerical Target Variable:
- Checks for a ValueError when the target variable data type is not numerical.
Non-Categorical Grouping Variable:
- Verifies that a ValueError is raised when the grouping variable is not categorical.
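The checks listed above could be exercised against a validator along the following lines. This is a minimal sketch, not the datasafari implementation: `validate_inputs_sketch`, its parameter names other than the two flags, its error messages, and its reading of "categorical" as "any non-numeric dtype" are all assumptions made for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def validate_inputs_sketch(df, target_variable, grouping_variable,
                           normality_bool, equal_variances_bool):
    """Hypothetical stand-in that mirrors the error-handling checks listed above."""
    # Type checks come first so that later column access is safe.
    if not isinstance(df, pd.DataFrame):
        raise TypeError("df must be a pandas DataFrame.")
    if not isinstance(target_variable, str):
        raise TypeError("target_variable must be a string.")
    if not isinstance(grouping_variable, str):
        raise TypeError("grouping_variable must be a string.")
    if not isinstance(normality_bool, bool) or not isinstance(equal_variances_bool, bool):
        raise TypeError("normality_bool and equal_variances_bool must be booleans.")
    # Value checks follow once the types are known to be valid.
    if df.empty:
        raise ValueError("df must not be empty.")
    for name in (target_variable, grouping_variable):
        if name not in df.columns:
            raise ValueError(f"Column '{name}' not found in df.")
    if not is_numeric_dtype(df[target_variable]):
        raise ValueError("target_variable must be numerical.")
    # Assumption: any numeric dtype counts as "not categorical".
    if is_numeric_dtype(df[grouping_variable]):
        raise ValueError("grouping_variable must be categorical.")


# Valid input passes silently.
example_df = pd.DataFrame({"Value": [1.0, 2.0, 3.0, 4.0],
                           "Group": ["a", "a", "b", "b"]})
validate_inputs_sketch(example_df, "Value", "Group", True, True)
```

Ordering the type checks before the value checks means each test in the list can trigger exactly one error by changing a single argument.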

Functionality Tests

T-Test for Two Groups:
- Tests the independent-samples t-test for two groups with normally distributed data.
Mann-Whitney U Test:
- Verifies the Mann-Whitney U test for non-normal distributions when comparing two groups.
One-Way ANOVA:
- Assesses the ANOVA test for comparing more than two groups assuming equal variances and normality.
Kruskal-Wallis Test:
- Evaluates the Kruskal-Wallis test for comparing more than two groups without assuming normal distributions.
Handling More Than Three Groups with ANOVA:
- Tests the ANOVA's ability to handle comparisons involving more than three groups.
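One way to build functionality tests like these is to generate synthetic groups whose difference is large enough that every test should detect it. The sketch below calls the SciPy routines directly rather than hypothesis_predictor_core_n(), whose return format is not described here. The group sizes, means, seed, and the 0.05 threshold are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # fixed seed keeps the tests deterministic

# Three synthetic groups with clearly separated means (illustrative data only).
group_a = rng.normal(loc=0.0, scale=1.0, size=50)
group_b = rng.normal(loc=5.0, scale=1.0, size=50)
group_c = rng.normal(loc=10.0, scale=1.0, size=50)


def test_two_group_tests_detect_difference():
    """The t-test and the Mann-Whitney U test should both flag the large shift."""
    _, p_t = stats.ttest_ind(group_a, group_b, equal_var=True)
    _, p_u = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
    assert p_t < 0.05
    assert p_u < 0.05


def test_multi_group_tests_detect_difference():
    """One-way ANOVA and the Kruskal-Wallis test should both flag the large shift."""
    _, p_f = stats.f_oneway(group_a, group_b, group_c)
    _, p_h = stats.kruskal(group_a, group_b, group_c)
    assert p_f < 0.05
    assert p_h < 0.05
```

Separating the group means by several standard deviations keeps the assertions stable, so a failure points to a real problem rather than to sampling noise.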

Example Code from the Suite

Here's an example of a test that checks the handling of non-DataFrame inputs:

import pytest

# Assumes hypothesis_predictor_core_n has already been imported from the datasafari package.

def test_hypothesis_predictor_core_n_non_dataframe_input():
    """Test that non-DataFrame input raises a TypeError."""
    with pytest.raises(TypeError, match="pandas DataFrame"):
        hypothesis_predictor_core_n("not_a_dataframe", 'Value', 'Group', True, True)

This test confirms that the function validates the type of its first argument up front, so invalid input fails early with a TypeError whose message mentions "pandas DataFrame".

Full Test Suite Access

For all tests and their implementations, see the full test suite: Hypothesis Predictor Core N Test Suite.

ETA444 / datasafari