ETA444 / datasafari

DataSafari simplifies complex data science tasks into straightforward, powerful one-liners.
https://datasafari.dev
GNU General Public License v3.0

Construct tests for evaluate_variance() #77

ETA444 closed this issue 5 months ago

ETA444 commented 5 months ago

Summary of Unit Tests for evaluate_variance()

The evaluate_variance() function evaluates the homogeneity of variances across groups in a dataset using Levene's, Bartlett's, and Fligner-Killeen tests. The unit tests verify that the function rejects invalid input types and values with the appropriate errors, and that its statistical computations are correct.

Detailed Breakdown of Tests

Error-Handling Tests

  1. Non-DataFrame Input:

    • Validates that a TypeError is raised when the input is not a DataFrame.
  2. Non-String Target Variable:

    • Checks for a TypeError if the target variable is not a string.
  3. Non-String Grouping Variable:

    • Ensures a TypeError is raised for non-string grouping variables.
  4. Non-Boolean Normality Info:

    • Confirms that a TypeError is triggered by non-boolean normality info inputs.
  5. Non-String Method:

    • Validates that a TypeError is raised for non-string method inputs.
  6. Non-Boolean Pipeline:

    • Checks for a TypeError when the pipeline parameter is not a boolean.
  7. Empty DataFrame:

    • Ensures handling of an empty DataFrame by raising a ValueError.
  8. Non-Existent Target Variable:

    • Confirms that a ValueError is raised for non-existent target variables.
  9. Non-Existent Grouping Variable:

    • Ensures a ValueError is raised for non-existent grouping variables.
  10. Invalid Method Input:

    • Checks for a ValueError when an unsupported method is specified.
  11. Invalid Target Variable Type:

    • Validates that a ValueError is raised when the target variable is not numerical.
  12. Invalid Grouping Variable Type:

    • Ensures a ValueError is raised when the grouping variable is not categorical.
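
The list above separates two failure classes: a TypeError for an argument of the wrong type, and a ValueError for an argument of the right type that cannot be used. A minimal sketch of that ordering follows. It is not DataSafari's code: check_inputs, the dict-of-lists table standing in for a DataFrame, and the method names are all hypothetical, chosen so the sketch runs on the standard library alone, and it covers only a subset of the twelve cases.

```python
# Hypothetical stand-in for the input checks; not the datasafari implementation.
SUPPORTED_METHODS = {"levene", "bartlett", "fligner", "consensus"}  # assumed names


def check_inputs(table, target, grouping, method="consensus"):
    """Raise TypeError for wrong types, then ValueError for unusable values."""
    # Type checks run first, so a wrong type is never reported as a bad value.
    if not isinstance(table, dict):
        raise TypeError("table must map column names to lists of values")
    for name, value in (("target", target), ("grouping", grouping), ("method", method)):
        if not isinstance(value, str):
            raise TypeError(f"{name} must be a string")

    # Value checks run only once every argument has the expected type.
    if not table:
        raise ValueError("table is empty")
    for column in (target, grouping):
        if column not in table:
            raise ValueError(f"column {column!r} not found")
    if method not in SUPPORTED_METHODS:
        raise ValueError(f"unsupported method {method!r}")
    # bool is a subclass of int, so it is excluded explicitly.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in table[target]
    )
    if not numeric:
        raise ValueError("target column must be numerical")
```

Running the type checks before the value checks is what lets each test in the list assert on exactly one exception class.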

Functionality Tests

  1. Levene Test:

    • Tests the functionality of the Levene test for evaluating variance homogeneity.
  2. Bartlett Test:

    • Evaluates the Bartlett test, particularly when normality is assumed.
  3. Fligner Test:

    • Tests the Fligner-Killeen test, which does not assume normally distributed data.
  4. Consensus Method:

    • Confirms that the consensus method integrates results from multiple tests correctly.
  5. Pipeline Mode:

    • Ensures that the pipeline mode returns a simple boolean indicating consensus on variance equality.
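
To make concrete what a variance-homogeneity test computes, here is a standard-library sketch of the median-centred Levene statistic (the Brown-Forsythe variant). It is an illustration only: levene_w is a hypothetical helper, not part of DataSafari, and it returns the W statistic without a p-value, since the F distribution is not available in the standard library.

```python
from statistics import mean, median


def levene_w(groups):
    """Return the median-centred Levene (Brown-Forsythe) W statistic.

    groups is a list of lists of numbers, one inner list per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of every observation from its own group median.
    deviations = []
    for g in groups:
        centre = median(g)
        deviations.append([abs(x - centre) for x in g])

    group_means = [mean(zs) for zs in deviations]
    grand_mean = mean([z for zs in deviations for z in zs])

    # Spread of the group mean deviations around the grand mean deviation.
    between = sum(
        len(zs) * (m - grand_mean) ** 2 for zs, m in zip(deviations, group_means)
    )
    # Spread of the deviations inside each group (zero if every group is constant).
    within = sum(
        (z - m) ** 2 for zs, m in zip(deviations, group_means) for z in zs
    )
    return (n_total - k) / (k - 1) * between / within
```

A larger W indicates stronger evidence against equal variances. Shifting a group by a constant leaves W unchanged, while rescaling a group increases it.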

Example Code from the Suite

The following test from the suite covers the "Non-DataFrame Input" case:

import pytest
from datasafari.evaluator import evaluate_variance  # import path is an assumption


def test_evaluate_variance_non_dataframe_input():
    """Test that non-DataFrame input raises a TypeError."""
    with pytest.raises(TypeError, match="The 'df' parameter must be a pandas DataFrame."):
        evaluate_variance("not a dataframe", 'Data', 'Group')

This test ensures that the function raises an appropriate error when it receives an input that is not a pandas DataFrame, preventing further execution that could lead to incorrect evaluations or crashes.

Full Test Suite Access

To see how each test is implemented, refer to the full test suite: Evaluate Variance Test Suite.