Implement new evaluate_variance() method: 'consensus'

Title: Implementing Consensus Method for Variance Homogeneity Testing

Description: This update adds a 'consensus' method to evaluate_variance() for assessing variance homogeneity across groups in a dataset. The method aggregates the results of several variance tests (Levene, Fligner-Killeen, and Bartlett) and reports the majority verdict on whether the group variances are equal. This is most useful when the individual tests disagree with each other.

Example Usage:

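import pandas as pd
import numpy as np
# evaluate_variance is assumed to be imported from the datasafari package

# Build an example dataset with three randomly assigned groups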
data = {
    'Group': np.random.choice(['A', 'B', 'C'], 100),
    'Data': np.random.normal(0, 1, 100)
}
df = pd.DataFrame(data)

# Evaluate variance homogeneity using the consensus method
variance_homogeneity = evaluate_variance(df, 'Data', 'Group', method='consensus')
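
Variance tests compare one sample per group, so a call like the one above presumably has to split the target column by the grouping column first. The following is a minimal sketch of that step using pandas groupby. The helper name split_by_group is hypothetical and is not taken from evaluate_variance.py.

```python
import numpy as np
import pandas as pd


def split_by_group(df, target, grouping):
    """Return a dict mapping each group label to its values as a NumPy array.

    Hypothetical helper for illustration; not part of the datasafari API.
    """
    return {
        label: group[target].to_numpy()
        for label, group in df.groupby(grouping)
    }


rng = np.random.default_rng(42)  # seeded so the example is reproducible
df = pd.DataFrame({
    'Group': rng.choice(['A', 'B', 'C'], 100),
    'Data': rng.normal(0, 1, 100),
})

samples = split_by_group(df, 'Data', 'Group')
for label, values in samples.items():
    # ddof=1 gives the sample variance, the quantity the tests compare
    print(label, len(values), round(values.var(ddof=1), 3))
```

Each array in samples can then be passed to a variance test as one group.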

Expected Outcome: The call prints a consensus summary followed by the detailed results of each individual test. It returns output_info by default, or a single boolean consensus (True when the tests agree on equal variances) when the pipeline flag is set.

Additional Context: The consensus method gives the variance homogeneity evaluation module a single consolidated verdict, so users no longer have to reconcile several test results by hand. This simplifies downstream decisions in statistical analyses and hypothesis testing.

Implementation Summary:

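The 'consensus' method evaluates variance homogeneity by combining the results of the Levene, Bartlett, and Fligner-Killeen tests under a majority rule: variances are treated as homogeneous when more than half of the tests say so. Bartlett's test is included only when normality information is available, so the vote uses two or three tests. If two tests split evenly, no consensus is reported and the Levene result is used. Because it yields a single boolean verdict, the method is especially useful for automated analysis pipelines.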
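
The aggregation described above can be sketched with SciPy's implementations of the three tests. This is a minimal illustration, not the code in evaluate_variance.py: the helper name run_variance_tests, the include_bartlett flag, the 0.05 significance level, and the 'fligner' and 'bartlett' dictionary keys are all assumptions.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05  # assumed significance level


def run_variance_tests(samples, include_bartlett=True, alpha=ALPHA):
    """Map each test name to True when it does NOT reject equal variances.

    Hypothetical helper; key names other than 'levene' are assumptions.
    """
    variance_info = {
        'levene': bool(stats.levene(*samples).pvalue > alpha),
        'fligner': bool(stats.fligner(*samples).pvalue > alpha),
    }
    if include_bartlett:
        # Bartlett is sensitive to non-normality, so it is optional here
        variance_info['bartlett'] = bool(stats.bartlett(*samples).pvalue > alpha)
    return variance_info


rng = np.random.default_rng(0)
narrow = rng.normal(0, 1, 200)   # standard deviation 1
wide = rng.normal(0, 10, 200)    # standard deviation 10

variance_info = run_variance_tests([narrow, wide])
votes = list(variance_info.values())
# Majority rule: more than half of the tests must vote True
consensus = votes.count(True) > len(votes) / 2
print(variance_info, consensus)
```

With spreads this different, every test should reject equal variances, so the majority vote comes out False.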

Code Breakdown:

Method Header:

Purpose: Open the 'consensus' branch and collect each test's boolean verdict into a list.

if method == 'consensus':
    # more than half of the tests must return True for the consensus to be True
    variance_results = list(variance_info.values())

Count True and False Results:

Purpose: Count the number of tests that conclude equal and unequal variances.

true_count = variance_results.count(True)
false_count = variance_results.count(False)
half_point = len(variance_results) / 2  # 1.5 for three tests, 1.0 for two

Consensus Result Calculation:

Purpose: Determine the consensus result based on the majority of test outcomes.

if true_count > half_point:
    consensus_percent = (true_count / len(variance_results)) * 100
    variance_consensus_text = f"  ➡ Result: Consensus is reached.\n  ➡ {consensus_percent:.0f}% of tests suggest equal variance between samples. *\n\n* Detailed results of each test are provided below:\n"
    variance_consensus = True

elif true_count < half_point:
    consensus_percent = (false_count / len(variance_results)) * 100
    variance_consensus_text = f"  ➡ Result: Consensus is reached.\n  ➡ {consensus_percent:.0f}% of tests suggest unequal variance between samples. *\n\n* Detailed results of each test are provided below:\n"
    variance_consensus = False

else:
    # tie (only possible when two tests are used): fall back to the Levene result
    variance_consensus_text = "  ➡ Result: Consensus is not reached.\n\n∴ Please refer to the results of each test below:\n"
    variance_consensus = variance_info['levene']
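
The three branches above can be condensed into one small function, which makes the tie case easier to test in isolation. This is a sketch under assumptions: the function name decide_consensus, its tie_breaker parameter, and the (consensus, reached) return shape are hypothetical and not part of evaluate_variance.py.

```python
def decide_consensus(variance_info, tie_breaker='levene'):
    """Return (consensus, reached) from a dict of test name -> bool.

    True means the test found no evidence of unequal variances.
    """
    votes = list(variance_info.values())
    true_count = votes.count(True)
    half_point = len(votes) / 2  # 1.5 for three tests, 1.0 for two

    if true_count > half_point:
        return True, True
    if true_count < half_point:
        return False, True
    # Exact tie (only possible with an even number of tests):
    # fall back to a single test, mirroring the Levene fallback above.
    return variance_info[tie_breaker], False


# Three tests, two vote for equal variances: consensus reached
print(decide_consensus({'levene': True, 'fligner': True, 'bartlett': False}))
# -> (True, True)

# Two tests that disagree: no consensus, the Levene verdict is returned
print(decide_consensus({'levene': False, 'fligner': True}))
# -> (False, False)
```

Returning the reached flag separately lets a caller distinguish a true majority from a Levene fallback, which the boolean alone cannot convey.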

Output Results:

Purpose: Display the console output and handle the return based on the pipeline flag.

print(f"< VARIANCE TESTING: CONSENSUS >\nThe consensus method bases its conclusion on 2-3 tests: Levene test, Fligner-Killeen test, Bartlett test. (Note: More than 50% must have the same outcome to reach consensus.)\n\n{variance_consensus_text}")
print(levene_title, levene_text, levene_tip)
print(fligner_title, fligner_text, fligner_tip)
if normality_info:
    print(bartlett_title, bartlett_text, bartlett_tip)
else:
    # the note must be printed explicitly; a bare string in a conditional expression is discarded
    print("\n\n< NOTE ON BARTLETT >\nBartlett was not used in consensus as no normality info has been provided or data is non-normal. Accuracy of Bartlett's test results relies heavily on normality.")

return output_info if not pipeline else variance_consensus

Link to Full Code: evaluate_variance.py

ETA444 / datasafari

Implement new evaluate_variance() method: 'consensus' #69