Closed ETA444 closed 6 months ago
Implementation Summary:
The 'lilliefors' method in the evaluate_normality() function applies the Lilliefors test, an adaptation of the Kolmogorov-Smirnov test for normality. It is useful for small to moderately sized samples and does not require the mean and variance to be known in advance; both are estimated from the data. It is more sensitive to deviations from normality near the center of the distribution than in the tails.
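The idea in the summary can be illustrated with a rough sketch that uses only the Python standard library. It is not the code used by evaluate_normality(), which calls a lilliefors() routine. The helper names lilliefors_statistic and monte_carlo_pvalue, the simulation count, and the seed are assumptions made for illustration. The statistic is the largest gap between the empirical CDF and a normal CDF whose mean and standard deviation are estimated from the sample. Because the parameters are estimated, the standard Kolmogorov-Smirnov critical values no longer apply, so the p-value is approximated here by simulation.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def lilliefors_statistic(sample):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    xs = sorted(sample)
    n = len(xs)
    # Mean and standard deviation are estimated from the data, not supplied.
    fitted = NormalDist(fmean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def monte_carlo_pvalue(sample, n_sim=1000, seed=0):
    """Approximate p-value: share of simulated normal samples with a gap at least as large."""
    rng = random.Random(seed)
    n = len(sample)
    observed = lilliefors_statistic(sample)
    exceed = sum(
        lilliefors_statistic([rng.gauss(0.0, 1.0) for _ in range(n)]) >= observed
        for _ in range(n_sim)
    )
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_sim + 1)


# Two deterministic samples: evenly spaced normal quantiles and exponential quantiles.
normal_like = [NormalDist().inv_cdf((i + 0.5) / 50) for i in range(50)]
skewed = [-math.log(1 - (i + 0.5) / 50) for i in range(50)]

p_normal_like = monte_carlo_pvalue(normal_like)
p_skewed = monte_carlo_pvalue(skewed)
```

The sketch needs at least two distinct values per sample, since the fitted standard deviation must be positive. A table-based routine is much faster than simulation; the simulation is shown only because it makes the role of the estimated parameters explicit.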
Code Breakdown:
Calculate Lilliefors Statistic and P-values:
lilliefors_stats = [
    lilliefors(df[df[grouping_variable] == group][target_variable])[0]
    for group in groups
]
lilliefors_pvals = [
    lilliefors(df[df[grouping_variable] == group][target_variable])[1]
    for group in groups
]
The test is run on each group's target_variable values.
The resulting statistics and p-values are collected in lilliefors_stats and lilliefors_pvals.
Determine Normality:
lilliefors_normality = [p > 0.05 for p in lilliefors_pvals]
Each p-value is compared against the 0.05 significance level (True indicates normality, False indicates non-normality).
Prepare Output:
lilliefors_info = {
    group: {
        'stat': lilliefors_stats[n],
        'p': lilliefors_pvals[n],
        'normality': lilliefors_normality[n]
    } for n, group in enumerate(groups)
}
lilliefors_text = [
    f"Results for '{key}' group in variable ['{target_variable}']:\n ➡ statistic: {value['stat']}\n ➡ p-value: {value['p']}\n{(f' ∴ Normality: Yes (H0 cannot be rejected)' if value['normality'] else f' ∴ Normality: No (H0 rejected)')}\n\n"
    for key, value in lilliefors_info.items()
]
lilliefors_title = f"< NORMALITY TESTING: LILLIEFORS' TEST >\n\n"
lilliefors_tip = "☻ Tip: The Lilliefors test is an adaptation of the Kolmogorov-Smirnov test for normality with the benefit of not requiring the mean and variance to be known parameters. It's particularly useful for small to moderately sized samples and is sensitive to deviations from normality in the center of the distribution rather than the tails. This makes it complementary to tests like the Anderson-Darling when a comprehensive assessment of normality is needed.\n"
lilliefors_info holds the results for each group, with group names as keys and dictionaries containing the test statistic, p-value, and normality conclusion as values.
The lilliefors_text list formats these results for each group.
lilliefors_title and lilliefors_tip are used for console output headers and tips.
Output Results and Return:
Results are saved, printed, and then returned according to the pipeline parameter.
# saving info
output_info['lilliefors'] = lilliefors_info
normality_info['lilliefors_group_consensus'] = all(lilliefors_normality)
# end it here if non-consensus method
if method == 'lilliefors':
    print(lilliefors_title, *lilliefors_text, lilliefors_tip)
    return output_info if not pipeline else normality_info['lilliefors_group_consensus']
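Results are saved in the output_info and normality_info dictionaries.
The return value depends on the pipeline flag: output_info when it is False, the group consensus boolean when it is True.
Link to Full Code: evaluate_normality.py.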
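The per-group flow and the pipeline switch described in the breakdown can be condensed into a short sketch. This is an illustration under stated assumptions, not the code in evaluate_normality.py: evaluate_groups, test_fn, and fake_test are hypothetical names, and test_fn stands in for any routine that returns a (statistic, p-value) pair. Calling the test once per group and unpacking the pair avoids running it twice, which the list comprehensions in the breakdown do.

```python
def evaluate_groups(samples_by_group, test_fn, alpha=0.05, pipeline=False):
    """Run test_fn once per group and summarise the results.

    test_fn is a hypothetical stand-in for the Lilliefors routine; it must
    return a (statistic, p_value) pair.
    """
    info = {}
    for group, values in samples_by_group.items():
        stat, p = test_fn(values)  # one call yields both numbers
        info[group] = {"stat": stat, "p": p, "normality": p > alpha}
    # Consensus holds only if no group rejects the null hypothesis.
    consensus = all(entry["normality"] for entry in info.values())
    # pipeline=True returns the single boolean, otherwise the full report.
    return consensus if pipeline else info


def fake_test(values):
    """Stand-in with fixed results, used only to show the control flow."""
    return (0.10, 0.40) if len(values) > 3 else (0.35, 0.01)


data = {"a": [1, 2, 3, 4, 5], "b": [1, 2, 3]}
report = evaluate_groups(data, fake_test)
verdict = evaluate_groups(data, fake_test, pipeline=True)
```

Here report carries the statistic, p-value, and decision per group, while verdict is the single boolean a downstream pipeline step would branch on.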
Title: Introducing Lilliefors' Test for Normality Assessment
Description: This update adds the Lilliefors test to the normality testing module. The test adapts the Kolmogorov-Smirnov test for normality and does not require the mean and variance to be known. It is well suited to small to moderately sized samples and is sensitive to deviations from normality in the center of the distribution, giving users a further check on their data's distribution.
Example Usage:
Expected Outcome: Users can test normality without specifying the mean and variance, which suits small to moderately sized samples. Because the test responds mainly to deviations in the center of the distribution, it complements tail-sensitive tests such as Anderson-Darling and broadens the toolkit for evaluating the distribution of the data.
Additional Context: Lilliefors' test is a practical choice when the population mean and variance are unknown and must be estimated from the sample. Its sensitivity to the central part of the distribution makes it a useful addition to the normality testing module, in line with the goal of providing robust and comprehensive tools for statistical analysis and hypothesis testing.