software-gardening / almanack

An open-source handbook of applied guidance and tools for sustainable software development and maintenance.
BSD 3-Clause "New" or "Revised" License

Creating Shannon Entropy calculation function and editing test datasets #65

Open willdavidson05 opened 5 days ago

willdavidson05 commented 5 days ago

Description

This PR introduces an entropy.py module that computes Shannon entropy from the output of calculate_loc_changes in git_parser.py.

The module includes safeguards to ensure that probability values cannot be negative. A test suite, test_entropy.py, has also been added to check that the computed entropy is never negative.
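For reviewers who want a reference point, here is a minimal sketch of how Shannon entropy could be computed from per-file line changes. It is not the code in entropy.py: the function name `shannon_entropy`, the `dict[str, int]` input shape, and the use of absolute values as the non-negativity safeguard are all assumptions made for illustration.

```python
import math


def shannon_entropy(loc_changes: dict[str, int]) -> float:
    """Return the Shannon entropy (in bits) of per-file line changes.

    Hypothetical helper: assumes `loc_changes` maps file name -> lines changed.
    """
    # Assumed safeguard: take absolute values so no probability is negative.
    counts = [abs(n) for n in loc_changes.values()]
    total = sum(counts)
    if total == 0:
        # No changes at all: treat the entropy as zero.
        return 0.0

    entropy = 0.0
    for n in counts:
        if n > 0:  # skip zero terms, since p * log2(p) tends to 0 as p -> 0
            p = n / total
            entropy -= p * math.log2(p)
    return entropy


# Changes spread evenly over two files give 1 bit of entropy.
print(shannon_entropy({"a.py": 10, "b.py": 10}))  # 1.0
```

Under these assumptions the result is always at least zero, which is the property the test suite is described as checking.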

The test datasets were edited so that a repository contains multiple files; this is done in the high_entropy repo. To support repositories with multiple files, calculate_loc_changes in git_parser.py was modified and now accepts an additional parameter, file_names: list[str].
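As a rough illustration of what restricting the line counts to a list of file names might look like, the sketch below sums added and deleted lines per file from text in the format produced by `git diff --numstat`. This is an assumption, not the implementation in git_parser.py: the helper name `loc_changes_for_files` and the numstat-text input are hypothetical, and only the `file_names: list[str]` parameter comes from this PR.

```python
def loc_changes_for_files(numstat_output: str, file_names: list[str]) -> dict[str, int]:
    """Sum added + deleted lines for each requested file.

    Hypothetical helper: `numstat_output` is assumed to hold lines of the form
    "<added>\t<deleted>\t<path>", as printed by `git diff --numstat`.
    """
    wanted = set(file_names)
    # Start every requested file at zero so absent files still appear.
    changes = {name: 0 for name in file_names}

    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # ignore blank or malformed lines
        added, deleted, path = parts
        if path not in wanted:
            continue
        if added == "-" or deleted == "-":
            continue  # numstat reports binary files with "-" counts
        changes[path] += int(added) + int(deleted)
    return changes


sample = "3\t1\ta.py\n10\t0\tb.py\n"
print(loc_changes_for_files(sample, ["a.py"]))  # {'a.py': 4}
```

Returning a zero entry for requested files that have no changes keeps the result shape predictable for whatever consumes it, such as an entropy calculation.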

The new test repository structure is as follows:

I appreciate any comments or feedback!

Closes #62, #64

What is the nature of your change?

Checklist

Please ensure that all boxes are checked before indicating that this pull request is ready for review.