Overview

This PR does three things:

- Follows up on PR #151 (the separation of all `corpus` and `document` functionality into its own package) by reorganizing all `corpus_analysis` and `gender_analysis` modules into a more conventional Python architecture. All packages now live in a single base directory (`gender_analysis`); I've renamed a few of our packages to follow the Python style guide recommendations (link); and common testing files (including common variables and text files) have moved into their own directory (`testing`), while package-specific tests remain with their associated packages. The Gender Analysis Toolkit will therefore consist of three functional packages, `text`, `gender`, and `analysis`, along with a `testing` package.
- Updates our `__init__.py` files to (TBD).
- Deletes `gender_adjectives.py`. This module has been fully replaced by the `proximity` module (PR #159).

One upshot of this change is that all of our modules are now linted correctly on GitHub. Previously, because `corpus_analysis` was not inside the `gender_analysis` directory, it was not being linted by our GitHub Actions. This change resolves that, and it surfaced a number of possible linting fixes. I made the ones that seemed most relevant and skipped the others.

Directory structure before
Directory structure after
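
As a rough sanity check of the layout described in the overview, something like the following could confirm that the four packages sit directly under the single base directory. This is only a sketch: the package names come from the overview, but the helper names (`missing_packages`, `build_demo_layout`) are hypothetical, and the demo builds an empty tree in a temporary directory rather than inspecting the real repository. It says nothing about the modules inside each package.

```python
import tempfile
from pathlib import Path

# Package names taken from the overview; everything else here is illustrative.
EXPECTED_PACKAGES = ("text", "gender", "analysis", "testing")


def missing_packages(base_dir):
    """Return the expected names that are not packages directly under base_dir.

    A directory counts as a package here only if it contains __init__.py.
    """
    base = Path(base_dir)
    return [
        name
        for name in EXPECTED_PACKAGES
        if not (base / name / "__init__.py").is_file()
    ]


def build_demo_layout(root, packages):
    """Create an empty gender_analysis/ tree under root (demo only)."""
    base = Path(root) / "gender_analysis"
    for name in packages:
        (base / name).mkdir(parents=True)
        (base / name / "__init__.py").touch()
    return base


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = build_demo_layout(tmp, EXPECTED_PACKAGES)
        print(missing_packages(base))  # prints []
```

Pointing `missing_packages` at a local checkout's `gender_analysis` directory would list any of the four packages that are absent or lack an `__init__.py`.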