Closed samimak37 closed 3 years ago
This is resolved for our word-window module (`proximity`), but not yet for `gender_frequency`.
Thanks for identifying this opportunity! PR #169 resolves this for `gender_frequency`, and PR #168 will resolve it for `instance_distance`. These will both be in place by next Monday, so I'm closing this issue.
Right now, many of the analysis modules have "author_gender", "location", and "year" functions that group the result data by the corresponding metadata field (see these three):
https://github.com/dhmit/gender_analysis/blob/e790107035c536f02e9810fa3b05c1131b9f4de2/gender_analysis/analysis/gender_frequency.py#L429
https://github.com/dhmit/gender_analysis/blob/e790107035c536f02e9810fa3b05c1131b9f4de2/gender_analysis/analysis/gender_frequency.py#L383
https://github.com/dhmit/gender_analysis/blob/e790107035c536f02e9810fa3b05c1131b9f4de2/gender_analysis/analysis/gender_frequency.py#L476
All three of these functions have very similar implementations across the modules, and the choice of metadata fields they cover feels somewhat arbitrary. We should create a general version of these functions so that a user can run these analyses on any metadata field they define.
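To make the proposal concrete, here is a minimal sketch of what a general grouping helper could look like. It is not the implementation from PR #168 or #169. The name `group_by_metadata`, its parameters, and the plain dicts used as stand-ins for documents are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical helper: groups documents by an arbitrary metadata field.
# Documents are modeled as plain dicts here; this is an assumption and
# not how the library itself represents them.
def group_by_metadata(documents, field, bin_func=None, missing="unknown"):
    """Return a dict mapping each value of `field` to a list of documents.

    bin_func: optional callable that maps a raw value to a group key
              (for example, a year to its decade).
    missing:  key used for documents that lack the field.
    """
    groups = defaultdict(list)
    for doc in documents:
        value = doc.get(field)
        if value is None:
            key = missing
        elif bin_func is not None:
            key = bin_func(value)
        else:
            key = value
        groups[key].append(doc)
    return dict(groups)


# Made-up sample metadata, for illustration only.
docs = [
    {"title": "A", "author_gender": "female", "year": 1813, "location": "England"},
    {"title": "B", "author_gender": "male", "year": 1851, "location": "US"},
    {"title": "C", "author_gender": "female", "year": 1847},
]

by_gender = group_by_metadata(docs, "author_gender")
by_decade = group_by_metadata(docs, "year", bin_func=lambda y: y // 10 * 10)
by_location = group_by_metadata(docs, "location")
```

With a helper along these lines, the separate "author_gender", "location", and "year" functions would reduce to one call with a different `field` argument, and a user-defined metadata field would work the same way without any new code in the analysis modules.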