Closed by ryaanahmed 5 years ago
I think we can quickly do a really basic version of this that would already be useful in time for our initial release.
Instead of trying to guess the gender of names in the document texts, we can expose FEMININE_WORDS and MASCULINE_WORDS as globals or as arguments to the analysis functions that look for gendered words, with the defaults set to ['she', 'her', 'hers'] etc. That way users can add to these lists if they know in advance which names they're looking for.
We can do something more sophisticated for a later release.
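A minimal sketch of the basic version described above, to make the proposal concrete. The function name `count_gendered_words`, its parameters, and the masculine defaults are assumptions for illustration only, not the package's actual API:

```python
import re
from collections import Counter

# Hypothetical module-level defaults; users could reassign or extend these.
FEMININE_WORDS = ['she', 'her', 'hers']
MASCULINE_WORDS = ['he', 'him', 'his']


def count_gendered_words(text, feminine_words=None, masculine_words=None):
    """Count occurrences of feminine and masculine words in text.

    Matching is case-insensitive. When a word list is not passed,
    the module-level default is looked up at call time, so changes
    to the globals take effect too.
    """
    if feminine_words is None:
        feminine_words = FEMININE_WORDS
    if masculine_words is None:
        masculine_words = MASCULINE_WORDS

    # Deduplicate and lowercase the word lists.
    feminine = {word.lower() for word in feminine_words}
    masculine = {word.lower() for word in masculine_words}

    # Very rough tokenization: runs of letters and apostrophes.
    counts = Counter(re.findall(r"[a-z']+", text.lower()))

    return {
        'feminine': sum(counts[word] for word in feminine),
        'masculine': sum(counts[word] for word in masculine),
    }


sample = "Ms. Warren asked a question. He answered her."

# Pronouns only (the defaults).
default_counts = count_gendered_words(sample)

# A user who knows the names in advance can add them to the list.
with_names = count_gendered_words(
    sample, feminine_words=FEMININE_WORDS + ['warren']
)
```

With the defaults, only "her" and "He" are counted in the sample; adding 'warren' to the feminine list also picks up the named speaker. Multi-word names would need a different matching strategy than this single-token lookup.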
@sophiazhi added MASC_WORDS and FEM_WORDS, which are user-settable. We should now propagate this change into the rest of the codebase, particularly...
Done for now via https://github.com/dhmit/gender_analysis/pull/91
Right now, all of our functions that analyze gender within the text (as opposed to in the metadata) do so using pronouns only.
Lots of interesting texts don't use many pronouns (e.g., the congressional hearings that @meesuekim was working with), but they do refer to everyone by name.
Handling names might not be possible before the initial release, but we should work on it eventually.