The character class in the gender_analysis toolkit automatically generates a character list for a given document, with each character’s name, nicknames, and pronouns, and then takes user feedback to produce a manually disambiguated list. The pipeline takes a human-AI collaboration approach: it combines NLTK’s Named Entity Recognition (NER) and Neuralcoref’s coreference resolution model with a manual disambiguation interface. For the gender analysis web interface, we’d like to build a frontend that exposes the core functionality of this pipeline:
MVP:
[x] A user selects a document through our document model
[ ] The backend pipeline automatically outputs a list of character names with their associated nicknames and pronoun probabilities, based on THIS_NOTEBOOK
[ ] A frontend disambiguation interface lets the user validate and correct the pipeline outputs through a dropdown list (or similar)
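To make the backend/frontend contract for the MVP concrete, here is a minimal sketch of what the pipeline output and the disambiguation step could look like. Everything in it is an assumption for discussion, not the toolkit's existing API: the `CharacterCandidate` record, the `apply_corrections` helper, and the idea that the dropdown produces a mapping from a candidate name to the canonical name the user picked are all hypothetical. Pronoun probabilities of merged candidates are simply summed and renormalized, which is one possible policy among several.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CharacterCandidate:
    """Hypothetical record for one character proposed by the pipeline."""
    name: str
    nicknames: List[str] = field(default_factory=list)
    # Maps a pronoun label (e.g. "she") to its estimated probability.
    pronoun_probs: Dict[str, float] = field(default_factory=dict)

    def top_pronoun(self) -> Optional[str]:
        """Return the most probable pronoun label, or None if there is none."""
        if not self.pronoun_probs:
            return None
        return max(self.pronoun_probs, key=self.pronoun_probs.get)


def apply_corrections(candidates, corrections):
    """Merge candidates that the user marked as the same character.

    `corrections` maps a candidate name to the canonical name chosen in the
    dropdown (assumed shape). Unmentioned candidates keep their own name.
    """
    merged = {}
    for cand in candidates:
        canonical = corrections.get(cand.name, cand.name)
        target = merged.setdefault(canonical, CharacterCandidate(name=canonical))
        # A candidate that was merged away survives as a nickname.
        aliases = cand.nicknames + ([cand.name] if cand.name != canonical else [])
        for alias in aliases:
            if alias != canonical and alias not in target.nicknames:
                target.nicknames.append(alias)
        # Accumulate pronoun evidence; it is renormalized below.
        for pronoun, prob in cand.pronoun_probs.items():
            target.pronoun_probs[pronoun] = target.pronoun_probs.get(pronoun, 0.0) + prob
    for char in merged.values():
        total = sum(char.pronoun_probs.values())
        if total > 0:
            char.pronoun_probs = {p: v / total for p, v in char.pronoun_probs.items()}
    return list(merged.values())
```

With this shape, the frontend only needs to send back the `corrections` mapping, and the backend can return the merged list as JSON for display.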
Nice-to-have:
[ ] Output a resolved text using the results of the character identification-disambiguation pipeline
[ ] Feed the resolved text into further analyses, such as proximity analysis and frequency analysis
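As a rough illustration of the nice-to-have items, the sketch below rewrites a text so that every known alias is replaced by its canonical character name, then counts mentions per character. It is a simplification under stated assumptions: `resolve_text`, `mention_counts`, and the alias-to-canonical-name mapping are hypothetical names, and only nicknames are substituted here, whereas a full resolution would also need the coreference model to replace pronouns.

```python
import re
from collections import Counter


def resolve_text(text, alias_map):
    """Replace each alias in `text` with its canonical character name.

    `alias_map` maps alias -> canonical name (assumed shape).
    """
    if not alias_map:
        return text
    # Canonical names map to themselves so they are matched whole and
    # never partially rewritten by a shorter alias they contain.
    full_map = dict(alias_map)
    for canonical in alias_map.values():
        full_map.setdefault(canonical, canonical)
    # Longest names first so "Miss Bennet" is tried before "Bennet".
    names = sorted(full_map, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda m: full_map[m.group(1)], text)


def mention_counts(resolved_text, canonical_names):
    """Count whole-word mentions of each canonical name in the resolved text."""
    counts = Counter()
    for name in canonical_names:
        counts[name] = len(re.findall(r"\b" + re.escape(name) + r"\b", resolved_text))
    return counts
```

The counts returned by `mention_counts` are the kind of input a frequency analysis could start from; a proximity analysis would instead need the match positions.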