CornellNLP / ConvoKit

ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the use of the toolkit on these datasets.
https://convokit.cornell.edu/documentation/
MIT License
542 stars, 120 forks

Add PosNegIrony transformer and demo #206

Closed: danielbotros closed this 9 months ago

danielbotros commented 9 months ago

Description

This change adds the PosNegIrony transformer, based on the theory described in Linda Hutcheon's paper "The Complex Functions of Irony", and a demo using the r/Ohio and r/Cleveland subreddit corpora. The transformer takes a sentiment-based approach: it scores ironic utterances by how far their sentiment deviates statistically from the mean sentiment of the corpus. It then applies a simple rule to categorize each ironic utterance as positive or negative irony, using a measure of the sentiment of the comment itself and of its replies.
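The two steps described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names, the use of a z-score as the "statistical deviation", the equal weighting of comment and reply sentiment, and the zero cutoff are all assumptions made for the example. Sentiment scores are assumed to be precomputed floats in [-1, 1] from some sentiment analyzer.

```python
from statistics import mean, pstdev


def irony_deviation_scores(sentiments):
    """Return each utterance's z-score relative to the corpus mean sentiment.

    Assumption: "statistical deviation" is modeled here as a z-score.
    """
    mu = mean(sentiments)
    sigma = pstdev(sentiments)
    if sigma == 0:
        # Every utterance has the same sentiment, so nothing deviates.
        return [0.0 for _ in sentiments]
    return [(s - mu) / sigma for s in sentiments]


def classify_irony(comment_sentiment, reply_sentiments):
    """Label an ironic comment as "positive" or "negative" irony.

    Hypothetical rule: average the comment's own sentiment with the mean
    sentiment of its replies, then split at zero.
    """
    if reply_sentiments:
        combined = (comment_sentiment + mean(reply_sentiments)) / 2
    else:
        # No replies available, so fall back to the comment alone.
        combined = comment_sentiment
    return "positive" if combined >= 0 else "negative"
```

Under these assumptions, a mildly positive ironic comment that draws strongly negative replies would be labeled negative irony, since the replies pull the combined score below zero.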

Motivation and Context

This new transformer aims to capture and score positive and negative irony as described in linguistic and literary theory. It requires the corpus to have ironic comments marked with "/s", so it is primarily intended for subreddit / Reddit corpora.
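As an illustration of the "/s" requirement, here is one way such a marker could be detected in raw utterance text. This is a sketch under assumptions, not the check used by the transformer: the helper name `has_sarcasm_marker` is hypothetical, and it assumes the marker appears as a standalone, whitespace-delimited token.

```python
import re

# Assumption: "/s" counts only as a standalone token, so it must be
# surrounded by whitespace or sit at the start or end of the text.
# This avoids matching subreddit names or URL paths such as "r/something".
SARCASM_MARKER = re.compile(r"(?<!\S)/s(?!\S)", re.IGNORECASE)


def has_sarcasm_marker(text):
    """Return True if the text contains a standalone "/s" marker."""
    return SARCASM_MARKER.search(text) is not None
```

A stricter or looser pattern may be appropriate depending on the corpus. For example, this version does not match a marker followed directly by punctuation, such as "/s.".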

How has this been tested?

This has been tested locally in a Jupyter Notebook using the demo code provided.

Other information

This was made in partial fulfillment of the requirements for A8 of INFO 4350.