An R package for the extraction of sentiment and sentiment-based plot arcs from text.
The name "Syuzhet" comes from the Russian Formalists Victor Shklovsky and Vladimir Propp, who divided narrative into two components, the "fabula" and the "syuzhet." Syuzhet refers to the "device" or technique of a narrative, whereas fabula is the chronological order of events. Syuzhet, therefore, is concerned with the manner in which the elements of the story (fabula) are organized (syuzhet).
The Syuzhet package attempts to reveal the latent structure of narrative by means of sentiment analysis. Instead of detecting shifts in the topic or subject matter of the narrative (as Ben Schmidt has done), the Syuzhet package reveals the emotional shifts that serve as proxies for the narrative movement between conflict and conflict resolution. The idea was inspired by the late Kurt Vonnegut's essay "Here's a Lesson in Creative Writing" in his collection A Man Without A Country (Random House, 2007). A lecture Vonnegut gave on this subject is available on YouTube.
Thanks to Lincoln Mullen for early feedback on this package (see http://rpubs.com/lmullen/58030).
This package is now available on CRAN (http://cran.r-project.org/web/packages/syuzhet/).
install.packages("syuzhet")
You can install the most current development version from GitHub using the devtools
package:
# install.packages("devtools")
devtools::install_github("mjockers/syuzhet")
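Once the package is installed, the sentiment-based plot arc described above can be sketched in a few lines. This is a minimal illustration rather than a recommended workflow: the sample text is made up, and the smoothing settings (`low_pass_size`, `x_reverse_len`, `scale_range`) are assumptions chosen only so the example runs on a very short input.

```r
library(syuzhet)

# Hypothetical sample text, used only for illustration.
sample_text <- paste(
  "I was happy at first.",
  "Then the storm came and ruined everything.",
  "We were afraid and lost.",
  "In the end we found our way home and rejoiced."
)

# Split the text into sentences.
sentences <- get_sentences(sample_text)

# Score each sentence with the default "syuzhet" lexicon.
raw_values <- get_sentiment(sentences, method = "syuzhet")

# Smooth the raw scores with a discrete cosine transform to get an arc
# on a normalized 100-point time axis. low_pass_size is capped at the
# number of sentences because the text here is so short.
arc <- get_dct_transform(
  raw_values,
  low_pass_size = min(5, length(raw_values)),
  x_reverse_len = 100,
  scale_range = TRUE
)

# Plot the smoothed emotional trajectory.
plot(arc, type = "l",
     xlab = "Narrative time (normalized)",
     ylab = "Scaled sentiment")
```

For a full-length novel the raw vector would contain thousands of sentence scores, and the smoothing parameters would need to be tuned to the text rather than copied from this sketch.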
Syuzhet incorporates four sentiment lexicons:
The default "Syuzhet" lexicon was developed in the Nebraska Literary Lab under the direction of Matthew L. Jockers.
The "afinn" lexicon was developed by Finn Årup Nielsen as the AFINN WORD DATABASE. See: http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=6010 The AFINN database of words is copyright protected and distributed under "Open Database License (ODbL) v1.0" http://www.opendatacommons.org/licenses/odbl/1.0/ or a similar copyleft license.
The "bing" lexicon was developed by Minqing Hu and Bing Liu as the OPINION LEXICON See: http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html
The "nrc" lexicon was developed by Mohammad, Saif M. and Turney, Peter D. as the NRC EMOTION LEXICON.
See: http://saifmohammad.com/WebPages/lexicons.html
The NRC EMOTION LEXICON is released under the following terms of use:
Terms of use:
-- Crowdsourcing a Word-Emotion Association Lexicon, Saif Mohammad and Peter Turney, To Appear in Computational Intelligence, Wiley Blackwell Publishing Ltd.
-- Tracking Sentiment in Mail: How Genders Differ on Emotional Axes, Saif Mohammad and Tony Yang, In Proceedings of the ACL 2011 Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA), June 2011, Portland, OR.
-- From Once Upon a Time to Happily Ever After: Tracking Emotions in Novels and Fairy Tales, Saif Mohammad, In Proceedings of the ACL 2011 Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH), June 2011, Portland, OR.
-- Emotions Evoked by Common Words and Phrases: Using Mechanical Turk to Create an Emotion Lexicon, Saif Mohammad and Peter Turney, In Proceedings of the NAACL-HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, June 2010, LA, California.
Links to the papers are available here: http://www.purl.org/net/NRCemotionlexicon
CONTACT INFORMATION
Saif Mohammad
Research Officer, National Research Council Canada
email: saif.mohammad@nrc-cnrc.gc.ca
phone: +1-613-993-0620
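To show how the four lexicons listed above are selected in practice, here is a short sketch that scores the same sentences with each one through the `method` argument of `get_sentiment()`. The two sentences are hypothetical, and the side-by-side comparison is only an illustration; the lexicons use different scales, so their raw scores are not directly comparable.

```r
library(syuzhet)

# Hypothetical example sentences, used only for illustration.
sentences <- c(
  "I love this wonderful day.",
  "This is a terrible, awful mess."
)

# Score the same sentences with each of the four bundled lexicons.
lexicons <- c("syuzhet", "afinn", "bing", "nrc")
scores <- sapply(lexicons, function(m) get_sentiment(sentences, method = m))

# One row per sentence, one column per lexicon.
print(scores)

# The NRC lexicon also tags words with emotion categories;
# get_nrc_sentiment() returns per-sentence counts for each
# category along with negative and positive valence.
nrc_emotions <- get_nrc_sentiment(sentences)
print(nrc_emotions)
```

If results from the "afinn", "bing", or "nrc" lexicons are published, the license and citation terms given above for each lexicon apply.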