quanteda / quanteda.textmodels

Text scaling and classification models for quanteda

Implement the "fightin' words" method from Monroe, Colaresi, and Quinn (2008) #38

Open kbenoit opened 4 years ago

kbenoit commented 4 years ago

From @matthewjdenny (Thanks Matt!)

I do have functions for fightin' words term ranking and for the funnel plots, although the code is a bit messy. I am attaching a tech report I wrote on this, which includes some example funnel plots. Happy to collaborate on adding this to quanteda.

The term ranking operates on a contingency table (term counts aggregated to the category level), which I represent as a slam::simple_triplet_matrix object (I have always found these more intuitive to program with).

Code that does term ranking based on the informed Dirichlet model from Monroe et al.: https://github.com/matthewjdenny/SpeedReader/blob/master/R/feature_selection.R

Function to make nice-looking funnel plots: https://github.com/matthewjdenny/SpeedReader/blob/master/R/fightin_words_plot.R

Example usage: https://github.com/matthewjdenny/PPOL_628_Text_As_Data/blob/master/Scripts/term_category_associations.R
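
For readers who want the gist of the term ranking without opening the linked code, here is a minimal, language-neutral sketch in Python of the z-scored log-odds ratio with a Dirichlet prior from Monroe et al. It is not the SpeedReader implementation and not a proposed quanteda API. The function name `fightin_words_zscores`, the `prior_strength` parameter, and the choice to set each term's prior proportional to its pooled frequency are all assumptions made for illustration. The input is the two-category contingency table described above, given as plain term-count dictionaries.

```python
import math


def fightin_words_zscores(counts_a, counts_b, prior_strength=10.0):
    """Z-scored log-odds ratio of each term between two categories.

    counts_a, counts_b: dicts mapping term -> count within each category.
    prior_strength: hypothetical knob; total pseudo-count (alpha_0) spread
    over terms in proportion to their pooled frequency (an assumption).
    Returns a dict term -> z-score; positive means associated with A.
    """
    vocab = sorted(set(counts_a) | set(counts_b))
    if len(vocab) < 2:
        raise ValueError("need at least two distinct terms")

    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    pooled_total = n_a + n_b
    alpha_0 = prior_strength

    zscores = {}
    for term in vocab:
        y_a = counts_a.get(term, 0)
        y_b = counts_b.get(term, 0)
        # Prior pseudo-count for this term (assumed: pooled relative frequency).
        alpha_w = prior_strength * (y_a + y_b) / pooled_total

        # Log-odds of the term within each category, smoothed by the prior.
        log_odds_a = math.log((y_a + alpha_w) / (n_a + alpha_0 - y_a - alpha_w))
        log_odds_b = math.log((y_b + alpha_w) / (n_b + alpha_0 - y_b - alpha_w))
        delta = log_odds_a - log_odds_b

        # Approximate variance of the log-odds difference.
        variance = 1.0 / (y_a + alpha_w) + 1.0 / (y_b + alpha_w)
        zscores[term] = delta / math.sqrt(variance)
    return zscores


# Toy counts, made up purely for illustration.
category_a = {"tax": 10, "health": 2, "the": 50}
category_b = {"tax": 2, "health": 10, "the": 50}
scores = fightin_words_zscores(category_a, category_b)
for term, z in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{term}\t{z:.3f}")
```

A funnel plot in the sense used above would then plot each term's z-score against its overall frequency, which is why a sparse term-by-category matrix is a natural input.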