Papers in Zotero, specifically in the "Alexis's Findings" folder:
8 from Morini - Kowsari et al. - Text Classification Survey - incredibly in depth
17 from Morini - Zhang et al. - Hate Speech on Twitter - Section 4 discusses their data creation, which seems similar to ours, except that it sounds like they seek out certain words beforehand. A somewhat alarming part of their Conclusion: "Finally, our analysis shows that the presence of abstract concepts such as ‘sexism’, ‘racism’ or ‘hate’ is very difficult to detect if solely based on textual content."
from Zhang - Schmidt and Wiegand - Survey on Hate Speech Detection - A good, concise discussion of a process that I feel is similar to what we may want to follow, with polarization instead of hate speech. Especially Section 3 "Features" and Section 7 "Data and Annotation." Interesting observation that character n-grams are more effective than token n-grams.
from Zhang - Waseem and Hovy - Predictive Features for Hate Speech on Twitter - I'm getting good vibes, as they seem to be on our path but with hate speech. In a lot of places, if you switch out "offensive" for "combative" (and of course the content thereafter), it reminds me of what we want to do. As mentioned in the survey above, they use character n-grams.
Add to Zotero whatever looks worth adding
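
Quick sketch to remind myself why character n-grams might beat token n-grams, as noted for Schmidt and Wiegand and for Waseem and Hovy above. This is not code from either paper; the function names and the example strings are my own, and Jaccard overlap is just a convenient way to compare the two feature sets. The idea is that a deliberately misspelled word breaks a token match entirely, while most of its character n-grams survive.

```python
def token_ngrams(text, n):
    """Return word-level n-grams of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return character-level n-grams (spaces included) of a lowercased text."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def jaccard(a, b):
    """Jaccard overlap between two feature lists, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical example: the second string obfuscates one word ("o" -> "0").
plain = "total nonsense"
obfuscated = "total n0nsense"

token_overlap = jaccard(token_ngrams(plain, 1), token_ngrams(obfuscated, 1))
char_overlap = jaccard(char_ngrams(plain, 3), char_ngrams(obfuscated, 3))

print("token unigram overlap:", round(token_overlap, 3))
print("char trigram overlap: ", round(char_overlap, 3))
```

The character trigram overlap comes out higher than the token unigram overlap here, because only the trigrams touching the swapped character change. For real features we would probably lean on an existing vectorizer rather than hand-rolled functions, but this is enough to see the effect.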